Final Model Evaluation
Avocado dataset
predictions = pd.concat([rfe_score, XGBR_score, rr_score, rf_score, lm_score], ignore_index=True, sort=False)
predictions
| | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|---|---|---|---|---|
| 0 | Random Forest with RFE | 0.800169 | 0.797581 | 0.889159 | 0.180622 |
| 1 | XGBoost | 0.798641 | 0.796034 | 0.911125 | 0.181311 |
| 2 | Ridge Regression | 0.598733 | 0.593537 | 0.604317 | 0.255950 |
| 3 | Random Forest | 0.787120 | 0.784363 | 0.876525 | 0.186426 |
| 4 | Linear Regression | 0.598793 | 0.593598 | 0.604281 | 0.255931 |
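The individual `*_score` objects are not defined on this page. As a minimal sketch, assuming each one is a one-row DataFrame with the columns shown in the table, the concatenation could look like the following. The helper `score_row` is hypothetical and only illustrates the assumed shape; the numbers are copied from the first two rows of the table above.

```python
import pandas as pd

COLUMNS = ["Model", "R2 Score", "Adjusted R2 Score",
           "Cross Validated R2 Score", "RMSE"]


def score_row(model, r2, adj_r2, cv_r2, rmse):
    """Hypothetical helper: build a one-row DataFrame for a single model."""
    return pd.DataFrame([[model, r2, adj_r2, cv_r2, rmse]], columns=COLUMNS)


# Values taken from the Avocado table (rows 0 and 1).
rfe_score = score_row("Random Forest with RFE", 0.800169, 0.797581, 0.889159, 0.180622)
XGBR_score = score_row("XGBoost", 0.798641, 0.796034, 0.911125, 0.181311)

# ignore_index=True renumbers the rows 0..n-1 instead of repeating index 0.
predictions = pd.concat([rfe_score, XGBR_score], ignore_index=True, sort=False)

# Model with the highest cross-validated R2 among the rows included here.
best = predictions.sort_values("Cross Validated R2 Score", ascending=False).iloc[0]["Model"]
print(predictions)
print(best)
```

With `ignore_index=True` the combined frame gets a fresh 0-based index, which is the unnamed first column in the table.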
Boston dataset
predictions2 = pd.concat([rfe_score2, XGBR_score2, rr_score2, rf_score2, lm_score2], ignore_index=True, sort=False)
predictions2
| | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|---|---|---|---|---|
| 0 | Random Forest with RFE | 0.839377 | 0.824246 | 0.821140 | 3.459550 |
| 1 | XGBoost | 0.901889 | 0.892646 | 0.845593 | 2.703810 |
| 2 | Ridge Regression | 0.678696 | 0.648428 | 0.689293 | 4.892991 |
| 3 | Random Forest | 0.838576 | 0.823369 | 0.817514 | 3.468169 |
| 4 | Linear Regression | 0.679168 | 0.648945 | 0.687535 | 4.889394 |
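The page does not show how the Adjusted R2 Score column is computed. A small sketch using the standard definition, 1 - (1 - R2)(n - 1)/(n - p - 1), is given below. The sample count `N_SAMPLES = 152` and feature count `N_FEATURES = 13` are assumptions, not values stated in the source. They were chosen because they would correspond to a 30% hold-out of the 506-row, 13-feature Boston data.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Standard adjusted R2: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Assumed, not stated in the source: 152 evaluation samples, 13 features.
N_SAMPLES, N_FEATURES = 152, 13

# R2 values copied from the Boston table (rows 0, 1 and 2).
rf_rfe_adj = adjusted_r2(0.839377, N_SAMPLES, N_FEATURES)
xgb_adj = adjusted_r2(0.901889, N_SAMPLES, N_FEATURES)
ridge_adj = adjusted_r2(0.678696, N_SAMPLES, N_FEATURES)

print(rf_rfe_adj, xgb_adj, ridge_adj)
```

Under these assumptions the results agree with the tabulated adjusted scores to roughly five decimal places. That suggests, but does not confirm, that the notebook used this formula with those values.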
Visualizing Model Performance
f, axe = plt.subplots(1, 1, figsize=(18, 6))
predictions.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=predictions, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)
axe.set(title='Model Performance for Avocado dataset')
plt.show()

f, axe = plt.subplots(1, 1, figsize=(18, 6))
predictions2.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=predictions2, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)
axe.set(title='Model Performance for Boston dataset')
plt.show()
