Final Model Evaluation
Avocado dataset
predictions = pd.concat([rfe_score, XGBR_score, rr_score, rf_score, lm_score], ignore_index=True, sort=False)
predictions
Comparative Performance of Regression Models
| | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|---|---|---|---|---|
| 0 | Random Forest with RFE | 0.800169 | 0.797581 | 0.889159 | 0.180622 |
| 1 | XGBoost | 0.798641 | 0.796034 | 0.911125 | 0.181311 |
| 2 | Ridge Regression | 0.598733 | 0.593537 | 0.604317 | 0.255950 |
| 3 | Random Forest | 0.787120 | 0.784363 | 0.876525 | 0.186426 |
| 4 | Linear Regression | 0.598793 | 0.593598 | 0.604281 | 0.255931 |
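For reference, the R2 Score and RMSE columns follow the standard definitions, which can be sketched in plain Python as below. This is an illustration only: the notebook most likely relied on library functions, and the function names and toy values here are hypothetical and not taken from either dataset.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the units of the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical toy values, used only to exercise the two functions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(r2_score(y_true, y_pred), 4), round(rmse(y_true, y_pred), 4))
```

Because RMSE is in the units of the target, the RMSE values of the two datasets are not directly comparable, whereas the R2 columns are.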
Boston dataset
predictions2 = pd.concat([rfe_score2, XGBR_score2, rr_score2, rf_score2, lm_score2], ignore_index=True, sort=False)
predictions2
Regression Model Performance Metrics
| | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|---|---|---|---|---|
| 0 | Random Forest with RFE | 0.839377 | 0.824246 | 0.821140 | 3.459550 |
| 1 | XGBoost | 0.901889 | 0.892646 | 0.845593 | 2.703810 |
| 2 | Ridge Regression | 0.678696 | 0.648428 | 0.689293 | 4.892991 |
| 3 | Random Forest | 0.838576 | 0.823369 | 0.817514 | 3.468169 |
| 4 | Linear Regression | 0.679168 | 0.648945 | 0.687535 | 4.889394 |
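The Adjusted R2 Score column can be related to the R2 Score column through the usual correction for the number of predictors. The sketch below assumes 152 evaluation rows and 13 features; neither number is stated in this section. With those assumed values the formula appears to reproduce the Boston figures to within rounding.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_features - 1 <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Assumed, not stated in the source: 152 evaluation rows and 13 features.
N_TEST, N_FEATURES = 152, 13

# R2 Score values copied from the Boston table.
boston_r2 = {
    "Random Forest with RFE": 0.839377,
    "XGBoost": 0.901889,
    "Ridge Regression": 0.678696,
}

for model, r2 in boston_r2.items():
    print(f"{model}: {adjusted_r2(r2, N_TEST, N_FEATURES):.6f}")
```

The penalty grows as the number of features approaches the number of rows, which is why the gap between R2 and adjusted R2 is wider for Boston than for Avocado.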
Visualizing Model Performance
# Horizontal bar chart of cross-validated R2 for the Avocado models
f, axe = plt.subplots(1, 1, figsize=(18, 6))
# Sort in place so the bars are drawn in descending score order
predictions.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=predictions, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)  # fixed 0-1 scale keeps the two charts comparable
axe.set(title='Model Performance for Avocado dataset')
plt.show()
# Same chart for the Boston models
f, axe = plt.subplots(1, 1, figsize=(18, 6))
# Sort in place so the bars are drawn in descending score order
predictions2.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=predictions2, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)  # fixed 0-1 scale keeps the two charts comparable
axe.set(title='Model Performance for Boston dataset')
plt.show()
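Since the rendered charts are not reproduced here, the ordering they display can be recovered directly from the Cross Validated R2 Score columns of the two tables. The snippet below is a small sketch using only those tabulated values; the helper name `rank_models` is hypothetical.

```python
# Cross Validated R2 Score values copied from the two tables.
cv_r2 = {
    "Avocado": {
        "Random Forest with RFE": 0.889159,
        "XGBoost": 0.911125,
        "Ridge Regression": 0.604317,
        "Random Forest": 0.876525,
        "Linear Regression": 0.604281,
    },
    "Boston": {
        "Random Forest with RFE": 0.821140,
        "XGBoost": 0.845593,
        "Ridge Regression": 0.689293,
        "Random Forest": 0.817514,
        "Linear Regression": 0.687535,
    },
}


def rank_models(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for dataset, scores in cv_r2.items():
    print(dataset)
    for model, score in rank_models(scores):
        print(f"  {model}: {score:.6f}")
```

On both datasets the order is the same: XGBoost, Random Forest with RFE, Random Forest, Ridge Regression, Linear Regression.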