Final performance comparison
Avocado data set
result = pd.concat([clf_score, predictions], ignore_index=True, sort=False)
result
Regression Model Performance Comparison

|   | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|-------|----------|-------------------|--------------------------|------|
| 0 | Tuned Ridge Regression | 0.736622 | 0.733212 | 0.739008 | 0.210438 |
| 1 | XGBoost | 0.798641 | 0.796034 | 0.911125 | 0.181311 |
| 2 | Random Forest with RFE | 0.800169 | 0.797581 | 0.889159 | 0.180622 |
| 3 | Random Forest | 0.787120 | 0.784363 | 0.876525 | 0.186426 |
| 4 | Ridge Regression | 0.598733 | 0.593537 | 0.604317 | 0.255950 |
| 5 | Linear Regression | 0.598793 | 0.593598 | 0.604281 | 0.255931 |
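For readers unfamiliar with the Adjusted R2 Score column, the sketch below shows the standard adjustment, which penalises R2 for the number of predictors. The function name `adjusted_r2` and the sample and feature counts are hypothetical and chosen only for illustration; the notebook's actual test-set sizes are not shown in this section.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Standard adjusted R2: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("n_samples must exceed n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical sizes, not taken from the notebook.
example = adjusted_r2(0.80, n_samples=101, n_features=10)
print(round(example, 4))  # 0.7778
```

Because the penalty factor `(n - 1) / (n - p - 1)` is always greater than 1 when there is at least one feature, the adjusted score sits below the plain R2 whenever R2 is less than 1, which matches the pattern in every row of the table.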
# Rank the models by cross-validated R2 and draw them as horizontal bars
f, axe = plt.subplots(1, 1, figsize=(18, 6))
result.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=result, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)  # R2 axis fixed to [0, 1] so both dataset plots share a scale
axe.set(title='Model Performance for Avocado dataset')
plt.show()
Boston data set
result = pd.concat([clf_score2, predictions2], ignore_index=True, sort=False)
result
Regression Model Performance Metrics (Final Comparison)

|   | Model | R2 Score | Adjusted R2 Score | Cross Validated R2 Score | RMSE |
|---|-------|----------|-------------------|--------------------------|------|
| 0 | Tuned Ridge Regression | 0.793267 | 0.773792 | 0.844628 | 3.965999 |
| 1 | XGBoost | 0.901889 | 0.892646 | 0.845593 | 2.703810 |
| 2 | Random Forest with RFE | 0.839377 | 0.824246 | 0.821140 | 3.459550 |
| 3 | Random Forest | 0.838576 | 0.823369 | 0.817514 | 3.468169 |
| 4 | Ridge Regression | 0.678696 | 0.648428 | 0.689293 | 4.892991 |
| 5 | Linear Regression | 0.679168 | 0.648945 | 0.687535 | 4.889394 |
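The RMSE values for the Boston data are an order of magnitude larger than those for the Avocado data, while the R2 scores are in a similar range. That is expected from the definitions alone: RMSE is expressed in the units of the target, whereas R2 is scale-free. The sketch below illustrates this with made-up numbers; the helper names `rmse` and `r2` and the toy data are assumptions for illustration, not the notebook's code.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    sq_err = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq_err / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical), then the same data with the target scaled by 10.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
y_true_x10 = [10 * v for v in y_true]
y_pred_x10 = [10 * v for v in y_pred]

print(rmse(y_true, y_pred), r2(y_true, y_pred))
print(rmse(y_true_x10, y_pred_x10), r2(y_true_x10, y_pred_x10))
```

Scaling the target multiplies RMSE by the same factor and leaves R2 unchanged, so RMSE is best compared between models within one table rather than across the two datasets.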
# Rank the models by cross-validated R2 and draw them as horizontal bars
f, axe = plt.subplots(1, 1, figsize=(18, 6))
result.sort_values(by=['Cross Validated R2 Score'], ascending=False, inplace=True)
sns.barplot(x='Cross Validated R2 Score', y='Model', data=result, ax=axe)
axe.set_xlabel('Cross Validated R2 Score', size=16)
axe.set_ylabel('Model')
axe.set_xlim(0, 1.0)  # R2 axis fixed to [0, 1] so both dataset plots share a scale
axe.set(title='Model Performance for Boston dataset')
plt.show()