Models
<aside>
💡 “All models are wrong, but some are useful.” - George E.P. Box
</aside>
Start with a very basic model that serves as the baseline against which all the more complex machine learning models are judged. Checklist of steps:
- Train a few commonly used ML models like Naïve Bayes, linear regression, SVM, etc. using default parameters
- Measure the performance of each model and compare it against the baseline and against the other models
- Employ N-fold cross-validation for each model and compute the mean and standard deviation of the performance metrics on the N folds
- Study the features that have the most impact on the target
- Analyze the types of errors the models make while predicting
- Engineer the features in a different manner
- Repeat the above steps a few times (trial and error) to be sure that we have used the right features in the right format
- Shortlist the top models based on their performance measures
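
The baseline, cross-validation, and shortlisting steps in the checklist above can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended implementation: the data is synthetic, and the names `make_data`, `MajorityBaseline`, `NearestCentroid`, and `cross_val_accuracy` are hypothetical helpers invented for this example. A nearest-centroid classifier stands in for the models named in the checklist (Naïve Bayes, SVM, etc.) only to keep the sketch library-free; in practice a library such as scikit-learn would typically supply both the models and the cross-validation.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic two-class data (an assumption, standing in for a real dataset)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # balanced classes
        # Class 0 is centred at (0, 0), class 1 at (2, 2).
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0)])
        y.append(label)
    return X, y


class MajorityBaseline:
    """Baseline: always predict the most frequent training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Simple candidate model: predict the class whose mean point is closest."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids_[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist2(x, self.centroids_[c]))
            for x in X
        ]


def cross_val_accuracy(make_model, X, y, n_folds=5, seed=0):
    """Return (mean, standard deviation) of accuracy over n_folds folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = make_model().fit(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        preds = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return statistics.mean(scores), statistics.stdev(scores)


X, y = make_data()
candidates = [("baseline", MajorityBaseline), ("nearest_centroid", NearestCentroid)]
results = {name: cross_val_accuracy(cls, X, y) for name, cls in candidates}

# Rank models by mean accuracy, best first, to support shortlisting.
for name, (mean, std) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Reporting the standard deviation next to the mean matters when shortlisting: a model whose mean score beats the baseline by less than the fold-to-fold spread has not clearly earned its extra complexity.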