<aside> 💡 “All models are wrong, but some are useful.” - George E.P. Box

</aside>

Create a very basic model to serve as the baseline against which all the more complex machine learning models are compared. Checklist of steps:

Train a few commonly used ML models (Naïve Bayes, linear or logistic regression, SVM, etc.) using their default parameters
Measure and compare the performance of each model with the baseline and with all the others
Employ N-fold cross-validation for each model and compute the mean and standard deviation of the performance metrics on the N folds
Study the features that have the most impact on the target
Analyze the types of errors the models make while predicting
Engineer the features in a different manner
Repeat the above steps a few times (trial and error) to confirm that the right features are used in the right format
Shortlist the top models based on their performance measures
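
The checklist above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: it assumes scikit-learn is available, uses synthetic data from `make_classification` as a stand-in for the real dataset, picks a majority-class `DummyClassifier` as the baseline, and substitutes logistic regression for linear regression because the stand-in task is classification. The names `candidates`, `results`, `shortlist`, and `N_FOLDS` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption); replace with the real features and target.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Baseline plus a few common models, all with default parameters.
candidates = {
    "baseline": DummyClassifier(strategy="most_frequent"),
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(),
    "svm": SVC(),
}

N_FOLDS = 5  # assumed number of folds

# N-fold cross-validation: mean and standard deviation of accuracy per model.
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=N_FOLDS, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())

# Rank models by mean accuracy, best first, to build the shortlist.
shortlist = sorted(results, key=lambda n: results[n][0], reverse=True)
for name in shortlist:
    mean, std = results[name]
    print(f"{name:20s} accuracy = {mean:.3f} +/- {std:.3f}")

# Error analysis for the top model: confusion matrix from out-of-fold predictions.
best_model = candidates[shortlist[0]]
predictions = cross_val_predict(best_model, X, y, cv=N_FOLDS)
cm = confusion_matrix(y, predictions)
print(cm)
```

Accuracy is used here only for simplicity; for imbalanced targets or regression problems a different `scoring` value and a different baseline (for example `DummyRegressor`) would be more appropriate. After re-engineering the features, the same loop can be rerun to check whether the ranking changes.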

