How To Choose Machine Learning Algorithms

In this article, you will learn how to choose a machine learning algorithm.

Which machine learning algorithm you should use entirely depends upon the nature of data, its size, and quality. It depends on what we want as an output from data. Also, it depends on how much time we have. Even experienced data scientists cannot tell which algorithm will give the most accurate answers, before applying them.
 

Factors for choosing algorithms

 
Accuracy
 
At times, it is not necessary to get the most accurate answer, and getting an approximation is adequate, depending on what we want to achieve. This also helps in reducing the processing time and helps us to find the most efficient method. At the same time, the approximate method tends to avoid overfitting.
 
Training Time
 
Time spent to train a model depends a lot on the algorithm. Often training time highly impacts accuracy. Some algorithms are more sensitive to numbers of data points, which can affect the accuracy, especially when the dataset is huge.
 
Linearity
 
Many machine learning algorithms rely on a linearity property, which assumes that data points on a plotted graph can be separated by a straight line (or its higher dimensional analog). An example of this approach would be logistic regression, supporting vector machine.
 
According to the linear regression algorithm, it assumes that the data trend follows a straight line. This approach may work for some problems and may not work for others, eventually bringing down the accuracy.
 
How To Choose Machine Learning Algorithm
Figure Salary and Age have a linear relation
 
However, data with a non-linear trend would generate a larger error, but despite their dangers, a linear algorithm is very popular to get started as they are simple and fast to train.
 
Number of Parameters
 
A lot of algorithm’s behavior is affected by the number of parameters involved during training time, also training time and accuracy are also quite sensitive with respect to the parameters involved. In most of the cases, a dataset with a large number of parameters relies on the trial and error approach to find a good combination. The benefit of having many parameters indicates that an algorithm has greater flexibility, which can help to achieve good accuracy, provided we can figure out the best combination of parameters.
 

Azure Machine Learning Algorithm Cheat Sheet

 
You may use this sheet which may come very handy to decide which algorithm suits best your need.
 
How To Choose Machine Learning Algorithm