Work@Microsoft    Live@Seattle

ML101: How to Choose a Machine Learning Algorithm for Multi-class Classification Problems


In classification, the target variable is categorical and unordered. In a multi-class problem it can take more than two values. To solve a multi-class classification problem, we typically choose one of the following supervised learning algorithms.


Algorithm    Accuracy    Training Time    Linearity    Parameters    Additional Comments
Multiclass Logistic Regression    ★★★★    5
Multiclass Decision Forest    ★★★★☆    6
Multiclass Decision Jungle    ★★★★☆    6
Multiclass Neural Network    ★★    9
One-vs-All Multiclass    See properties of the selected two-class classification algorithm
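
The One-vs-All row is easier to picture with a small example. The sketch below is plain Python and is not the One-vs-All Multiclass module itself: it assumes a simple two-class logistic regression as the underlying learner, trains one copy per class (that class against all the others), and predicts the class whose model gives the highest score. The function names, the synthetic three-cluster data, and the default settings are all made up for illustration.

```python
import math
import random

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train_binary(X, y, learning_rate=0.1, iterations=1000):
    """Two-class logistic regression fitted with batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(iterations):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - learning_rate * gj / n for wj, gj in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b

def train_one_vs_all(X, labels, **params):
    """Train one binary model per class: that class (1) against all others (0)."""
    return {
        cls: train_binary(X, [1 if lab == cls else 0 for lab in labels], **params)
        for cls in sorted(set(labels))
    }

def predict(models, x):
    """Pick the class whose binary model gives the highest score."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)

# Hypothetical data: three well-separated clusters labelled "a", "b", "c".
rng = random.Random(0)
centers = {"a": (0.0, 0.0), "b": (6.0, 0.0), "c": (0.0, 6.0)}
X, labels = [], []
for cls, (cx, cy) in centers.items():
    for _ in range(20):
        X.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        labels.append(cls)

models = train_one_vs_all(X, labels)
train_accuracy = sum(predict(models, x) == lab for x, lab in zip(X, labels)) / len(X)
print(f"training accuracy: {train_accuracy:.2f}")
```

Because the wrapper only calls `train_binary`, its accuracy, training time, and parameters are inherited from whichever two-class learner is plugged in, which is why the table points back to that learner's properties.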


  • Accuracy: Getting the most accurate answer possible isn’t always necessary. Sometimes an approximation is adequate, depending on what you want to use it for. If that’s the case, you may be able to cut your processing time dramatically by sticking with more approximate methods.  Another advantage of more approximate methods is that they naturally tend to avoid overfitting.
  • Training Time: The number of minutes or hours necessary to train a model varies a great deal between algorithms. Training time is often closely tied to accuracy—one typically accompanies the other. In addition, some algorithms are more sensitive to the number of data points than others. When time is limited it can drive the choice of algorithm, especially when the data set is large.
  • Linearity: Lots of machine learning algorithms make use of linearity. Linear classification algorithms assume that classes can be separated by a straight line (or its higher-dimensional analog), just as linear regression algorithms assume that data trends follow a straight line. These assumptions aren’t bad for some problems, but on others they bring accuracy down. Despite their dangers, linear algorithms are very popular as a first line of attack. They tend to be algorithmically simple and fast to train.
  • Parameters: Parameters are the knobs a data scientist gets to turn when setting up an algorithm. They are numbers that affect the algorithm’s behavior, such as error tolerance or number of iterations, or options between variants of how the algorithm behaves. The training time and accuracy of the algorithm can sometimes be quite sensitive to getting just the right settings. Typically, algorithms with large numbers of parameters require the most trial and error to find a good combination. The upside is that having many parameters typically indicates that an algorithm has greater flexibility. It can often achieve very good accuracy, provided you can find the right combination of parameter settings.
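
To make the link between parameters, training time, and accuracy concrete, here is a minimal sketch in plain Python. It assumes a small multiclass logistic (softmax) regression trained by batch gradient descent as a stand-in for any of the algorithms above, and it tries every combination of two knobs, the learning rate and the number of iterations, recording how long each run takes and how accurate it is on held-out data. The data set, the function names, and the grid values are hypothetical and chosen only for illustration.

```python
import math
import random
import time

def make_blobs(rng, centers, n_per_class, spread):
    """Generate 2-D points scattered around one center per class."""
    X, y = [], []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            X.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            y.append(label)
    return X, y

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train_softmax(X, y, n_classes, learning_rate, iterations):
    """Multiclass logistic regression via batch gradient descent."""
    W = [[0.0, 0.0, 0.0] for _ in range(n_classes)]  # per class: w1, w2, bias
    n = len(X)
    for _ in range(iterations):
        grad = [[0.0, 0.0, 0.0] for _ in range(n_classes)]
        for (x1, x2), label in zip(X, y):
            probs = softmax([w[0] * x1 + w[1] * x2 + w[2] for w in W])
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grad[k][0] += err * x1
                grad[k][1] += err * x2
                grad[k][2] += err
        for k in range(n_classes):
            for j in range(3):
                W[k][j] -= learning_rate * grad[k][j] / n
    return W

def accuracy(W, X, y):
    hits = 0
    for (x1, x2), label in zip(X, y):
        scores = [w[0] * x1 + w[1] * x2 + w[2] for w in W]
        hits += scores.index(max(scores)) == label
    return hits / len(X)

rng = random.Random(0)
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
X_train, y_train = make_blobs(rng, centers, 30, 0.6)
X_test, y_test = make_blobs(rng, centers, 30, 0.6)

# Try every combination of the two knobs and record time and accuracy.
results = []
for learning_rate in (0.001, 0.01, 0.1):
    for iterations in (5, 50, 500):
        start = time.perf_counter()
        W = train_softmax(X_train, y_train, len(centers), learning_rate, iterations)
        seconds = time.perf_counter() - start
        results.append({
            "learning_rate": learning_rate,
            "iterations": iterations,
            "seconds": seconds,
            "accuracy": accuracy(W, X_test, y_test),
        })

best = max(results, key=lambda r: r["accuracy"])
for r in results:
    print(r)
print("best setting:", best["learning_rate"], best["iterations"])
```

Even with only two knobs the grid has nine runs, and the slowest ones are the runs with the most iterations. An algorithm with six or nine parameters multiplies that search quickly, which is the trial-and-error cost described above.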

Comments to ML101: How to Choose a Machine Learning Algorithm for Multi-class Classification Problems

  • Very useful, Thanks. How did you come to conclusion, is there a dataset on which these algorithms tested ?

    AzmathMohamad September 25, 2016 9:14 am
