Machine Learning

Arthur Samuel, a pioneer in artificial intelligence, defined Machine Learning in 1959 as "the field of study that gives computers the ability to learn without being explicitly programmed."

A more formal definition of Machine Learning is provided by Prof. Tom Mitchell of Carnegie Mellon University (CMU):

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."

Consider the example of a Machine Learning algorithm that plays chess. In this example, E refers to the experience of playing chess, T is the task of playing chess, and P denotes the probability that the program will win the next game of chess.
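
As a rough illustration of Mitchell's definition, the sketch below swaps chess for a much smaller toy task. Here the task T is predicting x squared for a given x, the experience E is a set of example (x, x squared) pairs, and the performance measure P is the mean absolute error on a few fixed test inputs. The function names (`make_experience`, `learn`, `performance`) and the nearest-example strategy are assumptions made for this illustration only.

```python
def make_experience(n):
    """Experience E: n + 1 evenly spaced examples (x, x**2) on [0, 1]."""
    return [(i / n, (i / n) ** 2) for i in range(n + 1)]


def learn(experience):
    """Return a predictor that answers with the output of the nearest known example."""
    def predict(x):
        nearest_x, nearest_y = min(experience, key=lambda pair: abs(pair[0] - x))
        return nearest_y
    return predict


def performance(predict, test_inputs):
    """Performance measure P: mean absolute error on task T (lower is better)."""
    errors = [abs(predict(x) - x ** 2) for x in test_inputs]
    return sum(errors) / len(errors)


test_inputs = [0.1, 0.3, 0.6, 0.9]
errors_by_experience = {}
for n in (2, 4, 8, 16):
    predictor = learn(make_experience(n))
    errors_by_experience[n] = performance(predictor, test_inputs)
    print(f"{n + 1:2d} examples -> mean absolute error {errors_by_experience[n]:.4f}")
```

In this run the error falls as the number of examples grows, which is the sense in which the program "learns": its performance P at task T improves with experience E.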

Machine learning resembles the way a human being learns. For example, a person who wants to learn poker first learns the rules and then gains experience by playing the game. For a machine, that experience takes the form of a large data set, which it uses to make decisions about the problem at hand.

In general, machine learning problems can be classified into supervised learning and unsupervised learning. In supervised learning, you have the input and the labeled output, and you suspect that a relationship exists between the input and the labeled output. When you know neither what the labeled output is nor if a relationship exists, unsupervised learning will help you find structure in your data if there is one.
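
The contrast can be sketched on a tiny made-up data set. In the supervised half, the labels are used to learn a decision threshold; in the unsupervised half, the labels are ignored and a simple two-cluster k-means looks for structure on its own. The data values, the label names, and the helper names (`classify`, `two_means`) are all assumptions chosen for illustration.

```python
# Six 1-D measurements; the labels exist only in the supervised setting.
inputs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["small", "small", "small", "large", "large", "large"]


def mean(values):
    return sum(values) / len(values)


# Supervised: use the labeled outputs to learn a decision threshold.
small_mean = mean([x for x, y in zip(inputs, labels) if y == "small"])
large_mean = mean([x for x, y in zip(inputs, labels) if y == "large"])
threshold = (small_mean + large_mean) / 2


def classify(x):
    return "small" if x < threshold else "large"


# Unsupervised: no labels, so look for structure with 2-means clustering.
def two_means(points, iterations=10):
    centers = [min(points), max(points)]  # simple deterministic starting guess
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the average of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


centers, groups = two_means(inputs)
print("supervised prediction for 4.5:", classify(4.5))
print("cluster centers found without labels:", centers)
```

The clustering recovers the same two groups that the labels describe, but it cannot name them; attaching meaning to each cluster is left to whoever inspects the result.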

Those are the two main categories, but machine learning is often divided into four broad categories:

  1. Supervised learning
  2. Unsupervised learning
  3. Semi-supervised learning
  4. Reinforcement learning

Supervised learning

Supervised learning is the machine learning task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). Supervised learning problems fall into two types: classification, where the output is a category, and regression, where the output is a number.
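
A minimal sketch of both types follows, using made-up training pairs. The classifier is a one-nearest-neighbor rule and the regressor is a least-squares straight line; both are picked here only because they are short to write, and the data, labels, and function names are assumptions for illustration.

```python
import math

# Each training example is a pair: (input vector, desired output).
classification_data = [
    ((1.0, 1.0), "cat"),
    ((1.5, 2.0), "cat"),
    ((5.0, 5.5), "dog"),
    ((6.0, 5.0), "dog"),
]
regression_data = [((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0), ((4.0,), 9.0)]


def nearest_neighbor_classifier(examples):
    """Classification: the inferred function returns a discrete label."""
    def predict(vector):
        _, label = min(examples, key=lambda ex: math.dist(ex[0], vector))
        return label
    return predict


def least_squares_line(examples):
    """Regression: the inferred function returns a number (slope * x + intercept)."""
    xs = [vector[0] for vector, _ in examples]
    ys = [target for _, target in examples]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return lambda vector: slope * vector[0] + intercept


classify = nearest_neighbor_classifier(classification_data)
regress = least_squares_line(regression_data)
print(classify((5.5, 5.0)))  # a label
print(regress((5.0,)))       # a number
```

In both cases the training step returns a function, which matches the description of supervised learning as inferring a function from examples.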

Overfitting and Underfitting

Overfitting means the model fits the training data too closely. The model ends up recognizing only the specific characteristics of the data it was trained on, or in other words it picks up too much of the noise in that data. This is a problem because the model then adapts poorly to new situations. Underfitting, on the other hand, generalizes too much: the model fails to pick up the patterns in the data and cannot distinguish well between different situations.
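
The difference can be seen by comparing training error with error on data the model has not seen. The sketch below uses hand-made toy data (a trend of y = 2x with fixed offsets of +1 or -1 standing in for noise) and three deliberately simple models: one that always predicts the mean, a least-squares line, and one that memorizes the training points. The data and all names are assumptions for illustration.

```python
# Toy data: the true trend is y = 2x, plus hand-picked "noise" of +1 or -1.
train_noise = [1, -1, 1, -1, 1, -1, 1, -1]
test_noise = [1, 1, -1, -1, 1, 1, -1]
train = [(float(i), 2.0 * i + n) for i, n in enumerate(train_noise)]
test = [(i + 0.25, 2.0 * (i + 0.25) + n) for i, n in enumerate(test_noise)]


def fit_mean(data):
    """Underfits: ignores x and always predicts the average output."""
    average = sum(y for _, y in data) / len(data)
    return lambda x: average


def fit_line(data):
    """Reasonable fit: a least-squares straight line."""
    x_mean = sum(x for x, _ in data) / len(data)
    y_mean = sum(y for _, y in data) / len(data)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in data) / sum(
        (x - x_mean) ** 2 for x, _ in data
    )
    return lambda x: y_mean + slope * (x - x_mean)


def fit_memorizer(data):
    """Overfits: repeats the output of the closest training point, noise included."""
    return lambda x: min(data, key=lambda pair: abs(pair[0] - x))[1]


def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


models = {"mean": fit_mean, "line": fit_line, "memorizer": fit_memorizer}
train_mse, test_mse = {}, {}
for name, fit in models.items():
    model = fit(train)
    train_mse[name] = mse(model, train)
    test_mse[name] = mse(model, test)
    print(f"{name:9s} train MSE {train_mse[name]:6.2f}   test MSE {test_mse[name]:6.2f}")
```

In this toy run the mean model has a high error on both sets (underfitting), the memorizer reaches zero training error but does worse than the line on the test points (overfitting), and the line has a similar error on both sets.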

Basic flowchart/steps for supervised learning

  1. Collect your dataset and divide it into a training dataset and a test dataset.
  2. Divide the training set into input objects (features) and output objects (classes or values). Check whether your target variable/output object is categorical (a variable with a limited number of values that correspond to names/labels) or numerical (a variable that has numbers as values).
  3. Decide which type of algorithm to apply. This depends on whether your target variable is categorical, in which case you will use a classifier, or numerical, in which case you will use a regression algorithm.
  4. Next, within the classifier/regression algorithm categories, pick the algorithms (called models in machine learning) that you would like to apply. You can also pick multiple models to evaluate which one is the best option for your dataset. Examples of models include linear/logistic regression, Support Vector Machines, Neural Networks, Decision Trees, etc.
  5. Train the algorithm(s) on your training set and evaluate their performance, ideally on data held out from training (a validation set or cross-validation) so that the scores reflect how well the models generalize. The performance metrics depend on whether you are using a classifier or a regression algorithm; for example, Accuracy, Recall, and Precision are performance metrics for a classifier, while Mean Squared Error and Root Mean Squared Error are performance metrics for a regression algorithm. This step also includes an important sub-step: practitioners typically tune the model's hyperparameters (for example, the learning rate or the number of leaves in a decision tree) while evaluating the models' performance.
  6. After you've evaluated your models and picked the model or models with the best performance, you can use the model for predictions on the test dataset.
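
The steps above can be sketched end to end on a small hand-made data set. This is one possible arrangement under stated assumptions, not a recommended pipeline: the data points, the two candidate models (a majority-class baseline and a nearest-centroid classifier), and every function name are invented for illustration. It also holds out a small validation split for comparing models, which is a common practice added here on top of the steps as listed.

```python
import math
from collections import Counter

# Step 1: collect a dataset (hand-made toy points) and split it.
# Step 2: each example is already a (features, target) pair.
offsets = [(-0.5, 0.3), (0.4, -0.2), (0.1, 0.6), (-0.3, -0.4), (0.6, 0.1)]
dataset = []
for i in range(20):
    label = i % 2                      # categorical target: class 0 or class 1
    center = 4.0 if label == 1 else 1.0
    dx, dy = offsets[i % len(offsets)]
    dataset.append(((center + dx, center + dy), label))
train, validation, test = dataset[:12], dataset[12:16], dataset[16:]


# Steps 3 and 4: the target is categorical, so pick candidate classifiers.
def fit_majority(examples):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda features: most_common


def fit_nearest_centroid(examples):
    """Predict the class whose average training point is closest."""
    centroids = {}
    for cls in {label for _, label in examples}:
        points = [features for features, label in examples if label == cls]
        centroids[cls] = tuple(sum(axis) / len(points) for axis in zip(*points))
    return lambda features: min(centroids, key=lambda c: math.dist(centroids[c], features))


# Step 5: classifier metrics, treating class 1 as the positive class.
def evaluate(model, examples):
    pairs = [(model(features), label) for features, label in examples]
    tp = sum(1 for pred, true in pairs if pred == 1 and true == 1)
    fp = sum(1 for pred, true in pairs if pred == 1 and true == 0)
    fn = sum(1 for pred, true in pairs if pred == 0 and true == 1)
    correct = sum(1 for pred, true in pairs if pred == true)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


candidates = {"majority": fit_majority, "nearest_centroid": fit_nearest_centroid}
fitted = {name: fit(train) for name, fit in candidates.items()}
scores = {name: evaluate(model, validation) for name, model in fitted.items()}
best_name = max(scores, key=lambda name: scores[name]["accuracy"])

# Step 6: use the selected model for predictions on the test set.
test_scores = evaluate(fitted[best_name], test)
print(best_name, scores[best_name], test_scores)
```

Because the two toy classes are far apart, the nearest-centroid model wins easily here; on real data the candidates are usually much closer, and hyperparameter tuning would happen alongside the validation step.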


Courses

  1. Intro to Machine Learning
  2. Machine Learning - Taught by: Andrew Ng
  3. Data Science and Machine Learning with Python - Hands On!
  4. Machine Learning
  5. The Analytics Edge - Taught by: MIT
  6. Google's crash course and certification

Video Resources

  1. Siraj Raval's YouTube channel
  2. Sentdex's YouTube channel

More Information


Building Smart Apps with Azure Machine Learning Studio
