Machine Learning Algorithms - A Review

Technical Report · January 2019

DOI: 10.21275/ART20203995


Batta Mahesh

Abstract: Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a
specific task without being explicitly programmed. Learning algorithms are at work in many applications that we use daily. Every time a
web search engine like Google is used to search the internet, one of the reasons it works so well is that a learning algorithm
has learned how to rank web pages. These algorithms are used for various purposes such as data mining, image processing, and predictive
analytics, to name a few. The main advantage of using machine learning is that, once an algorithm learns what to do with data, it
can do its work automatically. In this paper, a brief review of machine learning algorithms and the future prospects of their vast
applications is presented.

Keywords: Algorithm, Machine Learning, Pseudo Code, Supervised learning, Unsupervised learning, Reinforcement learning

1. Introduction

Since their evolution, humans have been using many types of tools to
accomplish various tasks in a simpler way. The creativity of the human
brain led to the invention of different machines. These machines made
human life easier by enabling people to meet various needs, including
travel, industry, and computing. Machine learning is one among them.

According to Arthur Samuel, machine learning is defined as the field of
study that gives computers the ability to learn without being explicitly
programmed. Arthur Samuel was famous for his checkers-playing program.
Machine learning (ML) is used to teach machines how to handle data more
efficiently. Sometimes, after viewing the data, we cannot interpret it or
extract information from it. In that case, we apply machine learning. With
the abundance of datasets available, the demand for machine learning is on
the rise. Many industries apply machine learning to extract relevant
information from data. The purpose of machine learning is to learn from
the data. Many studies have been done on how to make machines learn by
themselves without being explicitly programmed. Many mathematicians and
programmers apply several approaches to solve this problem, which involves
huge data sets.

Machine learning relies on different algorithms to solve data problems.
Data scientists like to point out that there's no single one-size-fits-all
type of algorithm that is best to solve a problem. The kind of algorithm
employed depends on the kind of problem you wish to solve, the number of
variables, the kind of model that would suit it best, and so on. Here's a
quick look at some of the commonly used algorithms in machine learning (ML).

Supervised Learning
Supervised learning is the machine learning task of learning a function
that maps an input to an output based on example input-output pairs. It
infers a function from labelled training data consisting of a set of
training examples. Supervised machine learning algorithms are those
algorithms which need external assistance. The input dataset is divided
into a train dataset and a test dataset. The train dataset has an output
variable which needs to be predicted or classified. All algorithms learn
some kind of pattern from the training dataset and apply it to the test
dataset for prediction or classification. The workflow of supervised
machine learning algorithms is given in the figure below. The most famous
supervised machine learning algorithms are discussed here.

Figure: Supervised learning Workflow

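
The division of the input dataset into train and test parts described above can be sketched in a few lines. This is only an illustration: the function name `train_test_split`, the 75/25 ratio, the fixed seed, and the toy records are all assumptions made for the example and do not come from the paper.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and cut it into train and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Toy labelled dataset: (input value, output label) pairs.
records = [(x, "high" if x >= 5 else "low") for x in range(12)]
train, test = train_test_split(records)
print(len(train), len(test))
```

A model would be fitted on `train` only and then evaluated on `test`, so that the reported accuracy reflects examples the model has not seen.
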
Decision Tree
A decision tree is a graph that represents choices and their results in
the form of a tree. The nodes in the graph represent an event or choice,
and the edges of the graph represent the decision rules or conditions.
Each tree consists of nodes and branches. Each node represents an
attribute of the group that is to be classified, and each branch
represents a value that the node can take.
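
To make the node-and-branch structure concrete, here is a minimal sketch of how a tree that has already been built could classify one example. The nested-dictionary layout, the `classify` helper, and the weather attributes are assumptions chosen for illustration; none of them come from the paper.

```python
# Internal node: {"attribute": name, "branches": {value: subtree}}
# Leaf: a plain string holding the class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {"high": "stay in", "normal": "play"},
        },
        "overcast": "play",
        "rainy": "stay in",
    },
}


def classify(node, example):
    """Follow the branch matching the example's value until a leaf is reached."""
    while isinstance(node, dict):
        value = example[node["attribute"]]
        node = node["branches"][value]
    return node


print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # play
```

Each dictionary plays the role of a node (an attribute), each key under `branches` is a value that attribute can take, and each string is a final classification.
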

Figure: Decision Tree

Decision Tree Pseudo Code:
def decisionTreeLearning(examples, attributes, parent_examples):
    if len(examples) == 0:
        # no training data left: return the most probable answer
        return pluralityValue(parent_examples)
    elif len(attributes) == 0:
        return pluralityValue(examples)
    elif (all examples classify the same):
        return their classification
    # choose the most promising attribute to condition on
    A = max(attributes, key=lambda a: importance(a, examples))
    tree = new Tree(root=A)
    for value in A.values():
        exs = [e for e in examples if e.A == value]
        remaining = [a for a in attributes if a != A]
        subtree = decisionTreeLearning(exs, remaining, examples)
        # note: an implementation should probably wrap the trivial-case
        # returns into trees for consistency
        tree.addSubtreeAsBranch(subtree, label=(A, value))
    return tree

Naive Bayes
Naive Bayes is a classification technique based on Bayes' theorem with an
assumption of independence among predictors. In simple terms, a Naive
Bayes classifier assumes that the presence of a particular feature in a
class is unrelated to the presence of any other feature. Naive Bayes
mainly targets text classification. It is mainly used for clustering and
classification, depending on the conditional probability of an event
happening.

Figure: Naive Bayes

Pseudo Code of Naive Bayes

Input: Training dataset T,
F = (f1, f2, f3, .., fn) // values of the predictor variables in the
testing dataset.
Output: A class of testing dataset.
1) Read the training dataset T;
2) Calculate the mean and standard deviation of the predictor variables
   in each class;
3) Repeat: calculate the probability of fi using the Gaussian density
   equation in each class; until the probability of all predictor
   variables (f1, f2, f3, .., fn) has been calculated.
4) Calculate the likelihood for each class;
5) Get the greatest likelihood.

Support Vector Machine
Another widely used state-of-the-art machine learning technique is the
Support Vector Machine (SVM). In machine learning, support-vector
machines are supervised learning models with associated learning
algorithms that analyze data used for classification and regression
analysis. In addition to performing linear classification, SVMs can
efficiently perform non-linear classification using what is called the
kernel trick, implicitly mapping their inputs into high-dimensional
feature spaces. Basically, an SVM draws margins between the classes. The
margins are drawn in such a fashion that the distance between the margin
and the classes is maximum, thereby minimizing the classification error.

Figure: Support Vector Machine

Pseudo Code of Support Vector Machine

initialize y_i = Y_I for i ∈ I
repeat
    compute SVM solution w, b for data set with imputed labels
    compute outputs f_i = <w, x_i> + b for all x_i in positive bags
    set y_i = sgn(f_i) for every i ∈ I, Y_I = 1
    for (every positive bag B_I)
        if (Σ_{i∈I} (1 + y_i)/2 == 0)
            compute i* = arg max_{i∈I} f_i
            set y_i* = 1
        end
    end
while (imputed labels have changed)
output (w, b)
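
The margin idea described above can also be illustrated with a much simpler setting than the bag-based procedure in the pseudo code. The sketch below is an assumption-laden toy: it trains a plain linear soft-margin SVM on two-dimensional points by subgradient descent on the hinge loss, without any kernel. The function names, the hyperparameters (`lam`, `lr`, `epochs`), and the data are all hypothetical.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by subgradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]  # gradient of the regulariser
        grad_b = 0.0
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:  # point lies inside the margin or is misclassified
                grad_w[0] -= y * x1 / n
                grad_w[1] -= y * x2 / n
                grad_b -= y / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def predict(w, b, point):
    """Return +1 or -1 depending on the side of the separating line."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1


points = [(2, 2), (3, 3), (3, 2), (2, 3), (-2, -2), (-3, -3), (-3, -2), (-2, -3)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
print([predict(w, b, p) for p in points])
```

Only points with a margin below 1 contribute to the update, which mirrors the idea that the decision boundary is determined by the examples closest to it.
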
Unsupervised Learning:
These algorithms are called unsupervised because, unlike the supervised
learning above, there are no correct answers and there is no teacher.
Algorithms are left to their own devices to discover and present the
interesting structure in the data. Unsupervised learning algorithms learn
a few features from the data. When new data is introduced, they use the
previously learned features to recognize the class of the data.
Unsupervised learning is mainly used for clustering and feature reduction.

Figure: Unsupervised Learning

Principal Component Analysis
Principal component analysis is a statistical procedure that uses an
orthogonal transformation to convert a set of observations of possibly
correlated variables into a set of values of linearly uncorrelated
variables called principal components. In this way the dimension of the
data is reduced to make computations faster and easier. It is used to
explain the variance-covariance structure of a set of variables through
linear combinations. It is often used as a dimensionality-reduction
technique.

Figure: Principal Component Analysis
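
As a rough illustration of the dimensionality reduction described above, the sketch below finds the first principal component of two-dimensional data and projects each point onto it, reducing two coordinates to one. It is restricted to 2-D so that the principal axis can be obtained from a closed-form angle; a general implementation would use an eigendecomposition of the covariance matrix instead. The function name and the data are assumptions for the example.

```python
import math


def first_principal_direction(points):
    """Return the centered points and the unit direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Closed-form orientation of the major axis of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return centered, (math.cos(angle), math.sin(angle))


points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
centered, direction = first_principal_direction(points)

# Project every centered point onto the principal direction: 2-D becomes 1-D.
reduced = [x * direction[0] + y * direction[1] for x, y in centered]
print(direction)
print(reduced)
```

Because the toy points lie exactly on the line y = 2x, the direction found is proportional to (1, 2) and the single reduced coordinate keeps all of the variance.
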

K-Means Clustering
K-means is one of the simplest unsupervised learning algorithms that solve
the well-known clustering problem. The procedure follows a simple and easy
way to classify a given data set through a certain number of clusters. The
main idea is to define k centers, one for each cluster. These centers
should be placed in a cunning way, because different locations cause
different results. So, the better choice is to place them as far away from
each other as possible. The next step is to take each point belonging to a
given data set and associate it to the nearest center. When no point is
pending, the first step is completed and an early grouping is done. At
this point we need to re-calculate k new centroids as barycenters of the
clusters resulting from the previous step.
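
The two alternating steps described above (associate each point with the nearest center, then recompute each center as the barycenter of its cluster) can be sketched as follows. This is a minimal version under stated assumptions: the initial centers are supplied by the caller rather than chosen automatically, and the function name, the iteration cap, and the data are hypothetical.

```python
import math


def k_means(points, centers, max_iterations=100):
    """Alternate assignment and barycenter updates until the centers stop moving."""
    centers = list(centers)
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)

        # Update step: each new center is the mean of the points in its cluster.
        new_centers = []
        for old, cluster in zip(centers, clusters):
            if cluster:
                new_centers.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centers.append(old)  # keep a center that attracted no points
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, centers=[(1, 1), (8, 9)])
print(centers)
print(clusters)
```

The initial centers here are deliberately placed far apart, in line with the advice in the text; a poor starting choice can lead the same procedure to a different grouping.
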

Figure: Pseudo Code of K-Means Clustering

Figure: K-Means Clustering

Semi-Supervised Learning:
Semi-supervised machine learning is a combination of supervised and
unsupervised machine learning methods. It can be fruitful in those areas
of machine learning and data mining where unlabeled data is already
present and getting labeled data is a tedious process. With the more
common supervised machine learning methods, you train a machine learning
algorithm on a "labeled" dataset in which each record includes the outcome
information. Some of the semi-supervised learning algorithms are discussed
below.

Transductive SVM
Transductive support vector machines (TSVM) have been widely used as a
means of treating partially labeled data in semi-supervised learning.
There has been some mystery around them because of a lack of understanding
of their foundation in generalization. A TSVM is used to label the
unlabeled data in such a way that the margin between the labeled and
unlabeled data is maximum. Finding an exact solution with TSVM is an
NP-hard problem.

Generative Models
A generative model is one that can generate data. It models both the
features and the class (i.e. the complete data). If we model P(x,y), we
can use this probability distribution to generate data points, and hence
all algorithms modeling P(x,y) are generative. One labeled example per
component is enough to confirm the mixture distribution.

Self-Training
In self-training, a classifier is trained with a portion of labeled data.
The classifier is then fed with unlabeled data. The unlabeled points and
their predicted labels are added together to the training set. This
procedure is then repeated further. Since the classifier is teaching
itself, the procedure is called self-training.

Reinforcement Learning
Reinforcement learning is an area of machine learning concerned with how
software agents ought to take actions in an environment in order to
maximize some notion of cumulative reward. Reinforcement learning is one
of three basic machine learning paradigms, alongside supervised learning
and unsupervised learning.

Figure: Reinforcement Learning

Multitask Learning
Multi-task learning is a sub-field of machine learning that aims to solve
multiple different tasks at the same time, by taking advantage of the
similarities between different tasks. This can improve the learning
efficiency and also act as a regularizer. Formally, if there are n tasks
(conventional deep learning approaches aim to solve just 1 task using 1
particular model), where these n tasks or a subset of them are related to
each other but not exactly identical, Multi-Task Learning (MTL) will help
in improving the learning of a particular model by using the knowledge
contained in all the n tasks.

Ensemble Learning
Ensemble learning is the process by which multiple models, such as
classifiers or experts, are strategically generated and combined to solve
a particular computational intelligence problem. Ensemble learning is
primarily used to improve the performance of a model, or reduce the
likelihood of an unfortunate selection of a poor one. Other applications
of ensemble learning include assigning a confidence to the decision made
by the model, selecting optimal features, data fusion, incremental
learning, non-stationary learning, and error correction.

Boosting:
The term "Boosting" refers to a family of algorithms which convert weak
learners to strong learners. Boosting is a technique in ensemble learning
which is used to decrease bias and variance. Boosting is based on the
question posed by Kearns and Valiant: "Can a set of weak learners create a
single strong learner?" A weak learner is defined to be a classifier that
is only slightly correlated with the true classification; a strong learner
is a classifier that is arbitrarily well-correlated with the true
classification.

Figure: Boosting Pseudo code

Bagging, or bootstrap aggregating, is applied where the accuracy and
stability of a machine learning algorithm need to be increased. It is
applicable in classification and regression. Bagging also decreases
variance and helps in handling overfitting.
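
Bagging, as described above, can be sketched with a deliberately simple base learner. In the example below each model is a one-dimensional threshold rule trained on a bootstrap sample (drawn with replacement), and the ensemble predicts by majority vote. The stump learner, the function names, the number of models, the seed, and the toy data are all assumptions for illustration and are not taken from the paper's figure.

```python
import random
from collections import Counter


def fit_stump(sample):
    """Fit a 1-D rule 'predict 1 if x > t' with the fewest errors on the sample."""
    labels = {y for _, y in sample}
    if len(labels) == 1:  # the bootstrap sample happened to contain one class
        only = labels.pop()
        return lambda x: only
    values = sorted({x for x, _ in sample})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((x > t) != bool(y) for x, y in sample)

    best = min(candidates, key=errors)
    return lambda x: int(x > best)


def bagging(data, n_models=25, seed=0):
    """Train one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]  # sampling with replacement
        models.append(fit_stump(sample))
    return models


def predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


data = [(x, 0) for x in range(5)] + [(x, 1) for x in range(5, 10)]
models = bagging(data)
print(predict(models, 2), predict(models, 7))
```

Individual stumps place their thresholds differently depending on which points their sample contains; averaging their votes is what reduces the variance of the final prediction.
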

Figure: Pseudo code of Bagging

Neural Networks
A neural network is a series of algorithms that endeavors to recognize
underlying relationships in a set of data through a process that mimics
the way the human brain operates. In this sense, neural networks refer to
systems of neurons, either organic or artificial in nature. Neural
networks can adapt to changing input, so the network generates the best
possible result without needing to redesign the output criteria. The
concept of neural networks, which has its roots in artificial
intelligence, is swiftly gaining popularity in the development of trading
systems.

Figure: Neural Networks

An artificial neural network behaves the same way. It works on three
layers. The input layer takes input. The hidden layer processes the input.
Finally, the output layer sends the calculated output.

Supervised Neural Network
In the supervised neural network, the output for each input is already
known. The predicted output of the neural network is compared with the
actual output. Based on the error, the parameters are changed, and the
input is then fed into the neural network again. Supervised learning is
used in feed-forward neural networks.

Figure: Supervised Neural Network

Unsupervised Neural Network
The neural network has no prior clue about the output for the input. The
main job of the network is to categorize the data according to some
similarities. The neural network checks the correlation between various
inputs and groups them.

Figure: Unsupervised Neural Network

Reinforced Neural Network
Reinforcement learning refers to goal-oriented algorithms, which learn how
to attain a complex objective (goal) or maximize along a particular
dimension over many steps; for example, maximize the points won in a game
over many moves. They can start from a blank slate, and under the right
conditions they achieve superhuman performance. Like a child incentivized
by spankings and candy, these algorithms are penalized when they make the
wrong decisions and rewarded when they make the right ones; this is
reinforcement.

Figure: Reinforced Neural Network

Instance-Based Learning
Instance-based learning refers to a family of techniques for
classification and regression, which produce a class label/prediction
based on the similarity of the query to its nearest neighbor(s) in the
training set. In explicit contrast to other methods such as decision trees
and neural networks, instance-based learning algorithms do not create an
abstraction from specific instances. Rather, they simply store all the
data, and at query time derive an answer from an examination of the
query's nearest neighbor(s).

K-Nearest Neighbor
The k-nearest neighbors (KNN) algorithm is a simple, supervised machine
learning algorithm that can be used to solve both classification and
regression problems. It's easy to implement and understand, but has a
major drawback of becoming significantly slower as the size of the data in
use grows.

Figure: Pseudo code of KNN
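
A minimal sketch of KNN classification, assuming Euclidean distance and a simple majority vote, is given below. The function name `knn_classify`, the choice k = 3, and the toy training points are hypothetical and serve only to illustrate the idea.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    # Every stored example is compared with the query: nothing is learned ahead of time.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]
print(knn_classify(training, (1.2, 1.4)))  # A
print(knn_classify(training, (6.4, 6.1)))  # B
```

Because each query sorts the whole training set by distance, the cost of a prediction grows with the amount of stored data, which is the drawback noted above.
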

2. Conclusion
Machine learning can be supervised or unsupervised. If you have a smaller
amount of data that is clearly labelled for training, opt for supervised
learning. Unsupervised learning would generally give better performance
and results for large data sets. If you have a huge data set easily
available, go for deep learning techniques. You have also learned about
Reinforcement Learning and Deep Reinforcement Learning. You now know what
neural networks are, along with their applications and limitations. This
paper surveys various machine learning algorithms. Today each and every
person is using machine learning, knowingly or unknowingly, from getting a
recommended product while shopping online to updating photos on social
networking sites. This paper gives an introduction to most of the popular
machine learning

