Continued from Part-1


Decision trees:

Decision trees are a type of machine learning algorithm well suited to choosing between several courses of action. At each step, the tree picks a branch based on the expected outcomes inferred from the given data. They are drawn as inverted trees, with the root at the top.

Example: We can build a decision tree for the simple task of going to the office. At the root, you decide whether to go by bus or by car. If you choose the bus, you need to decide which route to walk to the bus stop, and then choose between bus numbers. If you choose the car, you might take a different route, check the fuel level, refuel if required, and so on.
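To make the commute example concrete, here is a minimal sketch of a hand-written decision tree in Python. Note that a machine learning algorithm would learn the questions from data, whereas this tree is typed in by hand. The bus numbers, street names, and the `decide` helper are assumptions made up for illustration.

```python
# Hand-written decision tree for the commute example.
# Internal nodes are dicts holding a question and one branch per answer;
# leaves are plain strings describing the chosen action.
# All questions, answers, and actions are illustrative assumptions.
commute_tree = {
    "question": "transport",
    "branches": {
        "bus": {
            "question": "bus_number",
            "branches": {
                "42": "walk to the stop on Main Street and take bus 42",
                "17": "walk to the stop on Park Road and take bus 17",
            },
        },
        "car": {
            "question": "fuel_ok",
            "branches": {
                "yes": "drive straight to the office",
                "no": "stop at the fuel station, then drive to the office",
            },
        },
    },
}


def decide(tree, answers):
    """Walk from the root to a leaf, following the answer to each question."""
    node = tree
    while isinstance(node, dict):
        answer = answers[node["question"]]
        node = node["branches"][answer]
    return node


plan = decide(commute_tree, {"transport": "car", "fuel_ok": "no"})
print(plan)
```

Each call to `decide` follows exactly one path from the root to a leaf, which is how a trained decision tree makes a prediction as well.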

Deep learning:

Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign, or to distinguish a pedestrian from a lamppost.

Dimensionality reduction:

This is the process of reducing the number of variables (features) under consideration by obtaining a smaller set of principal variables. In machine learning, we commonly remove variables that we believe do not contribute to the model's predictions.
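One simple way to remove variables that contribute little is to drop columns whose values barely change. The sketch below does this with a variance threshold. This is feature selection, only one form of dimensionality reduction; methods such as principal component analysis build new variables instead. The column names, data, and threshold are assumptions for illustration.

```python
from statistics import pvariance


def drop_low_variance(rows, names, threshold=0.01):
    """Keep only the columns whose variance exceeds the threshold."""
    columns = list(zip(*rows))  # turn rows into columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    kept_names = [names[i] for i in keep]
    reduced = [[row[i] for i in keep] for row in rows]
    return kept_names, reduced


# Hypothetical dataset: the third column is constant and carries no information.
names = ["height_cm", "weight_kg", "planet_id"]
rows = [
    [150.0, 50.0, 3.0],
    [160.0, 58.0, 3.0],
    [170.0, 66.0, 3.0],
    [180.0, 80.0, 3.0],
]

kept_names, reduced = drop_low_variance(rows, names)
print(kept_names)
```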

Discrete variable:

This is a type of variable that can take only a countable set of distinct values, with no values in between.

Example: Age of students in a class, counted in whole years.

Feature engineering:

This is the process of creating new features from the given data using domain knowledge, so that machine learning algorithms work better.

Example: A German car manufacturer planning to launch a car model in India might consider increasing the ground clearance because of poor road conditions, and increasing or decreasing the length of the car to comply with local regulations. Adapting the product with local knowledge is analogous to feature engineering. In a dataset, the equivalent would be deriving a house's age from its year of construction.
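In data terms, feature engineering usually means computing new columns from existing ones. Here is a minimal sketch, assuming a hypothetical housing dataset; the field names, values, and the `current_year` default are made up for illustration.

```python
# Hypothetical raw records; field names and values are assumptions.
houses = [
    {"area_sqft": 1500, "rooms": 5, "year_built": 1995},
    {"area_sqft": 1800, "rooms": 4, "year_built": 2010},
]


def add_features(house, current_year=2024):
    """Return a copy of the record with two derived features added."""
    enriched = dict(house)
    # Domain knowledge: room size often matters more than total area alone.
    enriched["area_per_room"] = house["area_sqft"] / house["rooms"]
    # Domain knowledge: the age of a house is easier to use than its build year.
    enriched["age_years"] = current_year - house["year_built"]
    return enriched


engineered = [add_features(h) for h in houses]
print(engineered[0])
```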

Gradient boosting:

It is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. Each new weak model is trained to correct the errors of the models before it, which improves the accuracy of the combined model.
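The idea of an ensemble of weak models can be sketched in a few lines. The example below is a simplified illustration for regression with squared error, where each weak model is a one-split tree (a "stump") fitted to the remaining errors of the ensemble so far. It is not how production libraries implement it, and the data, function names, and parameters are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to what is left over."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]

model = fit_boosted(xs, ys)
baseline_mse = mse(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mse(ys, [predict_boosted(model, x) for x in xs])
print(baseline_mse, boosted_mse)
```

On this training data the boosted ensemble should show a lower error than simply predicting the mean, because every round removes part of the remaining error.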

Gradient descent:

Gradient descent is an iterative optimization algorithm for finding the minimum of a function. In machine learning, it is used to reduce the prediction error. Let's say you have a model of the form Y = a + bx, where 'Y' is the predicted value, 'a' and 'b' are the weights of the model, and 'x' is the input variable. You calculate the prediction error (the difference between the predicted and actual values), and gradient descent then adjusts a and b in small steps, each time in the direction that reduces the error, until the gap between predicted and actual values is as small as possible.
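Here is a minimal sketch of gradient descent for the model Y = a + bx, assuming mean squared error as the measure of prediction error. The data, learning rate, and number of steps are assumptions chosen so that the example converges.

```python
def gradient_descent(xs, ys, learning_rate=0.05, n_steps=2000):
    """Fit Y = a + b*x by repeatedly stepping against the gradient of the MSE."""
    a, b = 0.0, 0.0  # start with both weights at zero
    n = len(xs)
    for _ in range(n_steps):
        # Prediction error for every data point with the current weights.
        errors = [(a + b * x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to a and b.
        grad_a = 2 * sum(errors) / n
        grad_b = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        # Move each weight a small step in the direction that lowers the error.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Hypothetical data generated from Y = 1 + 2x, so the ideal weights are known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

a, b = gradient_descent(xs, ys)
print(round(a, 3), round(b, 3))
```

If the learning rate is too large, the steps overshoot and the weights diverge; if it is too small, many more steps are needed.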


Histogram:

A histogram is a graphical display of data using bars of different heights. It shows ranges (bins) of a single quantitative variable along the x-axis and the frequency of values in each range on the y-axis.

Example: To see how the heights of students in a class are distributed, you put height ranges such as 140-149 cm and 150-159 cm on the x-axis and the number of students in each range on the y-axis. Counting the girls and boys in a class, by contrast, compares categories rather than ranges of a number, so it calls for a bar chart.
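The counting behind a histogram can be sketched as follows, assuming a made-up list of student heights and bins that are 10 cm wide. The `histogram` helper and its parameters are illustrative names, not a library API.

```python
def histogram(values, bin_start, bin_width, n_bins):
    """Count how many values fall into each equal-width bin."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - bin_start) // bin_width)
        if 0 <= index < n_bins:  # ignore values outside the covered range
            counts[index] += 1
    return counts


# Hypothetical heights of ten students, in centimetres.
heights_cm = [142, 147, 151, 153, 155, 158, 160, 161, 164, 172]
counts = histogram(heights_cm, bin_start=140, bin_width=10, n_bins=4)

# Print a text version: one row per bin, one '#' per student.
for i, count in enumerate(counts):
    low = 140 + 10 * i
    print(f"{low}-{low + 9} cm: {'#' * count}")
```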

k-means clustering:

K-means clustering is a type of unsupervised learning that is particularly useful for discovering structure in data. Unlike supervised learning, unsupervised learning does not train the algorithm on labeled data, which makes it useful when no labels are available. Here, k is the number of groups (clusters) into which the algorithm partitions the data over a few iterative steps.
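The iterative steps can be sketched on one-dimensional data: assign every value to its nearest cluster center, move each center to the mean of its values, and repeat. The data, the starting centers, and the fixed number of iterations below are assumptions for illustration; real implementations usually pick starting centers more carefully and stop when nothing changes.

```python
def k_means_1d(values, centroids, n_iterations=10):
    """Cluster numbers into len(centroids) groups by repeated assign-and-update."""
    centroids = list(centroids)
    for _ in range(n_iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with two obvious groups; k = 2 starting centers.
values = [1.0, 1.5, 2.0, 8.0, 9.0, 10.0]
centroids, clusters = k_means_1d(values, centroids=[1.0, 2.0])
print(centroids)
print(clusters)
```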

k-nearest neighbors:

k-nearest neighbors (or k-NN for short) is a simple machine learning algorithm that categorizes an input by a majority vote of its k nearest neighbors in the training data. It makes no assumption about how the input data is distributed, which makes it useful when little is known about that distribution.

Example: Fruits and vegetables can be classified based on their taste. Fruits are generally sweeter than vegetables, so a new item is labeled according to the known items whose sweetness is closest to its own. The algorithm makes no assumption about how sweetness is distributed among fruits or vegetables.
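A minimal sketch of the fruit and vegetable example, assuming each item is described by two made-up scores (sweetness and crunchiness, each from 0 to 10). The scores, the `classify` helper, and the choice of k = 3 are assumptions for illustration.

```python
import math
from collections import Counter

# Hypothetical training data: (sweetness, crunchiness) and the known label.
training = [
    ((9.0, 2.0), "fruit"),
    ((8.0, 3.0), "fruit"),
    ((7.0, 1.0), "fruit"),
    ((8.0, 4.0), "fruit"),
    ((2.0, 8.0), "vegetable"),
    ((3.0, 7.0), "vegetable"),
    ((1.0, 9.0), "vegetable"),
    ((2.0, 6.0), "vegetable"),
]


def classify(point, training, k=3):
    """Label a point by majority vote among its k nearest training items."""
    by_distance = sorted(training, key=lambda item: math.dist(point, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify((8.0, 2.0), training))  # a sweet, soft item
print(classify((2.0, 7.0), training))  # a crunchy item that is not sweet
```

There is no training step here: the algorithm simply stores the labeled items and measures distances when a new item arrives.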

Linear regression:

It is a machine learning algorithm used to model the relationship between two variables, so that one can be predicted from the other. It is used on datasets where a continuous variable needs to be predicted. Example: Sales forecasting, housing price prediction.
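For two variables, the best-fitting straight line can be computed directly with the least squares formulas. Here is a minimal sketch, assuming made-up housing data in which price depends only on area; the numbers and function names are for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: area in square metres, price in thousands.
areas = [50.0, 60.0, 80.0, 100.0]
prices = [150.0, 180.0, 240.0, 300.0]

intercept, slope = fit_line(areas, prices)
predicted_price = intercept + slope * 70.0  # predict for an unseen 70 m² house
print(round(slope, 2), round(intercept, 2), round(predicted_price, 2))
```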

Logistic regression:

Logistic regression is used when you need to predict which of two outcomes will occur, given a set of data.

Example: Predicting whether an email is spam or not spam. Like heads or tails in a coin toss, there are only two possible outcomes.
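Logistic regression turns a weighted sum of the inputs into a probability between 0 and 1 using the sigmoid function. The sketch below fits such a model with plain gradient descent, assuming hypothetical data on hours studied and whether a student passed (1) or failed (0). The data, learning rate, and step count are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=0.1, n_steps=5000):
    """Fit P(label = 1) = sigmoid(a + b*x) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        errors = [sigmoid(a + b * x) - y for x, y in zip(xs, labels)]
        a -= learning_rate * sum(errors) / n
        b -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
    return a, b


# Hypothetical data: hours studied and the outcome (1 = passed, 0 = failed).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(hours, passed)
p_low = sigmoid(a + b * 1.0)   # estimated chance of passing after 1 hour
p_high = sigmoid(a + b * 8.0)  # estimated chance of passing after 8 hours
print(round(p_low, 2), round(p_high, 2))
```

To get one of the two outcomes, you would typically predict "passed" when the probability is above 0.5 and "failed" otherwise.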

Continued in Part-3
