Netflix and Amazon have gotten pretty great at their game – they always seem to know what content or product you’d love to see / purchase. Don’t you just love to see everything already curated to your taste and preference?
While most of us know the secret sauce behind the nifty Recommendation Engine of Netflix and Amazon (Machine Learning, of course!), how many of us are familiar with the inner mechanisms of Machine Learning?
To put it straight – How does Machine Learning work?
In essence, Machine Learning is a data analytics technique (a subset of AI) that aims to “learn” from experience and enable machines to perform tasks that require intelligence. Machine Learning algorithms apply computational methods to extract information and learn directly from data without being explicitly programmed for it (that is, without relying on a predetermined equation as a model).
The Anatomy of Machine Learning systems
All ML systems can be broken down into three parts:
- Model – the component that makes the identifications, that is, the predictions.
- Parameters – refers to the factors used by the model to reach its decisions (predictions).
- Learner – the component that adjusts the parameters (and as a whole, the model) by considering the differences in predictions compared to the actual outcome.
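The three components above can be sketched in a few lines of plain Python. This is a toy illustration, not a real ML library – the names (`Model`, `learn`) and the training values are invented, and the “model” here is just a straight line:

```python
class Model:
    def __init__(self):
        # Parameters: the factors the model uses to reach its predictions.
        self.weight = 0.0
        self.bias = 0.0

    def predict(self, x):
        # Model: turns an input into a prediction.
        return self.weight * x + self.bias

def learn(model, x, actual, rate=0.01):
    # Learner: compares the prediction with the actual outcome and
    # nudges the parameters to shrink the difference.
    error = model.predict(x) - actual
    model.weight -= rate * error * x
    model.bias -= rate * error

model = Model()
for _ in range(200):                 # repeated exposure to one example
    learn(model, x=2.0, actual=5.0)  # one labelled example: at x = 2, the actual outcome is 5

print(round(model.predict(2.0), 2))  # the prediction converges towards 5
```

Each pass through `learn` shrinks the gap between prediction and actual outcome a little – which is exactly the division of labour described above.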
Types of Machine Learning
Now that you are familiar with the core components of ML systems, it’s time to take a look at the different ways they “learn.”
In Supervised Learning, a model is explicitly trained on how to map the input to the output. A supervised learning algorithm takes a recognized set of input data along with known responses (output) to that data and trains the model to generate reasonable predictions in response to new input data.
Supervised learning uses two approaches to develop predictive models –
- Classification – As the name suggests, this technique classifies input data into different categories by labelling them. It is used to predict discrete responses (for instance, if a cancerous cell is benign or malignant). Medical imaging, speech recognition, and credit scoring are three popular use cases of classification.
- Regression – This technique predicts continuous responses – for instance, fluctuations in temperature – by identifying patterns in the input data. Regression is used in weather forecasting, electricity load forecasting, and algorithmic trading.
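To make the two approaches concrete, here is a hedged sketch in plain Python – the datasets (tumour sizes, hourly temperatures) are invented for illustration. Classification is shown as a simple nearest-neighbour rule, regression as a least-squares line:

```python
# Classification: predict a discrete label from labelled examples.
# Toy (tumour_size, label) pairs; a 1-nearest-neighbour rule.
labelled = [(1.0, "benign"), (1.5, "benign"), (4.0, "malignant"), (5.0, "malignant")]

def classify(size):
    # The new input gets the label of the closest known example.
    return min(labelled, key=lambda pair: abs(pair[0] - size))[1]

# Regression: predict a continuous response. A least-squares line
# fitted to toy (hour, temperature) pairs.
hours = [0, 1, 2, 3]
temps = [10.0, 12.1, 13.9, 16.0]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(temps) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, temps)) / \
        sum((x - mean_x) ** 2 for x in hours)
intercept = mean_y - slope * mean_x

print(classify(4.5))                    # a discrete response ("malignant")
print(round(slope * 4 + intercept, 1))  # a continuous response at hour 4
```

Both models are trained on known input–output pairs – the defining trait of supervised learning – and then asked about new inputs they have never seen.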
The Unsupervised Learning approach uses unlabelled data and seeks to unravel the hidden patterns within it. Thus, the technique draws inferences from datasets consisting of input data devoid of labelled responses.
- Clustering – One of the most common unsupervised learning methods, clustering is an exploratory data analysis technique that categorizes data into “clusters” without any known information about the cluster credentials. Object recognition and gene sequence analysis are two examples of clustering.
- Dimensionality Reduction – Dimensionality Reduction cleanses the input data of all the redundant information and retains only the essential parts. Thus, the data not only becomes clean, but it also reduces in size, thereby taking up less storage space.
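A minimal clustering sketch, assuming two clusters of one-dimensional points (the data and the initial centres are made up; real libraries handle many dimensions and more clusters). It is the classic k-means loop: assign each point to its nearest centre, then move each centre to the mean of its cluster – no labels needed at any point:

```python
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centres = [0.0, 10.0]  # initial guesses for the two cluster centres

for _ in range(10):  # a few refinement rounds are enough for this toy data
    # Assignment step: each point joins its nearest centre's cluster.
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: abs(p - centres[i]))
        clusters[nearest].append(p)
    # Update step: each centre moves to the mean of its cluster.
    centres = [sum(c) / len(c) for c in clusters]

print([round(c, 1) for c in centres])  # → [1.0, 8.1]
```

The algorithm discovers that the points fall into two groups around 1.0 and 8.1 without ever being told what the groups mean – that inference is left to a human.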
Reinforcement Learning aims to build self-sustained and self-learning models that can learn and improve through trial and error. In the learning (training) process, if the algorithm can successfully perform specific actions, reward signals are triggered. The reward signals function like guiding lights for the algorithms. There are two reward signals:
- A Positive signal is triggered to encourage and continue a particular sequence of action.
- A Negative signal is a penalty for a particular wrong action. It demands the correction of the mistake before proceeding further in the training process.
Reinforcement Learning is widely used in video games. It is also the mechanism behind self-driving cars.
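A toy sketch of the reward mechanism, assuming an invented two-action “game” in which one action triggers the positive signal and the other the negative one. The agent learns purely by trial and error, nudging its value estimate for each action towards the reward it received:

```python
import random

random.seed(0)
values = {"left": 0.0, "right": 0.0}  # the agent's estimate of each action

def reward(action):
    # The environment: "right" triggers a positive signal (+1),
    # "left" a negative signal (-1) penalizing the wrong action.
    return 1.0 if action == "right" else -1.0

for _ in range(100):
    # Mostly exploit the best-known action, occasionally explore.
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(values, key=values.get)
    # Nudge the value estimate towards the reward signal received.
    values[action] += 0.1 * (reward(action) - values[action])

print(max(values, key=values.get))  # → right
```

After a few punished attempts at “left”, the negative signal steers the agent towards “right” – the reward signals acting, as described above, like guiding lights.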
Inside the ‘learning’ function of ML algorithms
Behind the functioning of ML algorithms and how they learn through experience lie three common principles.
Learning a Function
The first step in the learning process is where ML algorithms learn the target function (f) that best maps the input variable (X) to the output variable (Y). So,
Y = f(X).
Here, the form of the target function (f) is unknown, hence the predictive modelling.
In this general learning phase, the ML algorithm learns how to make future predictions (Y) based on new input variables (X). Naturally, the process isn’t free of error. Here, the error (e) exists independent of the input data (X). So,
Y = f(X) + e
Since the input data (X) might not contain enough attributes to fully characterize the mapping from X to Y, some error (e) always remains. It is called irreducible error because, irrespective of how good the algorithm gets at estimating the target function (f), you cannot reduce the error (e).
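We can see the irreducible error numerically. In this sketch the target function f is known (which it never is in practice) and the noise values are invented, so even a *perfect* estimate of f leaves a non-zero error:

```python
import random

random.seed(1)

def f(x):
    # The (normally unknown) target function.
    return 3 * x + 2

# Observations follow Y = f(X) + e, where e is noise independent of X.
xs = [x / 10 for x in range(100)]
ys = [f(x) + random.gauss(0, 0.5) for x in xs]

# Even with the true f in hand, the residuals below are entirely
# the noise e, not a shortcoming of the model.
residuals = [y - f(x) for x, y in zip(xs, ys)]
mean_abs_error = sum(abs(r) for r in residuals) / len(residuals)
print(round(mean_abs_error, 2))  # non-zero: the irreducible error
```

No amount of extra training can drive this error to zero – it lives in the data, not in the algorithm.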
Making predictions and learning how to improve them
In the earlier point, we understood how an ML algorithm learns a target function (f). And we already know that our one and only goal here is to find the best possible way to map Y from X. In other words, we need to find the most accurate way to map the input to the output.
There will be errors (e), yes, but the algorithm has to keep estimating how far off its predictions are from the desired output (Y) and how to get closer. In this process, it continually adjusts its parameters to better map the input (X) to the output (Y). This continues until the predictions reach a high degree of accuracy against the desired output.
The ‘Gradient Descent’ learning approach
It may be true that we have been successful in creating ‘intelligent’ machines, but their pace of learning differs – machines tend to take it slow. They follow the “gradient descent” learning process – you don’t take the leap at once; you take baby steps and slowly descend from the top (the metaphor here is that of climbing down a mountain).
While descending a mountain, you don’t jump or run or hurl yourself down in one go; instead, you take measured and calculated steps to get down to the bottom safely and avoid mishaps.
ML algorithms use this approach – they keep adjusting themselves to the changing parameters (picture the rough and unexplored terrain of a mountain again) to get the desired outcome finally.
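A minimal gradient-descent sketch on a one-parameter model – the data and the step size are invented for illustration. Each iteration computes which way is “downhill” on the error surface and takes one small, measured step in that direction:

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # toy data following y = 2x

w = 0.0           # start at the "top of the mountain"
rate = 0.01       # step size: small, measured steps

for _ in range(500):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= rate * grad  # one small step downhill

print(round(w, 2))  # → 2.0
```

Had we taken giant steps instead (a large `rate`), the parameter would overshoot the bottom and bounce around – the code-level version of hurling yourself down the mountain.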
The fundamental goal of all Machine Learning algorithms is to develop a predictive model that best generalizes to specific input data. Since ML algorithms and systems train themselves through different kinds of inputs/variables/parameters, it is imperative to have a vast pool of data. This is to allow the ML algorithms to interact with different kinds of data to learn their behaviour and produce the desired outcomes.
We hope that with this post we could demystify the workings of Machine Learning for you!