Markov chains are common, intuitive, and used in many domains, such as automated content creation, text generation, financial modeling, and cruise control systems. Google's PageRank algorithm, which helps determine the order of search results, is based on a Markov chain.
Markov chains are relatively simple and can be implemented without advanced mathematics or statistics; basic probability is enough. A good understanding of Markov chains also makes it easier to learn probabilistic modeling and other data science techniques.
This article will give you a deep understanding of what Markov chains are and how they work, with the help of examples.
What is a Markov Chain?
A Markov chain is a mathematical model that gives the probabilities of the next state based solely on the current state. For a process with this property, predictions made from the current state alone are as good as those made by observing the entire history of the process.
It is a model that moves from one state to another according to fixed transition probabilities. The defining characteristic of a Markov chain is that no matter how the current state was reached, the probabilities of the possible future states are fixed. The outcome of the next step depends solely on the current state and the time elapsed between the states.
Markov Chain Concept with Examples
Suppose you want to predict tomorrow's weather, and there are only two possible states: cloudy and sunny. How would you predict the next day's weather using a Markov chain?
You start by observing the current weather state, which is either sunny or cloudy. Suppose it is sunny today. The weather goes through many transitions over time. From weather data gathered over past years, you calculate that the probability of a cloudy day following a sunny day is 0.35.
You also find that the probability of a sunny day following a sunny day is 0.65. Since 0.65 is greater than 0.35, you predict that tomorrow will most likely be sunny as well. That is how the current weather state helps you predict the future state, and you can apply the same logic to predict the weather for the days to come.
The above example illustrates the Markov property: a Markov chain is memoryless. The next day's weather does not depend on the sequence of states that led to today's weather. The probability distribution for tomorrow is determined only by the transition from the current day to the next day.
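The step of turning past weather data into transition probabilities can be sketched in Python. This is a minimal illustration, not a forecasting method: the `history` list is a made-up observation log, and `estimate_transitions` is a hypothetical helper that counts how often each state follows another.

```python
from collections import Counter, defaultdict

def estimate_transitions(history):
    """Estimate P(next state | current state) by counting consecutive pairs."""
    counts = defaultdict(Counter)
    for today, tomorrow in zip(history, history[1:]):
        counts[today][tomorrow] += 1
    return {
        state: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for state, following in counts.items()
    }

# Hypothetical observation log (assumption: not real weather data).
history = ["sunny", "sunny", "cloudy", "sunny", "sunny",
           "sunny", "cloudy", "cloudy", "sunny"]
probs = estimate_transitions(history)

# Predict tomorrow from today's state alone: pick the most likely next state.
today = "sunny"
prediction = max(probs[today], key=probs[today].get)
print(probs[today], prediction)
```

Note that the prediction uses only `today`; the days before it affect the estimated probabilities but not the lookup itself, which is the memoryless property in practice.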
Another example of the Markov chain is the eating habits of a person who eats only fruits, vegetables, or meat. The eating habits are governed by the following rules:
- The person eats only once a day.
- If he ate fruits today, then tomorrow he will eat vegetables or meat with equal probability.
- If he ate vegetables today, then tomorrow he will eat vegetables with a probability of 1/10, fruits with a probability of 4/10, and meat with a probability of 5/10.
- If he ate meat today, then tomorrow he will eat vegetables with a probability of 4/10 and fruits with a probability of 6/10. He will not eat meat again tomorrow.
You can easily model his eating habits using a Markov chain, since his choice for the next day depends solely on what he ate today, irrespective of what he ate yesterday or the day before.
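The eating-habit rules can be simulated with a short Python sketch. The names `DIET`, `next_meal`, and `simulate` are illustrative, and the vegetables row is assumed to be 1/10, 4/10, and 5/10 so that its probabilities sum to 1.

```python
import random

# Transition probabilities from the rules above.
# Assumption: the vegetables row is 1/10, 4/10, 5/10 so that it sums to 1.
DIET = {
    "fruits":     {"vegetables": 0.5, "meat": 0.5},
    "vegetables": {"vegetables": 0.1, "fruits": 0.4, "meat": 0.5},
    "meat":       {"vegetables": 0.4, "fruits": 0.6},
}

def next_meal(today, rng):
    """Sample tomorrow's food using only today's food."""
    options = DIET[today]
    return rng.choices(list(options), weights=list(options.values()))[0]

def simulate(start, days, seed=0):
    """Return the starting meal followed by `days` sampled meals."""
    rng = random.Random(seed)
    meals = [start]
    for _ in range(days):
        meals.append(next_meal(meals[-1], rng))
    return meals

print(simulate("fruits", 7))
```

Because `next_meal` receives only `today`, the simulation cannot depend on earlier days, which mirrors the rules in the list.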
Markov Chain Transition Matrix
So far, we have seen how to predict the probability of transitioning from one state to another in a single step. But how do you find the probability distribution of transitions occurring over several steps? You can do this using the Markov chain transition matrix.
The Markov chain transition matrix holds the probabilities of transitioning from each state to every other state. It is called a transition matrix because it displays the transitions between the different possible states.
Each row of the matrix is the probability distribution over the next states for one current state. The transition matrix is the most important tool for analyzing a Markov chain. For example, if there are N possible states, then the transition matrix (P) would be as follows:
P = N x N matrix
Here the entry (I, J), in row I and column J, represents the probability of transitioning from state I to state J. Each row of the transition matrix P must sum to 1.
To represent a Markov chain, you will also need an initial state vector that describes the probability of starting in each of the N possible states. You can represent the initial state vector (X) as
X = N x 1 matrix
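As a minimal sketch, the two-state weather example can be written as a transition matrix and an initial state vector in Python. The sunny row uses the probabilities from the weather example; the cloudy row is a hypothetical value chosen only for illustration, and `is_transition_matrix` is an illustrative helper rather than a library function.

```python
import math

def is_transition_matrix(P):
    """True if P is square, entries are in [0, 1], and every row sums to 1."""
    n = len(P)
    return all(
        len(row) == n
        and all(0.0 <= p <= 1.0 for p in row)
        and math.isclose(sum(row), 1.0)
        for row in P
    )

STATES = ["sunny", "cloudy"]
P = [
    [0.65, 0.35],  # from sunny (values from the weather example)
    [0.50, 0.50],  # from cloudy (assumption: not given in the article)
]
X = [1.0, 0.0]  # initial state vector: today is certainly sunny

print(is_transition_matrix(P))
print(math.isclose(sum(X), 1.0))
```

Checking the row sums up front is a cheap way to catch mistyped probabilities before doing any further calculation.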
Suppose you want to find the probability of transitioning from state I to state J over M steps. Consider a market with three possible states: bull market, bear market, and stagnant market.
In the transition matrix for this example, the first column corresponds to the bull market state, the second to the bear market, and the third to the stagnant market. The rows follow the same order.
The M-step transition probabilities are obtained by raising P to the power of the number of steps (M). For a 3-step transition, you raise P to the third power (P^3); the entry (I, J) of P^3 is the probability of moving from state I to state J in exactly three steps.
Multiplying the initial state vector, written as a row, by P^3 then gives the probability distribution over the states after three steps.
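The multi-step calculation can be sketched in plain Python. The matrix for the market example is not reproduced in the text, so the values of `P` below are illustrative assumptions, as are the helper names `mat_mul`, `mat_pow`, and `distribution_after`. The initial state is written as a row vector so that it can be multiplied by a matrix whose rows sum to 1.

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(P, m):
    """Raise a square matrix to the non-negative integer power m."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(m):
        result = mat_mul(result, P)
    return result

def distribution_after(x, P, m):
    """Distribution over states after m steps: row vector x times P**m."""
    return mat_mul([x], mat_pow(P, m))[0]

STATES = ["bull", "bear", "stagnant"]
# Assumption: illustrative probabilities, not taken from the article.
P = [
    [0.90, 0.075, 0.025],  # from bull
    [0.15, 0.80, 0.05],    # from bear
    [0.25, 0.25, 0.50],    # from stagnant
]
x = [0.0, 1.0, 0.0]  # start in a bear market

P3 = mat_pow(P, 3)
print(round(P3[1][0], 5))  # probability of bear -> bull in exactly 3 steps
print([round(p, 5) for p in distribution_after(x, P, 3)])
```

With these assumed values, a market that starts as a bear market has a probability of about 0.3575 of being a bull market three steps later.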
Now that you understand how Markov chains work, you can apply them to a problem statement either to reach a solution or to automate a task. Markov chains are very powerful and provide a foundation for other, more advanced modeling techniques.
Understanding Markov chains can also help you gain deeper knowledge of techniques such as belief modeling and sampling.
Are there any interesting real-life use cases for the Markov chain?
Yes, there are plenty of interesting real-life use cases of Markov chains, from text creation to financial modeling. Many simple text generators use Markov chains. They are widely used to generate placeholder text, lengthy articles, and speeches. Many of the name generators found on the internet also use Markov chains. Another well-known application is predicting the next word, which is helpful for auto-completion and recommendations. Google's PageRank and the Subreddit Simulator are prominent examples: PageRank ranks web pages, while the Subreddit Simulator employs Markov chains to automate the production of material for an entire subreddit.
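A word-level text generator is one of the simplest of these use cases. The sketch below is illustrative only: `build_chain` and `generate` are hypothetical helper names, and the tiny `corpus` string stands in for real training text.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=0):
    """Generate up to `length` words, choosing each from the current word's followers."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: the word never had a successor
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Assumption: a toy corpus used only to demonstrate the idea.
corpus = "the cat sat on the mat and the cat ate the fish"
chain = build_chain(corpus)
print(generate(chain, "the", 8))
```

Words that follow a given word more often appear more often in its follower list, so `rng.choice` samples them in proportion to the observed transition frequencies.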
Is the Markov chain critical while learning Data Science?
Even though Markov chains are not compulsory for Data Science learners, they offer an excellent way into probabilistic modeling and data science techniques. Markov chains are reasonably simple in theory, and they can be implemented without complex statistical or mathematical ideas. One of the most prominent applications of Data Science is making predictions, and Data Scientists use the conditional probabilities of Markov chains to make such predictions. The Markov property refers to the memoryless feature of certain stochastic processes, which says that the distribution of the future states of a process is determined only by its current state.
How does Markov chain help in Google's PageRank Algorithm?
Google's PageRank algorithm is a well-known link-based ranking algorithm. Rather than evaluating pages based on their content, PageRank ranks them based on how they link to one another. It models a user moving from page to page by following links as a Markov chain, in which the next page depends only on the current page.
When a user enters a query into a search engine, pages on the web that match the query are identified and shown to the user in the order of their PageRank, which is computed from this Markov chain. The PageRank algorithm determines the significance of a page solely from the web's link structure rather than the page's content.
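The idea can be sketched as a simplified PageRank computation on a tiny, made-up link graph. This is not Google's production algorithm: the `links` dictionary, the damping factor of 0.85, and the fixed iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeatedly applying the random-surfer transition.

    `links` maps each page to the pages it links to; every link target
    must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            targets = outgoing or pages  # a page with no links jumps anywhere
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Assumption: a hypothetical four-page web.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Each iteration is one step of the Markov chain applied to the whole distribution, so the final ranks approximate the long-run share of time a random surfer spends on each page.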