
Understanding Bayesian Decision Theory With Simple Example

Last updated:
24th Dec, 2020

Introduction

We encounter many classification problems in real life. For example, an electronics store might want to know whether a customer of a certain age is going to buy a computer or not. In this article, we introduce a method called 'Bayesian Decision Theory', which helps us decide between two competing classes based on their probabilities given an observed feature.


Definition

Bayesian Decision Theory is a simple but fundamental approach to a variety of problems, such as pattern classification. Its entire purpose is to help us choose the decision that costs us the least 'risk'. There is always some risk attached to any decision we make, and we will look at the risk involved in this classification later in the article.


Basic Decision

Let us take an example where an electronics store wants to know whether a customer is going to buy a computer or not. So we have the following two classes:

w1 – Yes (Customer will buy a computer)

w2 – No (Customer will not buy a computer)

Now, we will look into the past records in our customer database. We will note down the number of customers who bought computers and the number who did not. From these counts, we calculate the probability of a customer buying a computer; let it be P(w1). Similarly, the probability of a customer not buying a computer is P(w2).

Now we will do a basic comparison for our future customers.

For a new customer,

If P(w1) > P(w2), then the customer will buy a computer (w1)

And, if P(w2) > P(w1), then the customer will not buy a computer (w2)

Here, we have solved our decision problem.
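The steps above can be sketched in Python with some made-up purchase records (the data and the 0/1 encoding are assumptions for illustration only):

```python
# Hypothetical past records: 1 = bought a computer, 0 = did not buy.
records = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

# Estimate the prior probabilities from the records.
p_w1 = sum(records) / len(records)  # P(w1): customer buys
p_w2 = 1 - p_w1                     # P(w2): customer does not buy

# The basic rule ignores all features: it predicts the majority class
# for every future customer.
decision = "w1" if p_w1 > p_w2 else "w2"
print(p_w1, decision)
```

With these records, the rule predicts w1 for every single future customer, no matter who they are.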

But what is the problem with this basic decision method? Most of you might have guessed: based only on previous records, it will always return the same decision for every future customer, which makes it useless as a classifier.

So we need something that will help us in making better decisions for future customers. We do that by introducing some features. Let’s say we add a feature ‘x’ where ‘x’ denotes the age of the customer. Now with this added feature, we will be able to make better decisions.

To do this, we need to know what Bayes Theorem is.


Bayes Theorem and Decision Theory

For our class w1 and feature ‘x’, we have:  

P(w1 | x) = P(x | w1) * P(w1) / P(x)

There are 4 terms in this formula that we need to understand:

  1. Prior – P(w1) is the prior probability that w1 is true before the data is observed.
  2. Posterior – P(w1 | x) is the posterior probability that w1 is true after the data is observed.
  3. Evidence – P(x) is the total probability of observing the data: P(x) = P(x | w1)P(w1) + P(x | w2)P(w2).
  4. Likelihood – P(x | w1) is the information about w1 provided by 'x'.

P(w1 | x) is read as 'the probability of w1 given x'.

More precisely, it is the probability that a customer will buy a computer, given that customer's age.
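As a quick numeric sketch of the formula (all of the numbers below are hypothetical, chosen only to illustrate the calculation):

```python
# Hypothetical values: 60% of customers buy a computer, 20% of buyers
# are of a given age x, and 14% of all customers are of that age.
p_w1 = 0.6          # prior P(w1)
p_x_given_w1 = 0.2  # likelihood P(x | w1)
p_x = 0.14          # evidence P(x)

# Bayes Theorem: posterior = likelihood * prior / evidence.
posterior = p_x_given_w1 * p_w1 / p_x  # P(w1 | x)
print(round(posterior, 3))  # 0.857
```

So for a customer of this age, the probability of buying a computer jumps from the prior 0.6 to roughly 0.86 once the age is taken into account.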

Now, we are ready to make our decision:

For a new customer,

If P(w1 | x) > P(w2 | x), then the customer will buy a computer (w1)

And, if P(w2 | x) > P(w1 | x), then the customer will not buy a computer (w2)

This decision seems more logical and trustworthy because it is based on both the features of the new customer and the past records, not on past records alone as in the earlier case.

Now, from the formula, you can see that the denominator P(x) is the same for both classes w1 and w2. So we can drop it and write the decision rule in an equivalent form:

If P(x | w1)*P(w1) > P(x | w2)*P(w2), then the customer will buy a computer (w1)

And, if P(x | w2)*P(w2) > P(x | w1)*P(w1), then the customer will not buy a computer (w2)
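This rule can be sketched in Python by modeling each class's age likelihood P(x | w) as a Gaussian. Note that the article itself does not prescribe a likelihood model; the Gaussian choice, the priors, and all parameters below are invented for illustration:

```python
import math

def gaussian(x, mean, std):
    """Normal density, used here as the likelihood P(x | w)."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

# Hypothetical parameters estimated from past records.
p_w1, p_w2 = 0.6, 0.4           # priors
mean1, std1 = 30.0, 8.0         # age distribution of buyers (w1)
mean2, std2 = 55.0, 10.0        # age distribution of non-buyers (w2)

def decide(age):
    """Decide w1 if P(x | w1) * P(w1) > P(x | w2) * P(w2), else w2."""
    score1 = gaussian(age, mean1, std1) * p_w1
    score2 = gaussian(age, mean2, std2) * p_w2
    return "w1" if score1 > score2 else "w2"

print(decide(28), decide(60))  # young customer -> w1, older customer -> w2
```

Unlike the basic rule, this one gives different answers for different customers: a 28-year-old is classified as a buyer, a 60-year-old is not.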

We can notice an interesting fact here. If the prior probabilities P(w1) and P(w2) happen to be equal, we can still make a decision based on the likelihoods P(x | w1) and P(x | w2) alone. Similarly, if the likelihoods are equal, we can decide based on the priors alone.


Risk Calculation

As mentioned earlier, there is always some amount of 'risk', or error, attached to a decision. So we also need to determine the probability of error made in a decision, which is easiest to see with a visualization.

Let us consider we have some data and we have made a decision according to Bayesian Decision Theory.

Plotting the posterior probability P(wi | x) on the y-axis against the feature 'x' on the x-axis, we get one curve per class. The value of 'x' at which the posterior probabilities of the two classes are equal is called the decision boundary.

So at Decision Boundary:

P(w1 | x) = P(w2 | x)

So to the left of the decision boundary, we decide in favor of w1 (buying a computer), and to the right of the decision boundary, we decide in favor of w2 (not buying a computer).
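The boundary can be located numerically by finding where the two scaled likelihoods P(x | w)P(w) cross. The sketch below assumes, purely for illustration, Gaussian likelihoods with invented parameters, and uses simple bisection between the two class means:

```python
import math

def gaussian(x, mean, std):
    # Normal density, used as the likelihood P(x | w).
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

# Hypothetical priors and class-conditional age distributions.
p_w1, p_w2 = 0.6, 0.4
mean1, std1, mean2, std2 = 30.0, 8.0, 55.0, 10.0

def diff(x):
    """P(x | w1)P(w1) - P(x | w2)P(w2); zero exactly at the decision boundary."""
    return gaussian(x, mean1, std1) * p_w1 - gaussian(x, mean2, std2) * p_w2

# diff is positive at mean1 and negative at mean2, so bisect between them.
lo, hi = mean1, mean2
for _ in range(60):
    mid = (lo + hi) / 2
    if diff(lo) * diff(mid) <= 0:
        hi = mid
    else:
        lo = mid
boundary = (lo + hi) / 2
print(round(boundary, 2))  # the age at which the decision flips from w1 to w2
```

With these made-up parameters, the decision flips from 'buys' to 'does not buy' at roughly age 43.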

But there is some non-zero posterior probability of w2 to the left of the decision boundary, and some non-zero posterior probability of w1 to the right of it. This overlap of one class into the other's region is what we call the risk, or probability of error.

Calculation of Probability Error

To calculate the probability of error for class w1, we need to find the probability that the class is w2 in the area that is to the left of the decision boundary. Similarly, the probability of error for class w2 is the probability that the class is w1 in the area that is to the right of the decision boundary.

Mathematically speaking, the probability of error when we decide:

w1 is P(w2 | x)

And when we decide w2, it is P(w1 | x)

You got your desired probability error. Simple, isn’t it?

So what is the total error now?

Let us denote the probability of error for a feature value x by P(E | x). Since we always decide in favor of the class with the larger posterior, the error we incur at x is the smaller posterior:

P(E | x) = minimum (P(w1 | x), P(w2 | x))

(The overall error P(E) is then obtained by integrating P(E | x) P(x) over all values of x.)


Therefore, the probability of error at x is the minimum of the two posterior probabilities. We take the minimum because we always decide in favor of the class with the larger posterior, so the smaller posterior is exactly the probability that our decision is wrong.
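One way to compute P(E | x) concretely is under a hypothetical Gaussian likelihood model (all parameters below are invented for illustration):

```python
import math

def gaussian(x, mean, std):
    # Normal density, used as the likelihood P(x | w).
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

# Hypothetical priors and class-conditional age distributions.
p_w1, p_w2 = 0.6, 0.4
mean1, std1, mean2, std2 = 30.0, 8.0, 55.0, 10.0

def error_at(x):
    """P(E | x) = min(P(w1 | x), P(w2 | x))."""
    joint1 = gaussian(x, mean1, std1) * p_w1  # P(x | w1) P(w1)
    joint2 = gaussian(x, mean2, std2) * p_w2  # P(x | w2) P(w2)
    evidence = joint1 + joint2                # P(x)
    return min(joint1, joint2) / evidence

# Far from the decision boundary the error is small; near it, close to 0.5.
print(round(error_at(30), 3), round(error_at(43), 3))
```

Far from the boundary, one posterior dominates and the error is tiny; right at the boundary, the posteriors are equal and the error approaches 0.5, meaning the decision is no better than a coin flip.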


Conclusion

We have looked in detail at a simple application of Bayesian Decision Theory. You now know Bayes Theorem and its terms, how to apply it in making a decision, and how to determine the error in the decision you have made.


Pavan Vadapalli

Blog Author
Director of Engineering @ upGrad. Motivated to leverage technology to solve problems. Seasoned leader for startups and fast moving orgs. Working on solving problems of scale and long term technology strategy.


Frequently Asked Questions (FAQs)

1. What is Bayes Theorem in probability?

In probability, Bayes Theorem is a mathematical formula used to calculate the conditional probability of a specific event. Conditional probability is the probability that a particular event occurs, given the outcome of an event that has already taken place. In calculating this, Bayes Theorem takes into account the knowledge of all conditions related to the event. So, if we already know one conditional probability, Bayes Theorem makes it easy to calculate the reverse probability.

2. Is Bayes Theorem useful in machine learning?

Bayes Theorem is extensively applied in machine learning and artificial intelligence projects. It offers a way to connect a machine learning model with an available dataset. Bayes Theorem provides a probabilistic model that describes the association between a hypothesis and data. You can consider a machine learning model or algorithm as a specific framework that explains the structured associations in the data. So using Bayes Theorem in applied machine learning, you can test and analyze different hypotheses or models based on different sets of data and calculate the probability of a hypothesis based on its prior probability. The target is to identify the hypothesis that best explains a particular data set.

3. What are the most popular Bayesian machine learning applications?

In data analytics, Bayesian machine learning is one of the most powerful tools available to data scientists. One of the best-known real-world applications is detecting credit card fraud: Bayesian machine learning algorithms can help detect patterns that suggest potentially fraudulent transactions. Bayes Theorem is also used in medical diagnosis to calculate the probability of a patient developing a specific ailment based on their previous health data. Other significant applications include teaching robots to make decisions, predicting the weather, recognizing emotions from speech, etc.
