Bayesian Networks: Introduction, Examples and Practical Applications

Anyone who has worked with data or statistics knows one thing for sure: correlation does not imply causation. This may sound obvious, yet many errors in data analysis come from confusing the two. The main reason is that correlation is easy to define and measure, while causation is much harder to define or quantify.

In fact, Judea Pearl, author of Causality: Models, Reasoning, and Inference, states in the book that humans focus their mathematical efforts on probabilistic and statistical inferences, leaving causal considerations “to the mercy of intuition and good judgement.” He argues that this is a major reason why scientific progress still lags behind.

This is where Bayesian Networks help. They help us distinguish correlation from causation by letting us look at several possible causes at once. The reasoning is driven by data and explicit probabilities rather than by subjectivity or intuition.

Let’s see an example to understand how Bayesian Networks operate.

Example of Bayesian Networks

For the sake of this example, let us suppose that the world is stricken by an extremely rare yet fatal disease; say there is a 1 in 1000 chance that you are infected by the disease.

Now, to figure out whether someone is suffering from the disease, doctors develop a test. The catch is that it is only 99% accurate.

How will you know for sure whether you have the disease or not? Will taking another test affect the results?

Let’s see what happens when you conduct…

Test 1

As the disease affects only 1 in 1000, the probability of you being infected is:

State      Probability
Infected   0.001
Free       0.999

Disease CPT (Conditional Probability Table)

Just as 1 person in 1000 has the disease, the remaining 999 in 1000 are free from it.

Similarly, we will create a table for the probability of each test result. As mentioned before, the test is only 99% accurate. That means that if you are infected, there is a 99% chance that the result is positive. Likewise, if you are free of the disease, there is a 99% chance that the result is negative.

Virus Presence      Infected   Free
Test 1 (Positive)   0.99       0.01
Test 1 (Negative)   0.01       0.99

Test1 CPT (Conditional Probability Table)
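The two tables can be written down directly as data. The snippet below is a minimal sketch; the names `disease_prior`, `test_cpt` and `marginal` are made up for this illustration. It shows how the tables combine to give the overall chance of seeing a positive or a negative result, before we know anything about the patient.

```python
# Prior over the disease node (the "Disease CPT" above).
disease_prior = {"infected": 0.001, "free": 0.999}

# P(test result | disease state): one inner dict per column of the "Test1 CPT".
test_cpt = {
    "infected": {"positive": 0.99, "negative": 0.01},
    "free": {"positive": 0.01, "negative": 0.99},
}


def marginal(result):
    """P(result) = sum over disease states of P(state) * P(result | state)."""
    return sum(disease_prior[s] * test_cpt[s][result] for s in disease_prior)


p_positive = marginal("positive")
p_negative = marginal("negative")
print(f"P(positive) = {p_positive:.5f}")
print(f"P(negative) = {p_negative:.5f}")
```

Most positive results come from the large group of healthy people (0.999 x 0.01) rather than from the small group of infected people (0.001 x 0.99), which is why a single positive test is weaker evidence than it first appears.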

Now, let’s build the network to see how a test result changes our belief about the presence of the disease.

Entering a positive result for Test 1 into the network gives the following result.


As you can see, if the test comes out to be positive, there is only a 9% chance that you are suffering from the disease.

Now, how did we get this number?

Bayes’ Theorem!


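In our example, H is the hypothesis that you are infected, Hc is its complement (you are free of the disease) and E is the evidence of a positive test:

P(H|E) = P(E|H) x P(H) / P(E)

  • P(H|E) = P(E|H) x P(H) / {P(E|H) x P(H) + P(E|Hc) x P(Hc)}
  • P(H|E) = (0.99 x 0.001) / (0.99 x 0.001 + 0.01 x 0.999) ≈ 0.09 = 9%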
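The same arithmetic can be checked in a few lines of Python. This is only a sketch of the calculation above; the function name `posterior` and its parameter names are chosen for this illustration.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a yes/no hypothesis H and one piece of evidence E."""
    # P(E) via the law of total probability: P(E|H) P(H) + P(E|Hc) P(Hc)
    evidence = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / evidence


# H = infected, E = positive test, numbers taken from the two tables.
p_infected = posterior(prior=0.001, p_e_given_h=0.99, p_e_given_not_h=0.01)
print(f"P(infected | positive) = {p_infected:.0%}")
```

Changing `prior` to a more common disease, say 0.1, is a quick way to see how strongly the answer depends on how rare the disease is.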

What does this tell us?

Even when the test is positive, due to the disease being rare, there is only a 9% chance of having the disease.

So, what happens when you take another test to be sure and it, too, turns out to be positive?


Test 2

Again, the second test is only 99% accurate.

Virus Presence      Infected   Free
Test 2 (Positive)   0.99       0.01
Test 2 (Negative)   0.01       0.99

The Bayesian Network now would be:


The results have reversed!

This means that if both tests come back positive, the probability of being infected rises from 9% to 91%. But again, it is not 100%!
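One way to picture the second test is as a repeated update: the belief after the first test becomes the prior for the second. The sketch below assumes that the two tests are independent of each other once the true disease state is known, which is what the network structure in this example implies. The function name `update` and its parameters are made up for this illustration.

```python
def update(prior, result, sensitivity=0.99, false_positive_rate=0.01):
    """Return P(infected) after observing one more test result."""
    if result == "positive":
        like_infected, like_free = sensitivity, false_positive_rate
    else:
        like_infected, like_free = 1 - sensitivity, 1 - false_positive_rate
    numerator = like_infected * prior
    return numerator / (numerator + like_free * (1 - prior))


belief = 0.001  # start from the 1 in 1000 prior
for i, result in enumerate(["positive", "positive"], start=1):
    belief = update(belief, result)
    print(f"after test {i}: P(infected) = {belief:.0%}")
```

Run as written, the loop should move the belief from the prior to roughly 9% after the first positive and roughly 91% after the second.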

Now, what if you get one positive and one negative result from the two tests?


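As you can see, if one of the two tests is negative, the chance that you don’t have the disease is about 99.9%, which rounds to 100%. Because both tests are equally accurate, the positive and the negative result cancel each other out, and you are back at the original 1 in 1000 chance of being infected.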

Test 3

It gets even better when you conduct three tests and all of them come out to be positive.


Now the chance that you’re infected is about 99.9%, which rounds to 100%.

Now let’s see what happens when one of the tests is negative but the other two are positive.


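Here the negative result cancels out one of the positives, so the probability of being infected is the same as after a single positive test: about 9%. The remaining 91% is the probability that you are free of the disease.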
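All of the scenarios in this example can be reproduced by brute-force enumeration over the two disease states. The sketch below assumes that every test has the same 99% accuracy and that the tests are independent given the disease state. The names `PRIOR`, `TEST_CPT` and `p_infected` are chosen for this illustration.

```python
from math import prod

PRIOR = {"infected": 0.001, "free": 0.999}
TEST_CPT = {
    "infected": {"positive": 0.99, "negative": 0.01},
    "free": {"positive": 0.01, "negative": 0.99},
}


def p_infected(results):
    """P(infected | results) by enumerating both disease states."""
    # Joint probability of each state together with all observed results.
    joint = {
        state: PRIOR[state] * prod(TEST_CPT[state][r] for r in results)
        for state in PRIOR
    }
    return joint["infected"] / sum(joint.values())


scenarios = {
    "one positive": ["positive"],
    "two positives": ["positive", "positive"],
    "one positive, one negative": ["positive", "negative"],
    "three positives": ["positive", "positive", "positive"],
    "two positives, one negative": ["positive", "positive", "negative"],
}
for name, results in scenarios.items():
    print(f"{name}: P(infected) = {p_infected(results):.3f}")
```

Under these assumptions, the order of the results does not matter, and each negative result offsets exactly one positive result. Two positives and one negative therefore give the same answer as a single positive test.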

Bayesian Networks and Data Modeling

The example above shows how a Bayesian Network combines prior knowledge with new evidence to update a probability. This is what makes it useful for modeling data when the results are uncertain.

In fact, refining the network by including more factors that might affect the result also allows us to visualize and simulate different scenarios using Bayesian Networks.

Bayesian Networks are also a useful tool to quantify unfairness in data and to devise techniques that reduce it.

In such cases, it is best to use path-specific techniques to identify sensitive factors that affect the end results.

Top 5 Practical Applications of Bayesian Networks

Bayesian Networks are being widely used in the data science field to get accurate results with uncertain data.


1. Spam Filter

You have probably wondered how email services such as Gmail filter spam (unwanted and unsolicited emails). Bayesian spam filtering is one well-known approach: it estimates the probability that a message is spam from the words it contains.

2. Turbo Code

Turbo codes are high-performance forward error correction codes used in 3G and 4G mobile networks. Their decoding can be viewed as belief propagation on a Bayesian Network.

3. Image Processing

Image processing applies mathematical operations to images in digital format, for example to enhance them. Bayesian Networks can be used here to reason about uncertain image content.

4. Biomonitoring

Biomonitoring quantifies the concentration of chemicals in human blood and tissue using indicators. Bayesian Networks help estimate these concentrations from the measurements.

5. Gene Regulatory Network (GRN)

A GRN contains various DNA segments of a cell that interact with other cell contents through protein and RNA expression products. Predictions of its behavior can be analyzed using Bayesian Networks.

Conclusion

In this blog post, you learned how Bayesian Networks help us get accurate results from the data at hand. Even the slightest variation in data can significantly affect the end result. Bayesian Networks help us analyze data using causation instead of just correlation.

They have proved to be revolutionary in the data science field.
