
# Probability Distribution: Types of Distributions Explained

Last updated: 24th Jun, 2023

## Introduction to Probability and Probability Distribution

To understand probability distributions, let us first understand what probability is. Probability is a measure of the likelihood of an event occurring in an experiment. In simple terms, it tells us how likely it is that the event will occur. The probability of an event ranges from 0 (least probable) to 1 (most probable).

A probability distribution is a function that gives the probabilities of the different outcomes of an experiment. It shows the possible values that a random variable can take and how often these values occur.

In a probability distribution, the probabilities of all possible outcomes always sum to 1. In data science, probability distributions are used, among other things, to calculate confidence intervals and the critical regions in hypothesis tests.
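As a minimal sketch of these two properties, the snippet below uses a hypothetical example that is not part of the original discussion: the number of heads in two tosses of a fair coin. Exact fractions are used so that the sum can be checked without rounding error.

```python
from fractions import Fraction

# Hypothetical example: number of heads in two tosses of a fair coin.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Each probability lies between 0 and 1 ...
all_valid = all(0 <= p <= 1 for p in pmf.values())

# ... and together the probabilities sum to exactly 1.
total = sum(pmf.values())
print(all_valid, total)
```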


## Continuous and Discrete Distributions

The type of probability distribution to be used depends on whether the variable takes discrete or continuous values. A discrete distribution can take only a countable set of values (for example, 0, 1, 2, and so on), whereas a continuous distribution can take any value within a specified range.

Continuous distributions are described in terms of probability density because a range contains infinitely many values, so the probability of any single exact value is zero. For a discrete distribution, we can obtain a probability for each individual value because the values are countable.


## Types of Distributions – Discrete Distribution

### Binomial Distribution

The binomial distribution describes the number of successes in a fixed number of trials (n), where each trial has only two possible outcomes. Each trial is independent of the others; that is, the outcome of one trial has no impact on the outcome of any other trial. The trials conducted in the experiment are identical to each other.

Thus, the probability of success and failure would be the same for each trial. For example, if the probability of success for a trial is 0.8 (which means the probability of failure would be 0.2), then it will be the same for the rest of the trials as well.
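The probability of seeing exactly k successes in n such trials can be sketched as follows. The function name `binomial_pmf` and the choice of n = 5 are illustrative assumptions; p = 0.8 matches the example above.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials, success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5, 0.8  # n is an assumed value; p follows the example in the text
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probs):
    print(f"P(X = {k}) = {prob:.5f}")
```

The probabilities for k = 0 to n cover every possible outcome, so they add up to 1.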

### Multinomial Distribution

This is the generalized version of the binomial distribution in which each trial can have more than two outcomes. Its other properties are similar to those of the binomial distribution. For example, when a fair die is rolled repeatedly, the probability of each outcome is the same in every trial because the trials are independent of each other.
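A minimal sketch of the multinomial probability for the die example follows. The helper name `multinomial_pmf` and the scenario of six rolls are assumptions made for illustration.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(outcome i occurs counts[i] times) over sum(counts) independent trials."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)  # multinomial coefficient n! / (c1! * c2! * ...)
    return coeff * prod(p**c for p, c in zip(probs, counts))

# Assumed scenario: a fair die is rolled 6 times and every face appears once.
fair_die = [1 / 6] * 6
each_face_once = multinomial_pmf([1, 1, 1, 1, 1, 1], fair_die)
print(round(each_face_once, 5))
```

With only two outcomes the same function reduces to the binomial case, for example `multinomial_pmf([2, 1], [0.8, 0.2])`.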

### Bernoulli Distribution

This is a special case of the binomial distribution in which the experiment consists of a single trial (n = 1). Because there is only one trial, it is defined by a single parameter, p, which is generally the probability of success.
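A single Bernoulli trial can be simulated in a few lines. This is only a sketch: the function name `bernoulli_trial`, the value p = 0.8, and the seed are illustrative assumptions.

```python
import random

def bernoulli_trial(p: float, rng: random.Random) -> int:
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)  # seeded only so that the sketch is repeatable
p = 0.8
outcomes = [bernoulli_trial(p, rng) for _ in range(10_000)]

# Over many repetitions, the share of successes should be close to p.
success_rate = sum(outcomes) / len(outcomes)
print(success_rate)
```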

### Negative Binomial Distribution

The negative binomial distribution differs from the binomial distribution in the following ways:

• The number of trials conducted in an experiment is not fixed.
• The random variable indicates the number of trials required to attain a desired number of successes.

For the binomial distribution, the random variable is the number of successes in a fixed number of trials; that is, we focus only on the number of successes, no matter how many trials fail. The negative binomial distribution instead focuses on how many trials are required to achieve a set number of successes, so the failures that occur along the way are also taken into account. The name itself comes from the binomial series with a negative exponent, which appears in its probability mass function.

The process continues only until the desired number of successes has been attained. This makes the number of trials in an experiment random rather than fixed. It is also called the Pascal distribution.
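Under the definition used here (the random variable is the number of trials needed to reach r successes), the probability mass function can be sketched as below. The function name and the values r = 3 and p = 0.5 are assumptions for illustration.

```python
from math import comb

def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs exactly on trial n), success probability p."""
    if n < r:
        return 0.0  # at least r trials are needed for r successes
    # The last trial must be a success; the other r - 1 successes can fall
    # anywhere among the first n - 1 trials.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

r, p = 3, 0.5
prob_5_trials = negative_binomial_pmf(5, r, p)
total = sum(negative_binomial_pmf(n, r, p) for n in range(r, 200))
print(prob_5_trials, round(total, 6))
```

Some libraries parameterize this distribution by the number of failures instead of the number of trials, so it is worth checking which convention is in use.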

### Poisson Distribution

The Poisson distribution gives the probability of a given number of events occurring in a fixed period of time, provided we know the average number of events in such a period. The events occur independently of one another. The distribution assumes that the rate of occurrence remains constant over the time period.
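As a sketch, the Poisson probability of observing k events when the average is `lam` events per period can be computed directly. The average of 4 events per period is an assumed value.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events in a period) when the average is lam events per period."""
    return lam**k * exp(-lam) / factorial(k)

lam = 4.0  # assumed average number of events per period
probs = [poisson_pmf(k, lam) for k in range(60)]

for k in range(8):
    print(f"P(X = {k}) = {probs[k]:.4f}")
```

Unlike the binomial case, k has no upper limit, so the list above is truncated at a point where the remaining probability is negligible.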

### Discrete Uniform Distribution

In a discrete uniform distribution, the probabilities of all outcomes are equal. For example, when a fair die is rolled, each outcome from 1 to 6 is equally likely. The probability mass function of this distribution is 1/n, where n is the total number of discrete values.
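The fair die example can be written out as a short sketch. Exact fractions are used so the 1/n values stay readable; the expected value is added as an illustration.

```python
from fractions import Fraction

faces = range(1, 7)
n = len(faces)

# Every face gets the same probability, 1/n.
pmf = {face: Fraction(1, n) for face in faces}

# Expected value of one roll: each face weighted by its probability.
expected_value = sum(face * p for face, p in pmf.items())
print(pmf[3], expected_value)  # 1/6 7/2
```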

## Types of Distributions – Continuous Distribution

### Continuous Uniform Distribution

Uniformity can be applied to continuous values as well. Here the probability density is constant across the specified range. It is also called a rectangular distribution because of the shape it takes when plotted on a graph.
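A minimal sketch of the rectangular shape follows: the density is constant on the range [a, b], and the probability of an interval is the area of the rectangle above it. The function names and the range 0 to 10 are assumptions for illustration.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of a uniform distribution on [a, b]: constant inside, zero outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_interval_prob(c: float, d: float, a: float, b: float) -> float:
    """P(c <= X <= d) for X uniform on [a, b]: width of the overlap times the height."""
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)

a, b = 0.0, 10.0  # assumed range
height = uniform_pdf(4.0, a, b)
prob = uniform_interval_prob(2.0, 5.0, a, b)
print(height, prob)
```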

### Normal Distribution

A normal distribution (also known as a bell curve) is a continuous distribution that is symmetric about its mean. Half of the probability lies to the left of the mean and the other half lies to the right. For a normal distribution, the mean, the mode, and the median are equal.

Normally distributed data follow the empirical rule, which describes the spread of the data in terms of the mean and standard deviation:

• About 68% probability that the random variable falls within 1 standard deviation of the mean.
• About 95% probability that the random variable falls within 2 standard deviations of the mean.
• About 99.7% probability that the random variable falls within 3 standard deviations of the mean.
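The percentages in the empirical rule can be checked with Python's standard library. The sketch below uses `statistics.NormalDist` (available from Python 3.8); the helper name `prob_within` is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The exact figures are slightly different from the rounded 68%, 95% and 99.7% quoted in the rule.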

### t-Distribution

It is similar to a normal distribution, but it has heavier tails, that is, a higher probability of extreme values. This makes it more likely to take values that are farther from the mean. When plotted on a graph, the curve looks shorter and fatter than the normal distribution curve.

It is preferred when the sample size is small. As the sample size increases, the t-distribution curve starts to look like a normal distribution curve. Because probabilities under the normal and t-distribution curves are tedious to compute by hand, we usually standardize a value into a Z-score or T-score, respectively, and look up the corresponding probability.
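The two scores can be sketched as follows. The formulas are the standard ones (a Z-score uses a known population standard deviation, a one-sample T-score uses the sample standard deviation); the function names and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def z_score(x: float, mu: float, sigma: float) -> float:
    """Distance of x from the population mean, in population standard deviations."""
    return (x - mu) / sigma

def t_score(sample, mu0: float) -> float:
    """One-sample t statistic: (sample mean - mu0) / (sample SD / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

z = z_score(130, mu=100, sigma=15)    # hypothetical IQ-style value
sample = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical small sample
t = t_score(sample, mu0=4)
print(z, round(t, 4))
```

The T-score is then compared against a t-distribution with n - 1 degrees of freedom, which is where the heavier tails matter for small samples.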

### Chi-Square Distribution

The chi-square distribution is the distribution of the sum of the squares of independent random variables drawn from a standard normal distribution. The degrees of freedom of the distribution equal the number of variables in the sum. The mean of a chi-square distribution is equal to the number of degrees of freedom.

This distribution is widely used in calculating confidence intervals and in hypothesis testing. It is a special case of the gamma distribution. It is also used in the chi-square goodness-of-fit test, which helps indicate whether the sample data are a good representation of the entire population.
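The definition above can be illustrated with a small simulation: each draw is a sum of k squared standard normal values, and the average of many draws should be close to k. The function name, k = 3, the number of draws, and the seed are assumptions.

```python
import random
from statistics import mean

def chi_square_draw(k: int, rng: random.Random) -> float:
    """One chi-square value with k degrees of freedom: sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(42)  # seeded only so that the sketch is repeatable
k = 3
draws = [chi_square_draw(k, rng) for _ in range(20_000)]

sample_mean = mean(draws)
print(round(sample_mean, 2))  # expected to be close to k
```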

## Continuous Probability Distribution Characteristics

You will come across multiple types of discrete distributions for several types of discrete data. Likewise, several probability distributions are available for continuous data. Every probability distribution comes with parameters that determine its shape.

Most probability distributions will come with one to three parameters. Specifying these parameters will help develop the shape of the distribution and its probabilities fully. These parameters define the essential properties of distribution like variability and central tendency.

The normal distribution, also called the Gaussian distribution, is a popular choice for continuous data. This symmetric distribution can describe many phenomena, such as IQ scores and human height. It has two parameters: the mean and the standard deviation.

The lognormal and Weibull distributions are also quite common for continuous data. These distributions are useful for modeling skewed data.

Distribution parameter values describe whole populations. Unfortunately, population parameters are usually unknown, because it is hardly ever possible to measure a whole population. Instead, you can use random samples to estimate these parameters.
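The idea of estimating parameters from a random sample can be sketched as follows. The "population" here is a hypothetical normal distribution with mean 100 and standard deviation 15, chosen only so that the estimates can be compared with known values.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical population; in practice these parameters would be unknown.
population = NormalDist(mu=100, sigma=15)

# Draw a random sample (seeded only so that the sketch is repeatable).
sample = population.samples(5_000, seed=1)

# The sample mean and sample standard deviation estimate the parameters.
estimated_mean = mean(sample)
estimated_sd = stdev(sample)
print(round(estimated_mean, 2), round(estimated_sd, 2))
```

Larger samples generally bring the estimates closer to the true parameter values.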

## Calculating Probabilities for Continuous Data

Probabilities for continuous data can be calculated over value ranges instead of single points. A probability reveals the chance of a value falling within an interval. This property can be easily demonstrated with the help of a probability distribution plot.

On a probability distribution plot, the total area under the curve equals 1. This is similar to how the probabilities of a discrete distribution sum to 1. The proportion of the area under the curve over a range of values on the X-axis gives the probability that a value falls within that range.

A single value has zero width, so there is no area under the curve above it. This is why the probability of any individual value is zero. Typically, statistical tools or reference tables are used to find these areas.
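The link between area and probability can be sketched numerically. The example below approximates the area under a normal density with the trapezoidal rule and compares it with the value from the cumulative distribution function. The distribution (mean 100, standard deviation 15), the interval, and the function name are assumptions.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical distribution

def area_under_pdf(dist: NormalDist, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the area under dist.pdf between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    total += sum(dist.pdf(a + i * width) for i in range(1, steps))
    return total * width

p_range = area_under_pdf(dist, 85, 115)   # area over an interval
p_exact = dist.cdf(115) - dist.cdf(85)    # same probability from the CDF
p_point = area_under_pdf(dist, 100, 100)  # zero-width interval: no area
print(round(p_range, 4), round(p_exact, 4), p_point)
```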


## Conclusion

This article gave an overview of a few examples of discrete and continuous types of distributions. These different distributions are used to serve different purposes, and each has its own assumptions.


Although the assumptions of these distributions might not be fully met in real-life situations, the distributions still assist in making important decisions for the organization.

Blog Author
Director of Engineering @ upGrad. Motivated to leverage technology to solve problems. Seasoned leader for startups and fast moving orgs. Working on solving problems of scale and long term technology strategy.

1. What distinguishes the binomial distribution from the normal distribution?

A binomial distribution is discrete: there are no data points between any two adjacent values. This is in stark contrast to a normal distribution, which is continuous and can take any value in between. A binomial distribution has a finite number of possible outcomes, whereas a normal distribution has infinitely many. Even so, if the number of trials is large enough, the shape of the binomial distribution resembles that of the normal distribution.
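The resemblance for large samples can be checked with a short sketch that compares an exact binomial probability with its normal approximation. The values n = 100 and p = 0.5, and the use of a continuity correction of 0.5, are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # assumed number of trials and success probability

# Exact binomial probability of at most 55 successes.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(56))

# Normal approximation with the same mean and standard deviation,
# evaluated at 55.5 (continuity correction for the discrete variable).
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = normal.cdf(55.5)

print(round(exact, 4), round(approx, 4))
```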

2. What distinguishes the binomial distribution from the Bernoulli distribution?

The outcome of a single trial of an event is dealt with by the Bernoulli distribution, but the outcome of several trials of a single event is dealt with by the Binomial distribution. When the result of an event is required just once, the Bernoulli distribution is applied, but the Binomial distribution is used when the outcome is required several times.

3. When there is uncertainty, how can we use probability distribution?

A probability space is a representation of our uncertainty about an experiment. It includes a sample space of possible outcomes and a probability measure that assigns a likelihood to each event. In uncertainty analysis, the rectangular distribution is one of the most widely employed probability distributions. All outcomes are equally likely to occur in a rectangular distribution. To convert a rectangular uncertainty contributor to its standard deviation equivalent, divide its half-width by the square root of 3.
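The square-root-of-3 conversion can be sketched as below. The function name and the half-width of 0.5 are hypothetical; the formula is the standard deviation of a uniform distribution on the interval from -a to +a.

```python
from math import sqrt

def rectangular_standard_uncertainty(half_width: float) -> float:
    """Standard deviation of a rectangular distribution on [-half_width, +half_width]."""
    return half_width / sqrt(3)

# Hypothetical contributor: a reading that can be off by at most +/- 0.5 units.
u = rectangular_standard_uncertainty(0.5)
print(round(u, 4))
```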

4. What do you understand about the probability mass function?

The probability mass function (PMF), sometimes described as a frequency function, characterizes the distribution of a discrete random variable. It is defined for every real number: it gives the probability P(X = x) when x is a value that X can take, and it is zero for every other argument. The value of the PMF is never negative, and its values over all possible outcomes sum to 1. The PMF is the primary component in describing a discrete probability distribution. It is not the same as the probability density function, which describes continuous variables and whose values are densities rather than probabilities. In other words, the PMF relates discrete events to the probabilities associated with their occurrence, which is why it is widely used in statistical modeling and computer programming. The word “mass” denotes probability concentrated on discrete points.
