Blog_Banner_Asset
    Homebreadcumb forward arrow iconBlogbreadcumb forward arrow iconData Sciencebreadcumb forward arrow iconWhat is Hypothesis Testing in Statistics? Types, Function & Examples

What is Hypothesis Testing in Statistics? Types, Function & Examples

Last updated:
5th Jun, 2023
Views
Read Time
9 Mins
share image icon
In this article
Chevron in toc
View All
What is Hypothesis Testing in Statistics? Types, Function & Examples

Decisions are always data-based in a world that generates humongous amounts of data each day. Whether in the health sector, quality improvement, or the academic sector, hypothesis plays a vital role, without which the probability of arriving at wrong conclusions increases. 

In this blog, we explore hypothesis testing, how it helps us examine assumptions and their applications in various sectors with examples. 

Meaning of Hypothesis Testing in Statistics

Hypothesis testing in statistics is a statistical analysis that helps estimate the relationship between statistical variables. Also known as significance testing, analysts use it to test assumptions about a population parameter. 

The method of analysis an analyst selects is based on the nature of the data and his choice. 

The Formula for Hypothesis Testing 

The formula for hypothesis testing is Z = (Sample mean – Population mean) / (Standard deviation /√n), where n denotes the sample size.

An assumption is first made on the distribution, known as the null hypothesis. An alternate hypothesis is defined as the opposite of a null hypothesis. The analyst has to determine whether the null hypothesis can be proved or has to be rejected. 

How Does Hypothesis Testing Work?

An analyst uses hypothesis testing to test a statistical sample. It provides evidence for the plausibility of the null hypothesis. For significance testing, analysts incorporate the use of a random population sample. 

Hypothesis testing analyses the data sample to arrive at the most meaningful conclusions. There are two types of hypotheses — null and alternative. These two statements are opposite, indicating only one of the two hypotheses must be true. 

Hypothesis testing tests the authenticity of an assumption about a data set through different methodologies. The main motive is to determine whether the null hypothesis has to be rejected. Learn more about this with a Master of Science in Data Science from LJMU

You can also check out our free courses offered by upGrad under Data Science

What are the Null Hypothesis and Alternative Hypothesis 

Null Hypothesis

According to the null hypothesis definition, it is a mathematical statement indicating no difference between the characteristics of two possibilities. It also assumes that the result of an experiment depends on chance alone and is denoted as H0.  

Alternative Hypothesis

An alternative hypothesis is a substitute for the null hypothesis. It denotes a statistical significance between two possible outcomes. Unlike the null hypothesis, it establishes a relationship between the variables under study and is denoted as H1. 

Now that you know the alternative and null hypothesis meanings, let us understand this with an alternative and null hypothesis example. In this null and alternative hypothesis example, we determine the probability that a coin is flipped 100 times. There are 50% chances of landing at heads and 50% at tales. 

Here, 

  • Null Hypothesis: The coin would land at heads 50% of the time
  • Alternative Hypothesis: The alternate hypothesis would be that the coin would not land at heads 50% of the time

Learn data science certification course from the World’s top Universities. Earn Executive PG Programs, Advanced Certificate Programs, or Masters Programs to fast-track your career.

What Are Confidence Intervals?

A confidence interval refers to a range of possible values within which an unknown parameter is likely to fall while considering a certain confidence level or probability. 

For example, if there is a 95% confidence interval, there is a set of parameter values where one cannot reject the null hypothesis at 1-0.95, that is, 5% test size. 

Simple and Composite Hypothesis Testing

Based on the statistical distribution, hypothesis testing can be divided into the following types:

Simple- In a simple hypothesis, an exact value is specified for the population parameter, which makes the analysis easier.

Composite- In the composite hypothesis, a range contains a lower and upper value between which the population parameter lies.

Read our popular Data Science Articles

One-Tailed and Two-Tailed Hypothesis Testing

One-tailed hypothesis testing

It is also known as directional hypothesis testing. One-tailed hypothesis testing is conducted when most of the population is concentrated towards one side, and the rejection region lies in a single direction. One-tailed hypothesis testing is further classified into right-tailed and left-tailed tests. 

  1. Right-tailed test: This is also known as the upper tail test and is used for checking whether a population parameter is greater than an appointed value. 
  2. Left-tailed test: Left-tailed hypothesis test is also known as the lower-tail test. This test determines if the population parameter is less than a prescribed value. 

Two-tailed hypothesis testing

It is also known as a non-directional testing method. This method distributes the critical region on both sides of the population. 

Type 1 and Type 2 Error

Statistical decisions are also prone to many uncertainties. Therefore, one cannot rule out the possibility of making errors while testing a hypothesis. 

This can be explained with the help of two terms- Type 1 error and Type 2 error. 

  • Type 1 error denotes a false positive conclusion 
  • Type 2 error denotes a false negative conclusion 

Hypothesis Testing Types

  • Chi-Square: This test in hypothesis testing is used to determine whether the variables of a given population are independent.
  • T-Test: The T-test is used for a sample size of less than 30. The T-test helps compare the population mean and sample mean. However, here, the population’s standard deviation is unknown, while the standard deviation of the sample is known. 
  • Z Test: The Z test is used while analysing a data set with a sample number more than or equal to 30. With the help of the Z test, an analyst can determine if there is any difference between the sample mean, and the population mean when the population’s standard deviation is known.

Choosing the correct methodology for hypothesis testing can be challenging. Analysts may reject or admit the null hypothesis based on the method used. 

Significance Level

As mentioned earlier, there is a constant chance of error while eliminating the null hypothesis. It can be assumed as a probability of Type 1 error which statisticians assume even before data collection. 

The significance level determines the extent to which the result obtained through the null hypothesis can be false. Analysts have to provide more substantial evidence when the significance level is low. 

P-Value

P-value is the shortened form of probability value. It shows the probability of the occurrence of a specific event. It refers to the probability of obtaining a result that may be the same or more extreme than the actual observations. 

If the P-value is small, statisticians must provide more substantial evidence in favour of the alternative hypothesis

Hypothesis Testing Real-World Example

Let’s understand hypothesis testing with this simple example. A pharmaceutical company may have developed a new drug to treat a medical condition. In this case, the null and alternative hypotheses would be: 

Null hypothesis: The drug has no significant effect on treating the symptoms of the medical condition. 

Alternative hypothesis: The drug significantly affects treating the symptoms of the medical condition. 

For this hypothesis testing, the pharmaceutical company will have to conduct a randomised controlled trial, where people having the given medical condition will be divided into two groups. One group will receive the drug, and the other will follow the conventional mode of treatment. 

For data collection, the analyst has to quantify the effects of the medicine in different forms, such as pain scores, etc. After analysing the data, the null hypothesis will be rejected if there is a difference between how the two groups respond. 

Therefore, it will be assumed that the medicine is effective in treating the symptoms of the medical condition. 

An Example to Show Hypothesis Testing Calculation 

In this instance, a hypothesis test is conducted to check the average height of women in the US. 

The null hypothesis states that the average height of the women is 5’4” (population mean). A sample of 100 women is gathered, and their average height is found to be 5’5” (sample mean). The SD for the given sample is 2. 

Since the sample size is more than 30, we will conduct the Z test using the formula: (sample mean – population mean) / (SD/ Sqrt n). Therefore, the value for Z thus obtained is 11.11.

Since the value of Z is substantial, we can conclude that there is enough evidence to support that women in the US are taller than 5’4” on average. 

Explore our Popular Data Science Courses

Hypothesis Testing Steps

Listed below are the steps to be followed for hypothesis testing

Step 1: Specify your null and alternative hypotheses

Analysts must first define the null hypothesis and alternative hypothesis

Step 2: Gather data

After the null and alternative hypotheses have been defined, the next step is collecting the data for sampling. The data should represent the entire population the analyst is testing.

Step 3: Conduct a statistical test

After data collection, the analyst must choose a suitable method or statistical test for the collected data. 

Step 4: Determine the rejection of your null hypothesis

Based on the results obtained through the test and the significance level, the null hypothesis is either accepted or rejected. 

Step 5: Present your results 

After successful completion of all the steps, the results are compiled and presented in the form of a report. 

Importance of Hypothesis Testing in Research Methodology

In research methodology, while dealing with vast amounts of data, hypothesis testing helps decide the occurrence of an event. It also helps determine statistically whether any medical treatment has any positive effect, the extent to which two groups differ, and so on. 

Simply put, hypothesis testing helps to prove whether data is statistically significant. Combined with AI & ML, it can help make and execute effective data-driven decisions. You can opt for an Executive PG Programme in Data Science from IIIT Bangalore to learn more about these innovative technologies.

Hypothesis Testing Limitations

Some of the limitations associated with hypothesis testing are as follows:

  1. It is important to remember that hypothesis testing does not confirm that a decision can be taken based on the result obtained. It is essential to interpret the result correctly. 
  2. The hypothesis test result does not explain the reason for the differences. 
  3. The outcome of the significance tests depends on probabilities. One cannot vouch for them with total certainty. 

Top Data Science Skills to Learn

Conclusion

Hypothesis testing has wide applicability in various fields, from biology, psychology, finance, and medicine to marketing and product development. Having a grasp of this method can help budding data scientists explore career opportunities in other fields.

A Post Graduate Diploma in Data Science & AI from WOOLF can help you understand the concepts of these evolving technologies. It will also help you understand the practical application of these concepts to develop the most accurate data-driven decisions. 

Profile

Rohit Sharma

Blog Author
Rohit Sharma is the Program Director for the UpGrad-IIIT Bangalore, PG Diploma Data Analytics Program.

Frequently Asked Questions (FAQs)

1How does hypothesis testing help in making data-driven decisions?

Professionals can test their assumptions and theories using hypothesis testing and make critical decisions based on the same.

2How do you determine the appropriate significance level for a hypothesis test?

Typically, significance levels are preset to 0.01, 0.05, and 0.10. However, the 5% significance level, 0.05, is commonly used for hypothesis testing.

3What is the difference between one-tailed and two-tailed tests in hypothesis testing?

The difference between the two tests is that while the one-tailed test has one critical region, the two-tailed test has two. In a one-tailed hypothesis, the population is concentrated on one side, while in a two-tailed hypothesis, the critical region is distributed on two sides.

4How does hypothesis testing apply to A/B testing in marketing or website optimisation?

A/B testing is also known as split testing. In marketing and website optimisation, the target audience is split into groups and then subjected to different campaigns to decide and assess the performance of each. An analyst must accept or reject the null or alternative hypothesis to infer appropriately.

Explore Free Courses

Suggested Blogs

Priority Queue in Data Structure: Characteristics, Types & Implementation
57468
Introduction The priority queue in the data structure is an extension of the “normal” queue. It is an abstract data type that contains a
Read More

by Rohit Sharma

15 Jul 2024

An Overview of Association Rule Mining & its Applications
142458
Association Rule Mining in data mining, as the name suggests, involves discovering relationships between seemingly independent relational databases or
Read More

by Abhinav Rai

13 Jul 2024

Data Mining Techniques & Tools: Types of Data, Methods, Applications [With Examples]
101688
Why data mining techniques are important like never before? Businesses these days are collecting data at a very striking rate. The sources of this eno
Read More

by Rohit Sharma

12 Jul 2024

17 Must Read Pandas Interview Questions & Answers [For Freshers & Experienced]
58119
Pandas is a BSD-licensed and open-source Python library offering high-performance, easy-to-use data structures, and data analysis tools. The full form
Read More

by Rohit Sharma

11 Jul 2024

Top 7 Data Types of Python | Python Data Types
99373
Data types are an essential concept in the python programming language. In Python, every value has its own python data type. The classification of dat
Read More

by Rohit Sharma

11 Jul 2024

What is Decision Tree in Data Mining? Types, Real World Examples & Applications
16859
Introduction to Data Mining In its raw form, data requires efficient processing to transform into valuable information. Predicting outcomes hinges on
Read More

by Rohit Sharma

04 Jul 2024

6 Phases of Data Analytics Lifecycle Every Data Analyst Should Know About
82806
What is a Data Analytics Lifecycle? Data is crucial in today’s digital world. As it gets created, consumed, tested, processed, and reused, data goes
Read More

by Rohit Sharma

04 Jul 2024

Most Common Binary Tree Interview Questions & Answers [For Freshers & Experienced]
10477
Introduction Data structures are one of the most fundamental concepts in object-oriented programming. To explain it simply, a data structure is a par
Read More

by Rohit Sharma

03 Jul 2024

Data Science Vs Data Analytics: Difference Between Data Science and Data Analytics
70274
Summary: In this article, you will learn, Difference between Data Science and Data Analytics Job roles Skills Career perspectives Which one is right
Read More

by Rohit Sharma

02 Jul 2024

Schedule 1:1 free counsellingTalk to Career Expert
icon
footer sticky close icon