Hypothesis testing is a crucial part of any statistical analysis. However, several things need to be predefined so that the test we conduct is as reliable as possible. This is where the concept of power comes into play and defines the heuristics of a statistical test.

By the end of this tutorial, you will know:

- Heuristics of Statistical Tests
- What is the Power of a test?
- What is the need for Power Analysis?
- How to carry out Power Analysis

**Heuristics of Statistical Tests**

Carrying out a correct statistical test depends upon several heuristics which need to be set before conducting the test. It is highly important to set the right heuristics, as they cannot be changed once the test has started. Let’s have a look at a few of them.

**1. Significance Level and Confidence Interval**

Before starting any statistical test, a probability threshold needs to be set. This threshold is the significance level, alpha (α); the value of the test statistic at which the threshold is crossed is called the Critical Value. The region under the probability curve beyond the critical value is called the Critical Region.

The alpha value tells us how far the sample statistic (or the experimental result) must be from the value stated by the null hypothesis (the original mean) before we conclude that it is unusual enough to reject the null hypothesis. A commonly used alpha value is 0.05, which corresponds to a 95% confidence level.

**2. P-Value**

To evaluate whether the test results we got are statistically significant or not, we compare the significance level (alpha) that we set before the test with the P-Value of the test. The p-value is the probability, assuming the null hypothesis is true, of getting values as extreme as, or more extreme than, the value we observed.
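As a minimal sketch of this comparison, here is a two-sided p-value for a one-sample z-test. All the numbers (null mean, population SD, sample statistics) are made up for illustration; only the formula matters.

```python
# Sketch: two-sided p-value for a one-sample z-test.
# mu0, sigma, sample_mean and n are illustrative values, not from the article.
import math

mu0, sigma = 100.0, 15.0       # null-hypothesis mean and known population SD
sample_mean, n = 108.0, 25     # observed sample mean and sample size

z = (sample_mean - mu0) / (sigma / math.sqrt(n))   # test statistic
# Two-sided p-value: P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal Z
p_value = math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05
print(f"z = {z:.3f}, p = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Here the p-value falls below alpha, so the result would be called statistically significant.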


**3. Type 1 & Type 2 Errors**

Statistical tests can never be 100% certain. There is always room for error and for being misled by the results. As discussed above, if we set an alpha value of 0.05, we work at a 95% confidence level; there is still a 5% chance that the significant-looking result you’ve got is incorrect and misleading. These incorrect conclusions are what we call errors, and there are two types: Type 1 and Type 2.

A significance level of 0.05 means that even when the null hypothesis is true, there is a 5% chance your test will reject it. That is the case of rejecting the null hypothesis when it was actually correct, a false positive. This is an example of a Type 1 Error, and we can say that alpha (**α**) is the probability of committing a Type 1 error.

It can also happen that you conclude the null hypothesis is true, or “accept” it, when it is actually false. (Technically, we can never accept the null hypothesis; we can only fail to reject it.) This is what we call a Type 2 Error, and the probability of making a Type 2 error is given by Beta (**β**).
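A quick simulation makes the Type 1 error rate concrete: if we repeatedly test a null hypothesis that is true by construction, the fraction of tests that wrongly reject it should converge to alpha. The distribution and parameters below are assumptions chosen just for the demonstration.

```python
# Sketch: when the null hypothesis is true, the fraction of tests that
# (wrongly) reject it converges to alpha -- the Type 1 error rate.
import math
import random

random.seed(0)
alpha, n, trials = 0.05, 30, 20000
false_rejections = 0

for _ in range(trials):
    # Draw a sample from the null distribution itself (H0 is true by construction)
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(sample) / n
    z = mean / (1.0 / math.sqrt(n))             # known sigma = 1
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value
    if p < alpha:
        false_rejections += 1

print(f"empirical Type 1 error rate: {false_rejections / trials:.3f}")  # close to 0.05
```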


**What is the Power of a Statistical Test?**

Power of a test is the probability of correctly rejecting the Null Hypothesis when it is false. In other words, power is inversely related to the probability of making a Type 2 error, so Power = 1 − **β**. For example, if we set the power to 80%, we mean that when the null hypothesis really is false, our test will correctly reject it 80% of the time. Therefore, the higher the power, the lower the probability of committing a Type 2 error.

But why can the results be bogus? Because we are dealing with random samples. Sometimes the sample drawn happens to lie far from the mean of the distribution and hence gives unrepresentative results, forcing us to make incorrect decisions. The whole aim of Power Analysis is to prevent us from making these incorrect decisions.
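Power can also be estimated by simulation: make the null hypothesis false by construction, run many tests, and count the fraction of correct rejections. The true effect size and all parameters below are assumed values for illustration only.

```python
# Sketch: estimating power (1 - beta) by simulation. The true mean is shifted
# away from the null, so every rejection here is a *correct* rejection.
import math
import random

random.seed(1)
alpha, n, trials = 0.05, 25, 10000
true_shift = 0.6          # assumed true effect (in SD units), made up for illustration

correct_rejections = 0
for _ in range(trials):
    sample = [random.gauss(true_shift, 1.0) for _ in range(n)]
    z = (sum(sample) / n) / (1.0 / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value
    if p < alpha:
        correct_rejections += 1

power = correct_rejections / trials
print(f"estimated power: {power:.3f}, estimated beta: {1 - power:.3f}")
```

With this particular shift and sample size the estimated power lands around 0.85; shrinking the shift or the sample size lowers it.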

**Are we P-Hacking?**

Let’s take up an example where we have made a vaccine for COVID-19 and we are quite sure the vaccine will have significant results. We proceed to conduct a statistical test to see whether our belief holds true statistically as well. So we set alpha to 0.05 and carry out a test using 100 samples.

After the test, we get a P-value of 0.06. It is close to our alpha but not less than it, so we cannot safely reject the null hypothesis. It gets tempting to see what happens if we increase the number of samples and redo the test.

So we add 50 more samples and see that the P-Value now comes out as 0.045. Did we just prove our vaccine to be statistically significant? No! We just P-hacked, because we increased the number of samples after seeing the first result. Learn more about What is P-Hacking & How To Avoid It?
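This kind of optional stopping can be demonstrated by simulation: even when the null hypothesis is true in every trial, a “peek at the p-value, then add more samples and re-test” procedure rejects more often than the nominal alpha. The sample sizes mirror the 100-then-add-50 story above; everything else is an assumed setup.

```python
# Sketch: why adding samples after peeking at the p-value ("p-hacking")
# inflates the false-positive rate. H0 is true in every trial, yet the
# two-look procedure rejects noticeably more often than alpha = 0.05.
import math
import random

random.seed(2)
alpha, trials = 0.05, 10000

def p_value(sample):
    n = len(sample)
    z = (sum(sample) / n) / (1.0 / math.sqrt(n))   # known sigma = 1
    return math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value

rejections = 0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(100)]
    if p_value(sample) < alpha:
        rejections += 1                            # first look
    else:
        sample += [random.gauss(0.0, 1.0) for _ in range(50)]
        if p_value(sample) < alpha:
            rejections += 1                        # second look after adding data

print(f"false-positive rate with two looks: {rejections / trials:.3f}")  # > 0.05
```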


**What is Power Analysis?**

As we saw in the above example, we felt the sample size was too small and increased it later. This is wrong and should never be done. The sample size should be fixed before starting the test itself. But what sample size is right for us?

Let’s consider an example where we carry out multiple tests using a sample size of just 1. When we sample 1 data point randomly from the population, it can either lie around the mean, correctly representing our data, or it can lie far away from the mean and represent the data poorly.

The issue arises when we conduct statistical tests using these far-off data points: the P-value we get will be misleading. Now suppose we conduct another series of tests taking 2 as the sample size. Even if one value is far from the data mean, the other value, likely closer to the centre of the distribution, will pull their average back towards it, reducing the effect of the far-off value. Therefore, with a sample size of 2, our results will be more reliable, with more trustworthy P-values.
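This averaging effect can be checked with a quick simulation, assuming standard-normal data: the spread (standard deviation) of the sample mean shrinks as 1/√n, so larger samples produce fewer wildly unrepresentative averages.

```python
# Sketch: why averaging even two points tames extreme draws -- the standard
# error of the sample mean shrinks as 1/sqrt(n). Data is assumed standard normal.
import math
import random

random.seed(3)

def sd_of_sample_means(n, reps=20000):
    """Empirical standard deviation of the mean of n standard-normal draws."""
    means = [sum(random.gauss(0.0, 1.0) for _ in range(n)) / n for _ in range(reps)]
    mu = sum(means) / reps
    return math.sqrt(sum((m - mu) ** 2 for m in means) / reps)

for n in (1, 2, 4, 16):
    print(f"n={n:2d}: spread of sample mean = {sd_of_sample_means(n):.3f} "
          f"(theory: {1 / math.sqrt(n):.3f})")
```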

Power Analysis is the technique used to find the right sample size needed to conduct a test well. The higher the power we need, the larger the sample size required. So you might ask: why not just take a very large sample, since a large sample means better and more trustworthy results? Because collecting data is costly, knowing the sample size actually required is essential.

**Power Analysis In Statistics: Why Is It Needed?**

Researchers can prevent both Type 1 and Type 2 errors by using **the power of a statistical test**. If a doctor told a man he was pregnant, that would be a Type 1 error: a false positive, rejecting the null hypothesis (“not pregnant”) when it is actually true. Telling a pregnant lady she is not pregnant is a Type 2 error, since it is a false negative. Both mistakes can be very troublesome.

Type 2 errors are reported to be likely in business research, with a rate of around 45% for medium effect sizes and higher still for small ones. The unintended consequences can be disastrous for, say, a streaming service or a cosmetics business.

In the cosmetics industry, with a null hypothesis that a formulation is safe, a Type 1 error means declaring a safe formulation harmful, which wastes all the resources, time, and funds invested in the study and formulation. A Type 2 error means declaring a harmful formulation safe, releasing a dangerous product that causes skin rashes.

**How to carry out Power Analysis?**

The power of a test depends on a few factors. The first step in carrying out a power analysis is to set a power value. Suppose you set a common power of 0.8, meaning you want at least an 80% chance of correctly rejecting the null hypothesis. If we are validating the effect of a COVID-19 vaccine on a set of people, we want to show that the distribution of data points for vaccinated people differs from that for people given a placebo.

**1. Amount of overlap**

We need to consider the amount of overlap between the two distributions we are comparing. The more they overlap, the harder it is to safely reject the null, and hence the larger the sample size we need. If the overlap is very small, we can reject the null quite easily and require a much smaller sample. Overlap depends on the distance between the means of the two distributions and on their standard deviations.

**2. Effect size**

Effect size is a way to combine the difference between the means and the standard deviations of the populations into a single number. The effect size (d) is calculated as the estimated difference between the means divided by the pooled estimated standard deviation. One of the simplest pooled estimates is the square root of the average of the two squared standard deviations: pooled SD = √((s₁² + s₂²) / 2).
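The calculation above can be sketched in a few lines. The two samples below (e.g. measurements from a vaccinated group and a placebo group) are invented purely for illustration.

```python
# Sketch: Cohen's d from two samples, using the simple pooled-SD formula
# from the text: sqrt((sd1^2 + sd2^2) / 2). All sample values are made up.
import math

def cohens_d(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    pooled_sd = math.sqrt((var_a + var_b) / 2)      # simple pooled estimate
    return (mean_a - mean_b) / pooled_sd

vaccinated = [4.1, 3.8, 4.5, 4.0, 3.6, 4.2]   # illustrative measurements
placebo    = [3.2, 3.5, 2.9, 3.4, 3.1, 3.3]
print(f"Cohen's d = {cohens_d(vaccinated, placebo):.2f}")
```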

So once we have the power value, the alpha value, and the effect size, we can plug these values into a statistical power calculator and get the required sample size. Such power calculators are easily available on the internet.
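A minimal version of such a calculator can be written with the standard normal-approximation formula for a two-sample comparison, n per group = 2 · ((z₁₋α/₂ + z₁₋β) / d)². This is a sketch, not a library implementation; dedicated t-test calculators (e.g. statsmodels’ `TTestIndPower`) give slightly larger answers because they use the t distribution.

```python
# Sketch: a normal-approximation power calculator for a two-sample test.
# n per group = 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2
import math

def z_quantile(p):
    """Standard-normal quantile via bisection on the CDF (stdlib only)."""
    cdf = lambda x: 0.5 * math.erfc(-x / math.sqrt(2))
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    z_alpha = z_quantile(1 - alpha / 2)   # two-sided critical value
    z_beta = z_quantile(power)            # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

print(sample_size_per_group(0.8))   # large effect  -> 25
print(sample_size_per_group(0.2))   # small effect  -> 393
```

Note how a small effect size demands a far larger sample than a large one, which is exactly why the effect size must be estimated before the test.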


**Advantages of Power Analysis In Statistics**

**Statistical Prominence**

**Power analysis in statistics** provides the obvious advantage of verifying that a research study’s findings have statistical significance, that is, that the conclusion reached is unlikely to be explained by chance alone.

**Minimal Harm or Risk**

Recognizing that a study’s findings are probably accurate reduces the potential for harm to users, particularly if the findings concern real people. High power greatly reduces the chance that the study misses a real effect (a Type 2 error), while the chosen significance level keeps the Type 1 error rate in check.

**Accountability For Error Inside a Sample Population**

If a business intends to launch a new feature, it can run a test and be reasonably certain that the outcome is accurate. Power analysis considers populations and subgroups within larger groups, giving businesses more control over the outcome.

The company could make an expensive error if, for example, a section of the population is not included, the number of surveys is insufficient, or the range of questions is too narrow.

**Factors To Consider For Power Analysis In Statistics**

Before beginning any kind of inquiry, it is necessary to evaluate the following three factors that power analysis considers:

**Sampling Population**

Typically, **statistical power calculations** of the sample size assume a normal Gaussian (bell-curve-shaped) population distribution. In intricate evaluations and designs such as stratified random sampling, subpopulation differences must be considered explicitly; otherwise, variation between the subpopulations cannot be accounted for.

**Sample Size**

The required **power analysis for sample size** depends on the type of statistical analysis. Only a “reasonable” sample size is needed for descriptive statistics, but a larger sample size becomes essential for multiple regression or log-linear analysis.

**Error-Rate Compensation**

The sample size must, of course, satisfy the criteria above. Beyond that, it must also be large enough to account for individuals that researchers later eliminate from the sample. This happens when:

- Mistakes were made in recording the outcomes
- The experiment wasn’t conducted with proper caution and attention
- The samples are extreme outliers


**Before you go**

We calculated the sample size by carrying out Power Analysis using power, alpha, and effect size. So if we get a sample size value of 7, it means we need a sample of 7 to have an 80% chance of correctly rejecting the Null Hypothesis. The right amount of domain expertise is also crucial, for estimating the population means, their overlap, and the power required.

If you are curious to learn about data science, check out IIIT-B & upGrad’s PG Diploma in Data Science which is created for working professionals and offers 10+ case studies & projects, practical hands-on workshops, mentorship with industry experts, 1-on-1 with industry mentors, 400+ hours of learning and job assistance with top firms.