The greater the difference between compared quantities and the more observations we have, the more confident we (the t-test) are that the observed differences are not just due to a random chance but are true, statistically significant differences. Even if the means of two populations are different, the t-test might not detect it if the difference or the sample is small. The probability with which the t-test would detect the difference under the given sample size and variability is the power of the test and can be calculated in R. We prefer high power and often use the desired power, confidence level, and expected variability to identify the required sample size.
Calculating The Power Of A Test
Here we look at some examples of calculating the power of a test. The examples are for both normal and t distributions. We assume that you can enter data and know the commands associated with basic probability. All of the examples here are for a two sided test, and you can adjust them accordingly for a one sided test.
Calculating The Power Using a Normal Distribution
Here we calculate the power of a test for a normal distribution for a specific example. Suppose that our hypothesis test is the following:
We can fail to reject the null hypothesis if the sample happens to be within the confidence interval we find when we assume that the null hypothesis is true. To get the confidence interval we find the margin of error and then add and subtract it to the proposed mean, a, to get the confidence interval. We then turn around and assume instead that the true mean is at a different, explicitly specified level, and then find the probability a sample could be found within the original confidence interval.
In the example below the hypothesis test is for
We will assume that the standard deviation is 2, and the sample size is 20. In the example below we will use a 95% confidence level and wish to find the power to detect a true mean that differs from 5 by an amount of 1.5. (All of these numbers are made up solely for this example.) The commands to find the confidence interval in R are the following:
> a <- 5 > s <- 2 > n <- 20 > error <- qnorm(0.975)*s/sqrt(n) > left <- a-error > right <- a+error > left [1] 4.123477 > right [1] 5.876523
Next we find the Z-scores for the left and right values assuming that the true mean is 5+1.5=6.5:
> assumed <- a + 1.5 > Zleft <- (left-assumed)/(s/sqrt(n)) > Zright <-(right-assumed)/(s/sqrt(n)) > p <- pnorm(Zright)-pnorm(Zleft) > p [1] 0.08163792
The probability that we make a type II error if the true mean is 6.5 is approximately 8.1%. So the power of the test is 1-p:
> 1-p [1] 0.918362
In this example, the power of the test is approximately 91.8%. If the true mean differs from 5 by 1.5 then the probability that we will reject the null hypothesis is approximately 91.8%.
Calculating The Power Using a t Distribution
Calculating the power when using a t-test is similar to using a normal distribution. One difference is that we use the command associated with the t-distribution rather than the normal distribution. Here we repeat the test above, but we will assume that we are working with a sample standard deviation rather than an exact standard deviation. We will explore three different ways to calculate the power of a test. The first method makes use of the scheme many books recommend if you do not have the non-central distribution available. The second does make use of the non-central distribution, and the third makes use of a single command that will do a lot of the work for us.
In the example the hypothesis test is the same as above,> a <- 5 > s <- 2 > n <- 20 > error <- qt(0.975,df=n-1)*s/sqrt(n) > left <- a-error > right <- a+error > left [1] 4.063971 > right [1] 5.936029
The number of observations is large enough that the results are quite close to those in the example using the normal distribution. Next we find the t-scores for the left and right values assuming that the true mean is 5+1.5=6.5:
> assumed <- a + 1.5 > tleft <- (left-assumed)/(s/sqrt(n)) > tright <- (right-assumed)/(s/sqrt(n)) > p <- pt(tright,df=n-1)-pt(tleft,df=n-1) > p [1] 0.1112583
The probability that we make a type II error if the true mean is 6.5 is approximately 11.1%. So the power of the test is 1-p:
> 1-p [1] 0.8887417
In this example, the power of the test is approximately 88.9%. If the true mean differs from 5 by 1.5 then the probability that we will reject the null hypothesis is approximately 88.9%. Note that the power calculated for a normal distribution is slightly higher than for this one calculated with the t-distribution.
Another way to approximate the power is to make use of the non-centrality parameter. The idea is that you give it the critical t scores and the amount that the mean would be shifted if the alternate mean were the true mean. This is the method that most books recommend.
> ncp <- 1.5/(s/sqrt(n)) > t <- qt(0.975,df=n-1) > pt(t,df=n-1,ncp=ncp)-pt(-t,df=n-1,ncp=ncp) [1] 0.1111522 > 1-(pt(t,df=n-1,ncp=ncp)-pt(-t,df=n-1,ncp=ncp)) [1] 0.8888478
Again, we see that the probability of making a type II error is approximately 11.1%, and the power is approximately 88.9%. Note that this is slightly different than the previous calculation but is still close.
Finally, there is one more command that we explore. This command allows us to do the same power calculation as above but with a single command.
> power.t.test(n=n,delta=1.5,sd=s,sig.level=0.05, type="one.sample",alternative="two.sided",strict = TRUE) One-sample t test power calculation n = 20 delta = 1.5 sd = 2 sig.level = 0.05 power = 0.8888478 alternative = two.sided
This is a powerful command that can do much more than just calculate the power of a test. For example it can also be used to calculate the number of observations necessary to achieve a given power. For more information check out the help page, help(power.t.test).
Calculating Many Powers From a t Distribution
Suppose that you want to find the powers for many tests. This is a common task and most software packages will allow you to do this. Here we see how it can be done in R. We use the exact same cases as in the previous chapter.
Here we assume that we want to do a two-sided hypothesis test for a number of comparisons and want to find the power of the tests to detect a 1 point difference in the means. In particular we will look at three hypothesis tests. All are of the following form:We have three different sets of comparisons to make:
Comparison 1 | |||
Mean | Std. Dev. | Number (pop.) | |
Group I | 10 | 3 | 300 |
Group II | 10.5 | 2.5 | 230 |
Comparison 2 | |||
Mean | Std. Dev. | Number (pop.) | |
Group I | 12 | 4 | 210 |
Group II | 13 | 5.3 | 340 |
Comparison 3 | |||
Mean | Std. Dev. | Number (pop.) | |
Group I | 30 | 4.5 | 420 |
Group II | 28.5 | 3 | 400 |
For each of these comparisons we want to calculate the power of the test. For each comparison there are two groups. We will refer to group one as the group whose results are in the first row of each comparison above. We will refer to group two as the group whose results are in the second row of each comparison above. Before we can do that we must first compute a standard error and a t-score. We will find general formulae which is necessary in order to do all three calculations at once.
We assume that the means for the first group are defined in a variable called m1. The means for the second group are defined in a variable called m2. The standard deviations for the first group are in a variable called sd1. The standard deviations for the second group are in a variable called sd2. The number of samples for the first group are in a variable called num1. Finally, the number of samples for the second group are in a variable called num2.
With these definitions the standard error is the square root of (sd1^2)/num1+(sd2^2)/num2. The R commands to do this can be found below:
> m1 <- c(10,12,30) > m2 <- c(10.5,13,28.5) > sd1 <- c(3,4,4.5) > sd2 <- c(2.5,5.3,3) > num1 <- c(300,210,420) > num2 <- c(230,340,400) > se <- sqrt(sd1*sd1/num1+sd2*sd2/num2)
To see the values just type in the variable name on a line alone:
> m1 [1] 10 12 30 > m2 [1] 10.5 13.0 28.5 > sd1 [1] 3.0 4.0 4.5 > sd2 [1] 2.5 5.3 3.0 > num1 [1] 300 210 420 > num2 [1] 230 340 400 > se [1] 0.2391107 0.3985074 0.2659216
Now we need to define the confidence interval around the assumed differences. Just as in the case of finding the p values in previous chapter we have to use the pmin command to get the number of degrees of freedom. In this case the null hypotheses are for a difference of zero, and we use a 95% confidence interval:
> left <- qt(0.025,df=pmin(num1,num2)-1)*se > right <- -left > left [1] -0.4711382 -0.7856092 -0.5227825 > right [1] 0.4711382 0.7856092 0.5227825
We can now calculate the power of the one sided test. Assuming a true mean of 1 we can calculate the t-scores associated with both the left and right variables:
> tl <- (left-1)/se > tr <- (right-1)/se > tl [1] -6.152541 -4.480743 -5.726434 > tr [1] -2.2117865 -0.5379844 -1.7945799 > probII <- pt(tr,df=pmin(num1,num2)-1) - pt(tl,df=pmin(num1,num2)-1) > probII [1] 0.01398479 0.29557399 0.03673874 > power <- 1-probII > power [1] 0.9860152 0.7044260 0.9632613
The results from the command above should give you the p-values for a two-sided test. It is left as an exercise how to find the p-values for a one-sided test.
Just as was found above there is more than one way to calculate the power. We also include the method using the non-central parameter which is recommended over the previous method:
> t <- qt(0.975,df=pmin(num1,num2)-1) > t [1] 1.970377 1.971379 1.965927 > ncp <- (1)/se > pt(t,df=pmin(num1,num2)-1,ncp=ncp)-pt(-t,df=pmin(num1,num2)-1,ncp=ncp) [1] 0.01374112 0.29533455 0.03660842 > 1-(pt(t,df=pmin(num1,num2)-1,ncp=ncp)-pt(-t,df=pmin(num1,num2)-1,ncp=ncp)) [1] 0.9862589 0.7046655 0.9633916
Source: K. Black, https://www.cyclismo.org/tutorial/R/power.html# This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.