Compute sample size and power for a logistic regression in Excel

This tutorial explains how to calculate the sample size and power for a logistic regression in Excel using XLSTAT.

What is the power of a statistical test?

When testing a hypothesis using a statistical test, there are several decisions to take:

The type II error or beta is less studied but is of great importance. In fact, it represents the probability that one does not reject the null hypothesis when it is false. We can not fix it upfront, but based on other parameters of the model we can try to minimize it. The power of a test is calculated as 1-beta and represents the probability that we reject the null hypothesis when it is false.

We therefore wish to maximize the power of the test. XLSTAT calculates the power (and beta) when other parameters are known. For a given power, it also allows to calculate the sample size that is necessary to reach that power.

The statistical power calculations are usually done before the experiment is conducted. The main application of power calculations is to estimate the number of observations necessary to properly conduct an experiment.

Goal of this tutorial

We want to study the relationship between cancer occurence and a diet program. The baseline probability of occurrence of this cancer is 6%. It is assumed that the odds ratio is 2, and that 25% of people in the sample are following this particular diet program.

In this tutorial, we will find out what is the right sample size to perform this study in order to obtain a power of 0.9.

Setting up the sample size calculation for a logistic regression

Once XLSTAT has been launched, click on the Power icon and choose Logistic regression.

Power calculation for Logistic regression in the XLSTAT Menu

Once the button has been clicked, the dialog box pops up.

You must then choose the Find sample size objective.

The alpha is 0.05. The desired power is 0.9.

As explained earlier, we take a reference probability of 0.06 and an odds ratio of 2.

General tab for the power calculation for logistic regression in XLSTAT

In the Chart tab, the simulation plot option is activated and the size of sample 1 will be represented on the vertical axis and the power on the horizontal axis.

Chart tab for the power calculation for logistic regression in XLSTAT

The power varies between 0.8 and 0.95 in intervals of 0.01.

Once you click on the OK button, the calculations are done and then the results are displayed.

Interpret the results of sample size calculations for a logistic regression

The first table gathers the parameters used as input.

Table returned by the sample size calculation for logistic regression in XLSTAT

The second table gathers the results of the calculation as well as an interpretation.

Results returned by the sample size calculation for logistic regression in XLSTAT

We see that 3011 observations are needed per sample to obtain a power as close as possible to 0.9.

The following table gathers the calculations obtained for each value of the power between 0.8 and 0.95.

Sample size depending on power values in XLSTAT for logistic regression


The simulation plot shows the evolution of the sample size as a function of the power. We see that for a power of 0.8, 2190 observations are enough and that for a power of 0.95 we arrive at 3788 observations.

Graph representing the sample size depending on power

XLSTAT is therefore a powerful tool both for finding the sample size required for an analysis and for calculating the power of a test. If we are not sure of our sample parameters, it is also possible to restart the analysis with a different odds ratio or a percentage of N with X1 = 1 which can reassure us about the sample size to choose.

Was this article useful?