Cornell University Ergonomics Web
CUErgo Statistics Helper: Sample Size

Sample Size Calculations

Before running an experiment or starting a study, it is important to think about the sample size you will need to detect the effect you are interested in. If the sample size is too small, you might not detect a significant effect in the analysis even if that effect is truly present. Indeed, when an effect turns out not to be significant, an important question to ask before drawing conclusions is whether the effect was truly absent or whether you simply could not detect it because of insufficient data.

For a particular study, the required sample size is jointly determined by four components: the power, the alpha-level, the smallest meaningful difference, and the variability expected in the data you collect. The alpha-level is the probability of rejecting the null hypothesis when it is actually true. That is, it is the probability of incorrectly finding a significant effect. The power of a study is the probability of rejecting the null hypothesis when the alternative hypothesis is true. That is, it is the probability of correctly finding a significant effect. The smallest meaningful difference is the smallest difference that you would judge to be of substantial importance in your study. For example, in studying the effect of different nutrition supplements on the weight gain of young children, would a weight increase of 10 grams after the treatment be of substantial importance, or would only a weight increase of 200 grams be of interest? All else being equal, the required sample size grows as the desired power increases, as the alpha-level decreases, as the smallest meaningful difference shrinks, and as the variability increases.
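To make the interplay of these four components concrete, here is one common textbook approximation, offered as a sketch rather than as the method any particular package uses. It assumes a two-sided comparison of the means of two independent groups of equal size, a continuous outcome with the same standard deviation in both groups, and a normal approximation to the test statistic. The symbols are introduced here for illustration and do not appear in the original article: n is the sample size per group, sigma the standard deviation of the outcome, Delta the smallest meaningful difference, alpha the alpha-level, 1 - beta the power, and z_p the p-th quantile of the standard normal distribution.

```latex
n \;\approx\; \frac{2\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\,\sigma^{2}}{\Delta^{2}}
```

Under these assumptions, halving the smallest meaningful difference quadruples the required sample size, and so does doubling the standard deviation.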

The challenge in sample size calculations is to properly choose or estimate each of these four components. For the power, one usually chooses a standard level of 80 to 90%, and for the alpha-level, one usually chooses 0.05. You can obtain an estimate of the variability to expect in your experiment or study from previous experiments or from the literature. Choosing the smallest meaningful difference that you would like to detect is often the hardest part and mostly requires thoughtful consideration on your part.
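As a rough illustration of how these choices translate into a number, the following Python sketch applies the standard normal-approximation formula for comparing the means of two equal-sized independent groups with a two-sided test, n per group = 2 (z_(1-alpha/2) + z_power)^2 (sigma / delta)^2. The function names, and the standard deviation of 400 grams used in the example, are hypothetical and chosen only for illustration; the 200-gram difference echoes the weight-gain example above. Dedicated power analysis software typically uses the t distribution and may report a slightly larger sample size.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two means.

    Assumes equal group sizes, a common standard deviation `sigma`,
    and a normal approximation to the test statistic.
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)  # round up to a whole number of subjects


def approximate_power(n, delta, sigma, alpha=0.05):
    """Approximate power with n subjects per group under the same assumptions.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma * math.sqrt(2 / n)  # SE of the difference in means
    return NormalDist().cdf(delta / standard_error - z_alpha)


if __name__ == "__main__":
    # Hypothetical inputs: detect a 200 g difference when the SD is 400 g.
    print(sample_size_per_group(delta=200, sigma=400, power=0.80))  # 63
    print(sample_size_per_group(delta=200, sigma=400, power=0.90))  # 85
    print(round(approximate_power(n=20, delta=200, sigma=400), 2))
```

With these hypothetical inputs, raising the power from 80% to 90% increases the requirement from 63 to 85 subjects per group, and the last line shows how much power a study with only 20 subjects per group would have under the same assumptions.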

Many commonly used statistical packages, such as Minitab, SAS and STATA, offer sample size calculations and power analysis for basic designs such as t-tests, comparisons of proportions, or analysis of variance. These are usually straightforward to use. For more sophisticated designs, such as repeated measures, regression or survival analysis, several stand-alone packages can calculate the required sample size or the power. A comprehensive list of power analysis software can be found at http://www.forestry.ubc.ca/conservation/power/index.html. A review article comparing the existing software can be found at http://www.zoology.ubc.ca/~krebs/power.html.

At the Office of Statistical Consulting, besides the standard statistical packages, we have a comprehensive power analysis software package that you are welcome to use.

If you need access to power analysis software or would like any additional input or help, feel free to contact us.

Author: Francoise Vermeylen (fmv1@cornell.edu), October 2000.

(This article was reproduced with permission from the newsletter that is distributed to faculty and graduate students in the Division of Nutritional Sciences and the College of Human Ecology, and faculty in the College of Agriculture and Life Sciences, by the Office of Statistical Consulting, Cornell University. Please forward it to any interested colleagues, students, and research staff. Anyone not receiving this newsletter who would like to be added to the mailing list for future newsletters should contact Francoise Vermeylen at fmv1@cornell.edu. Information about the Office of Statistical Consulting and copies of previous newsletters can be obtained at World Wide Web address http://www.human.cornell.edu/Admin/StatCons/.)
