Sample Size Determination in an Epidemiologic Study using the EpiTools Web-Based Calculator

Romeo L. Villarta, Jr. and Abubakar S. Asaad

Department of Epidemiology and Biostatistics, College of Public Health,
University of the Philippines Manila

Introduction

The adequacy of sample size is one of the important issues to be considered in a research study design.1 Estimation of the sample size (including the assumptions made in the calculations) should be an integral part of the research proposal. Reviews of various published studies have shown that sample size calculations were not reported properly.2,3,4,5 This raises questions about the validity of the results of these studies. One possible reason for this improper reporting is that some decisions regarding sample size computation are made arbitrarily on the basis of convenience, available resources or the number of easily available subjects.6

Sample size is the number of observations or subjects selected in a research study. The sample size should be adequate to show that the magnitude of the effect being measured is scientifically as well as statistically significant.7 An inadequate sample may result in reliability and validity issues. An oversized sample may waste resources, expose an unnecessarily large number of subjects to a potentially harmful intervention, or deny or delay the application of a potentially beneficial intervention.8,9,10

For quantitative research studies, it is assumed that subjects or observations are selected by probability sampling. For each study design, there is a corresponding statistical method of sample size calculation. A biostatistician may not always be available to assist the researcher in estimating sample size. However, free web-based sample size calculators offer the researcher another option for calculating an adequate sample size for relatively simple research study designs. In this paper, the use of EpiTools,11 a free web-based sample size calculator, is reviewed.

Methods

EpiTools is a collection of epidemiological calculators created by AusVet Animal Health Services with the support of the Australian Biosecurity Cooperative Research Centre for Emerging Infectious Disease. It provides a range of epidemiological tools for researchers and epidemiologists. This web-based tool is available at http://epitools.ausvet.com.au.11

As of this review, EpiTools has sample size calculators for eight research designs:

  1. Single Proportion
  2. Single Mean
  3. Two Proportions
  4. Two Means with equal sample size and equal variances
  5. Two Means with unequal sample size and unequal variances
  6. True Prevalence
  7. Cohort Study
  8. Case-Control Study

In using the EpiTools sample size calculators, certain assumptions must be entered prior to calculation. Among the more common assumptions that have to be identified are the level of confidence, the desired precision of the estimate and the power of the test.

There are other assumptions needed per sample size calculation for each study design. The estimates for these assumptions may be based on previous studies, pilot studies or the data expected by the researcher.

Results

1. Single Proportion

This utility calculates the sample size required to estimate a proportion (prevalence) with a specified level of confidence and precision.
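The calculation described above can be illustrated with the standard normal-approximation formula n = z^2 p(1 - p) / d^2. The following is a minimal sketch in Python, assuming this textbook formula; the function name and the input values are hypothetical, and the formula is not confirmed to be the one implemented by EpiTools.

```python
import math
from statistics import NormalDist


def n_single_proportion(p, precision, confidence=0.95):
    """Sample size to estimate a proportion p to within +/- precision."""
    # Two-sided standard normal quantile for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Normal approximation: n = z^2 * p * (1 - p) / d^2, rounded up.
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)


# Hypothetical inputs: expected prevalence 50%, precision +/-5%, 95% confidence.
print(n_single_proportion(0.5, 0.05))
```

An assumed prevalence of 50% gives the largest sample size for a given precision, so it is a conservative choice when no prior estimate is available.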

2. Single Mean

This utility calculates the sample size required to estimate a population mean with a specified level of confidence and precision.
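For a single mean, the usual textbook formula is n = (z s / d)^2, where s is the assumed standard deviation and d is the desired precision. The sketch below assumes this formula with a known standard deviation; the function name and inputs are hypothetical, and the result may differ from a calculator that applies a different correction.

```python
import math
from statistics import NormalDist


def n_single_mean(sd, precision, confidence=0.95):
    """Sample size to estimate a mean to within +/- precision."""
    # Two-sided standard normal quantile for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * sd / d)^2, rounded up to the next whole subject.
    return math.ceil((z * sd / precision) ** 2)


# Hypothetical inputs: standard deviation 15 units, precision +/-2 units.
print(n_single_mean(15, 2))
```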

3. Two Proportions

This utility calculates the sample size required to detect a statistically significant difference between two proportions with specified levels of confidence and power. A summary table of sample sizes for a range of assumed prevalence values is also reported.
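A common formula for comparing two proportions with equal group sizes, without continuity correction, combines a pooled variance under the null hypothesis with the separate variances under the alternative. The sketch below assumes this formula and a two-sided test; the function name and inputs are hypothetical, and it is not confirmed that EpiTools uses the same expression.

```python
import math
from statistics import NormalDist


def n_two_proportions(p1, p2, confidence=0.95, power=0.80):
    """Sample size per group to detect a difference between p1 and p2."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    # Null-hypothesis term uses the pooled variance, alternative term
    # uses the separate variances of the two groups.
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical inputs: 30% versus 20%, 95% confidence, 80% power.
print(n_two_proportions(0.30, 0.20))
```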

4. Two Means with equal sample size and equal variances

This utility calculates the sample size required to detect a statistically significant difference between two sample means with specified levels of confidence and power, assuming equal sample sizes and equal variances.
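With equal group sizes and a common standard deviation s, the usual normal-approximation formula is n = 2 s^2 (z_alpha + z_beta)^2 / delta^2 per group, where delta is the difference in means to be detected. The sketch below assumes this formula and a two-sided test; names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_two_means_equal(sd, difference, confidence=0.95, power=0.80):
    """Sample size per group, assuming equal group sizes and variances."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    # n = 2 * sd^2 * (z_alpha + z_beta)^2 / difference^2, rounded up.
    return math.ceil(2 * sd ** 2 * (z_alpha + z_beta) ** 2 / difference ** 2)


# Hypothetical inputs: common SD 10, difference to detect 5 units.
print(n_two_means_equal(10, 5))
```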

5. Two Means with unequal sample size and unequal variances

This utility calculates the sample size required to detect a statistically significant difference between two sample means with specified levels of confidence and power, assuming unequal variances and allowing for unequal sample sizes between groups.
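When the variances and group sizes differ, the variance of the difference in means is s1^2/n1 + s2^2/n2. Writing n2 = k n1 gives n1 = (s1^2 + s2^2/k)(z_alpha + z_beta)^2 / delta^2. The sketch below assumes this normal-approximation formula; the function name, the ratio parameter k and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_two_means_unequal(sd1, sd2, difference, ratio=1.0,
                        confidence=0.95, power=0.80):
    """Return (n1, n2), where ratio is the allocation ratio n2 / n1."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    # Solve sd1^2/n1 + sd2^2/(ratio*n1) = difference^2 / (z_alpha+z_beta)^2.
    n1 = math.ceil(
        (sd1 ** 2 + sd2 ** 2 / ratio) * (z_alpha + z_beta) ** 2
        / difference ** 2
    )
    n2 = math.ceil(ratio * n1)
    return n1, n2


# Hypothetical inputs: SDs of 10 and 20, difference 5, twice as many in group 2.
print(n_two_means_unequal(10, 20, 5, ratio=2))
```

With equal standard deviations and a ratio of 1, this expression reduces to the equal-variance, equal-size formula.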

6. True prevalence with an imperfect test

This calculator is useful in estimating sample size for diagnostic test studies. The utility calculates the sample size required to estimate true prevalence with a specified level of confidence and precision, assuming a test with imperfect sensitivity and/or specificity. Tables of sample sizes for a range of values for prevalence and precision and for sensitivity and specificity are also produced.
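One way to account for an imperfect test is to compute the apparent prevalence AP = Se P + (1 - Sp)(1 - P) and to inflate the single-proportion sample size by 1 / (Se + Sp - 1)^2. The sketch below assumes this approach, with sensitivity and specificity treated as known constants; it is an illustration of the idea and is not confirmed to be the formula used by EpiTools. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_true_prevalence(prevalence, se, sp, precision, confidence=0.95):
    """Sample size to estimate true prevalence with an imperfect test."""
    if se + sp <= 1:
        raise ValueError("sensitivity + specificity must exceed 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Apparent prevalence: true positives plus false positives.
    apparent = se * prevalence + (1 - sp) * (1 - prevalence)
    # Variance of the apparent prevalence, scaled by 1 / (Se + Sp - 1)^2.
    n = (z / precision) ** 2 * apparent * (1 - apparent) / (se + sp - 1) ** 2
    return math.ceil(n)


# Hypothetical inputs: prevalence 10%, Se 90%, Sp 95%, precision +/-5%.
print(n_true_prevalence(0.10, 0.90, 0.95, 0.05))
```

With a perfect test (Se = Sp = 1), the expression reduces to the single-proportion formula.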

7. Cohort Study

This utility calculates the sample size required for a cohort study, with specified levels of confidence and power and cohorts of equal size. A summary table of sample sizes for a range of assumed incidence values and relative risks is also reported.

Example: Two competing therapies for a particular cancer are to be evaluated by a cohort study in a multicenter clinical trial. Patients will be randomized to either treatment A or treatment B and will be followed for 5 years after treatment for recurrence of the disease. Treatment A is the old therapy while treatment B is the new therapy. Treatment B will be widely used if it can be demonstrated that the relative risk of recurrence of A compared to B in the first 5 years after treatment is 2.0. Patients receiving treatment B have a 35% recurrence rate. How many patients should be studied in each of the two treatment groups if the investigators wish to be 90% confident of correctly rejecting the null hypothesis?
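The example above can be worked through with a standard formula for testing a relative risk in a cohort study with equal group sizes, as given in references such as Lemeshow et al.1 The sketch below makes two assumptions that are not stated in the example: a two-sided test at the 5% level of significance, and no continuity correction. Treatment B is taken as the reference group (35% recurrence) and treatment A as the comparison group (relative risk 2.0, hence 70% recurrence). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_cohort(p_reference, relative_risk, confidence=0.95, power=0.80):
    """Sample size per cohort to detect a given relative risk."""
    p_exposed = relative_risk * p_reference
    if not 0 < p_exposed < 1:
        raise ValueError("implied risk in the comparison group must be in (0, 1)")
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_exposed + p_reference) / 2  # pooled risk under the null
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_exposed * (1 - p_exposed)
                             + p_reference * (1 - p_reference))
    ) ** 2
    return math.ceil(numerator / (p_exposed - p_reference) ** 2)


# Example inputs: 35% recurrence on treatment B, relative risk 2.0, 90% power.
# The 95% confidence level (5% significance) is an assumption of this sketch.
print(n_cohort(0.35, 2.0, confidence=0.95, power=0.90))
```

Under these assumptions the sketch gives 41 patients per treatment group. A calculator that applies a continuity correction or a one-sided test would return a different figure.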

8. Case-Control Study

This utility calculates the sample size required for a case-control study, with specified levels of confidence and power and case and control groups of equal size. A summary table of sample sizes for a range of assumed proportions exposed and odds ratios is reported.
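For an unmatched case-control study, the proportion of cases exposed can be derived from the proportion of controls exposed (p0) and the odds ratio (OR) as p1 = OR p0 / (1 + p0 (OR - 1)), after which the two-proportion formula applies. The sketch below assumes this approach with a two-sided test and no continuity correction; the function name and inputs are hypothetical, and the formula is not confirmed to match the EpiTools implementation.

```python
import math
from statistics import NormalDist


def n_case_control(p_controls_exposed, odds_ratio,
                   confidence=0.95, power=0.80):
    """Sample size per group (cases and controls) to detect an odds ratio."""
    p0 = p_controls_exposed
    # Proportion of cases exposed implied by the odds ratio.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2  # pooled exposure proportion under the null
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Hypothetical inputs: 30% of controls exposed, odds ratio 2.0.
print(n_case_control(0.30, 2.0))
```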

In summary, sample sizes for several simple epidemiologic study designs were calculated under different assumptions. The calculator was straightforward to use, and the results were returned quickly. Sample sizes computed under different assumptions may be compared to assist in evaluating research project feasibility. The input data and output of the sample size calculation may be transformed into a report for inclusion in the written research proposal.

One issue that arises from using web-based or online sample size calculators is the question of their accuracy.12 Further studies should be conducted to evaluate the accuracy of the results of EpiTools and other web-based sample size calculators, including comparison with results from other web-based calculators and from standard sample size formulas.

Conclusion

The EpiTools web-based calculator is a convenient tool for sample size determination in the design of research protocols with relatively simple study designs. It may be used to evaluate the feasibility of the sample size required by the study design during the preparation of the research proposal.

References:

  1. Lemeshow S, Hosmer D, Klar J, Lwanga S. Adequacy of Sample Size in Health Studies. Published on Behalf of the World Health Organization. Chichester: John Wiley & Sons; 1990.
  2. Charles P, Giraudeau B, Dechartres A, Baron G, Ravaud P. Reporting of sample size calculation in randomised controlled trials: review. BMJ. 2009; 338:b1732.
  3. Chan AW, Hróbjartsson A, Jørgensen KJ, Gøtzsche PC, Altman DG. Discrepancies in sample size calculations and data analyses reported in randomised trials: comparison of publications with protocols. BMJ. 2008; 337:a2299.
  4. Rutterford C, Taljaard M, Dixon S, Copas A, Eldridge S. Inadequate reporting of sample size calculations in cluster randomised trials: a review. Trials. 2013; 14(Suppl1):122.
  5. Abdul Latif L, Daud Amadera JE, Pimentel D, Pimentel T, Fregni F. Sample size calculation in physical medicine and rehabilitation: a systematic review of reporting, characteristics, and results in randomized controlled trials. Arch Phys Med Rehabil. 2011; 92(2):306-15.
  6. Whitley E, Ball J. Statistics review 4: Sample size calculations. Crit Care. 2002; 6(4):335-41.
  7. Schlesselman JJ. Sample size requirements in cohort and case-control studies of disease. Am J Epidemiol. 1974; 99(6):381-4.
  8. Altman DG. Statistics and ethics in medical research, III: how large a sample? Br Med J. 1980; 281(6251):1336-8.
  9. Bacchetti P, Wolf LE, Segal MR, McCulloch CE. Ethics and sample size. Am J Epidemiol. 2005; 161(2):105-10.
  10. Lenth RV. Some practical guidelines for effective sample size determination. Am Stat. 2001; 55(3):187-93.
  11. Sergeant ESG. Epitools epidemiological calculators. AusVet Animal Health Services and Australian Biosecurity Cooperative Research Centre for Emerging Infectious Disease [Online]. 2014 [cited 2014 June]. Available from http://epitools.ausvet.com.au.
  12. Meysamie A, Taee F, Mohammadi-Vajari M-A, Yoosefi-Khanghah S, Emamzadeh-Fard S, Abbassi M. Sample size calculation on web, can we rely on the results? J Med Stat Inform. 2014; 2:3.