BMC Bioinformatics. 2009 Jul 8;10:209.
doi: 10.1186/1471-2105-10-209.

A new multitest correction (SGoF) that increases its statistical power when increasing the number of tests

Antonio Carvajal-Rodríguez et al. BMC Bioinformatics. 2009.

Abstract

Background: Detecting truly significant cases under multiple testing is becoming a fundamental issue in the analysis of high-dimensional biological data. Unfortunately, known multitest adjustments lose statistical power as the number of tests increases. We propose a new multitest adjustment, based on a sequential goodness-of-fit metatest (SGoF), whose statistical power increases with the number of tests. The method is compared with Bonferroni and FDR-based alternatives by simulating a multitest context with two different kinds of tests: 1) the one-sample t-test, and 2) the homogeneity G-test.

Results: SGoF behaves especially well with small sample sizes when 1) the alternative hypothesis deviates weakly to moderately from the null model, 2) effects are widespread across the family of tests, and 3) the number of tests is large.

Conclusion: Therefore, SGoF should become an important tool for multitest adjustment when working with high-dimensional biological data.
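
The comparison described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation. `bonferroni` and `benjamini_hochberg` follow the standard textbook definitions, with Benjamini-Hochberg standing in for the "FDR-based alternatives". `sgof_sketch` is one reading of a sequential goodness-of-fit metatest, assuming an exact binomial test on the count of p-values at or below a threshold `gamma`. All function names, the defaults `gamma = alpha = 0.05`, and the toy p-values are hypothetical.

```python
import math


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def bonferroni(pvals, alpha=0.05):
    """Number of rejections when each test is run at level alpha / m."""
    m = len(pvals)
    return sum(p <= alpha / m for p in pvals)


def benjamini_hochberg(pvals, alpha=0.05):
    """Number of rejections under the Benjamini-Hochberg step-up rule."""
    m = len(pvals)
    n_reject = 0
    for rank, p in enumerate(sorted(pvals), start=1):
        if p <= alpha * rank / m:
            n_reject = rank  # keep the largest rank that passes
    return n_reject


def sgof_sketch(pvals, gamma=0.05, alpha=0.05):
    """Assumed SGoF reading: how many of the smallest p-values to declare significant.

    Count the p-values <= gamma, test that count against Binomial(n, gamma),
    and while the excess is significant at level alpha, declare one more
    discovery and lower the count by one.
    """
    n = len(pvals)
    k = sum(p <= gamma for p in pvals)
    discoveries = 0
    while k > 0 and binom_upper_tail(k, n, gamma) <= alpha:
        discoveries += 1
        k -= 1
    return discoveries


# Hypothetical toy family: 20 weak effects (p = 0.03) among 100 tests.
toy_pvals = [0.03] * 20 + [0.06 + 0.94 * i / 80 for i in range(80)]
print(bonferroni(toy_pvals), benjamini_hochberg(toy_pvals), sgof_sketch(toy_pvals))
```

On this toy family, Bonferroni and Benjamini-Hochberg reject nothing, whereas the sketch declares 11 of the 20 weak effects significant. This is consistent with the abstract's claim about weak, widespread effects, although a single hand-built example is not evidence of the method's general behavior.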

Figures

Figure 1
Power with different numbers of tests. Power (%) for different numbers of tests. The family consisted of one-sample t tests, 20% of them drawn from N(0.36, 1), each with sample size 20. Values are averages over 1,000 replicates. Error bars represent standard deviations between replicates. Power is defined as the number of true discoveries divided by the total number of existing effects (false null hypotheses).
Figure 2
False discovery rate with different sample sizes. FDR (%) for different sample sizes. The family consisted of 10,000 one-sample t tests, 5% of them drawn from N(0.36, 1). Values are averages over 1,000 replicates.
Figure 3
Comparison of the multitest adjustments for one-sample t tests. Number of true and false discoveries obtained under the different multitest adjustment methods as the proportion (% Effect) of tests generated under the alternative hypothesis varies. The sample size of each one-sample t test was intermediate (N = 10). The alternative hypothesis represents Weak or Strong deviations from the null hypothesis. The absolute number of true discoveries among 10,000 tests is shown on the left side, and the absolute number of false discoveries on the right side. Values are averages over 1,000 replicates.
Figure 4
Comparison of the multitest adjustments for homogeneity tests. Number of true and false discoveries obtained under the different multitest adjustment methods as the proportion (% Effect) of tests generated under the alternative hypothesis varies. The sample size of each homogeneity test was small (N = 20). The alternative hypothesis represents Weak or Strong deviations from the null hypothesis. The absolute number of true discoveries among 10,000 tests is shown on the left side, and the absolute number of false discoveries on the right side. Values are averages over 1,000 replicates.
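
The quantities plotted in these figures can be computed from per-test outcomes. The sketch below assumes each replicate is stored as a list of booleans (`rejected`) aligned with a list marking which null hypotheses are truly false (`is_effect`). Power follows the definition in the Figure 1 caption. The FDR estimate uses the standard definition, the average over replicates of the proportion of false discoveries among all discoveries, taken as 0 when nothing is rejected. The function names and the two toy replicates are hypothetical, not data from the paper.

```python
from statistics import mean, stdev


def power(rejected, is_effect):
    """True discoveries divided by the number of existing effects."""
    n_effects = sum(is_effect)
    if n_effects == 0:
        raise ValueError("power is undefined when there are no true effects")
    true_discoveries = sum(r and e for r, e in zip(rejected, is_effect))
    return true_discoveries / n_effects


def false_discovery_proportion(rejected, is_effect):
    """False discoveries divided by all discoveries (0 if none were made)."""
    n_discoveries = sum(rejected)
    if n_discoveries == 0:
        return 0.0
    false_discoveries = sum(r and not e for r, e in zip(rejected, is_effect))
    return false_discoveries / n_discoveries


# Hypothetical family of 10 tests, the first 4 of which are true effects.
is_effect = [True] * 4 + [False] * 6
replicates = [
    [True, True, False, False, True, False, False, False, False, False],
    [True, True, True, False, False, False, False, False, False, False],
]

powers = [power(r, is_effect) for r in replicates]
fdps = [false_discovery_proportion(r, is_effect) for r in replicates]

# Average over replicates; the standard deviation plays the role of the error bars.
print("power: mean", mean(powers), "sd", stdev(powers))
print("FDR estimate:", mean(fdps))
```

Reporting the mean together with the between-replicate standard deviation mirrors how the Figure 1 caption describes its averages and error bars.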
