Abstract
Validation studies have been used to increase the reliability of the statistical conclusions for scientific discoveries; such studies improve the reproducibility of the findings and reduce the possibility of false positives. Here, one of the important roles of statistics is to quantify reproducibility rigorously. Two concepts were recently defined for this purpose: (i) the rediscovery rate (RDR), which is the expected proportion of statistically significant findings in a study that can be replicated in the validation study, and (ii) the false discovery rate in the validation study (vFDR). In this paper, we aim to develop a nonparametric approach to estimate the RDR and vFDR and show an explicit link between the RDR and the FDR. Among other things, the link explains why reproducing statistically significant results may be difficult even at a low FDR level. Two metabolomics datasets are considered to illustrate the application of the RDR and vFDR concepts in high-throughput data analysis.
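To make the RDR definition concrete, the following is a minimal sketch of the observed counterpart of that quantity: the fraction of findings significant in the primary study that are also significant in the validation study. It is not the nonparametric estimator developed in the paper. The function name, the per-feature p-value inputs, and the two significance thresholds are all assumptions introduced for illustration only.

```python
import math


def observed_rediscovery_proportion(p_primary, p_validation,
                                    alpha_primary=0.05, alpha_validation=0.05):
    """Fraction of primary-study discoveries that are also significant
    in the validation study (hypothetical helper, not the paper's estimator).

    p_primary and p_validation are equal-length sequences of p-values,
    one pair per feature tested in both studies.
    """
    if len(p_primary) != len(p_validation):
        raise ValueError("both studies must test the same features")

    # Features declared significant in the primary study.
    discovered = [i for i, p in enumerate(p_primary) if p <= alpha_primary]
    if not discovered:
        return math.nan  # no discoveries, so the proportion is undefined

    # Of those, count the ones that are significant again on validation.
    replicated = sum(1 for i in discovered
                     if p_validation[i] <= alpha_validation)
    return replicated / len(discovered)


# Toy p-values, made up purely to exercise the function.
primary = [0.001, 0.2, 0.03, 0.04]
validation = [0.01, 0.001, 0.5, 0.02]
proportion = observed_rediscovery_proportion(primary, validation)
```

In this toy input three features are significant in the primary study and two of them are significant again on validation. The RDR itself is an expectation, so a single observed proportion like this is only a rough empirical analogue.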
Original language | English |
---|---|
Pages (from-to) | 3203-3212 |
Number of pages | 10 |
Journal | Statistics in Medicine |
Volume | 35 |
Issue number | 18 |
State | Published - 15 Aug 2016 |
Bibliographical note
Funding Information: Woojoo Lee was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A1A1061332). Donghwan Lee was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A01055524) and by the 2014 Ewha Womans University Research Grant.
Publisher Copyright:
Copyright © 2016 John Wiley & Sons, Ltd.
Keywords
- false discovery rate
- multiple testing
- rediscovery rate
- validation study