Controlling the False Discovery Rate in Astrophysical Data Analysis - Robotics Institute Carnegie Mellon University

C. Miller, C. Genovese, R. Nichol, L. Wasserman, A. Connolly, D. Reichart, A. Hopkins, J. Schneider, and A. Moore
Journal Article, Astronomical Journal, Vol. 122, No. 6, pp. 3492 - 3505, December, 2001

Abstract

The false-discovery rate (FDR) is a new statistical procedure to control the number of mistakes made when performing multiple hypothesis tests, i.e., when comparing many data against a given model hypothesis. The key advantage of FDR is that it allows one to a priori control the average fraction of false rejections made (when comparing with the null hypothesis) over the total number of rejections performed. We compare FDR with the standard procedure of rejecting all tests that do not match the null hypothesis above some arbitrarily chosen confidence limit, e.g., 2 σ, or at the 95% confidence level. We find a similar rate of correct detections, but with significantly fewer false detections. Moreover, the FDR procedure is quick and easy to compute and can be trivially adapted to work with correlated data. The purpose of this paper is to introduce the FDR procedure to the astrophysics community. We illustrate the power of FDR through several astronomical examples, including the detection of features against a smooth one-dimensional function, e.g., seeing the "baryon wiggles" in a power spectrum of matter fluctuations, and source pixel detection in imaging data. In this era of large data sets and high-precision measurements, FDR provides the means to adaptively control a scientifically meaningful quantity—the fraction of false discoveries over total discoveries.
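The abstract does not spell out the procedure, so the following is only a minimal sketch, assuming the Benjamini-Hochberg step-up rule that "FDR procedure" usually refers to. Sort the N p-values, find the largest one lying under the line i·α/(c_N·N), and reject every test at or below it. The function names, the `correlated` flag, and the use of the harmonic-sum factor c_N = Σ 1/i (the Benjamini-Yekutieli correction) as the adaptation for correlated data are illustrative assumptions, not taken from the text above.

```python
def fdr_threshold(p_values, alpha=0.05, correlated=False):
    """Return the p-value cutoff of the step-up rule, or None if no test is rejected.

    Hypothetical helper: assumes the Benjamini-Hochberg procedure.
    """
    n = len(p_values)
    # c_n = 1 for independent tests; the harmonic sum is an assumed
    # (Benjamini-Yekutieli) correction for dependent tests.
    c_n = sum(1.0 / i for i in range(1, n + 1)) if correlated else 1.0
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        # Keep the largest p-value that lies on or under the line rank * alpha / (c_n * n).
        if p <= rank * alpha / (c_n * n):
            cutoff = p
    return cutoff


def fdr_reject(p_values, alpha=0.05, correlated=False):
    """Return one boolean per test, in input order: True means 'reject the null'."""
    cutoff = fdr_threshold(p_values, alpha, correlated)
    if cutoff is None:
        return [False] * len(p_values)
    return [p <= cutoff for p in p_values]


# Made-up p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = fdr_reject(p_values, alpha=0.05)
naive = [p < 0.05 for p in p_values]  # fixed 95% confidence cut, for comparison
print(sum(rejected), "rejections with FDR control;", sum(naive), "with a fixed cut")
```

Here `alpha` is the target average fraction of false rejections among all rejections, which is the quantity the abstract describes controlling. The fixed cut instead bounds the false-rejection probability of each test separately, so it places no direct limit on that fraction.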

BibTeX

@article{Miller-2001-119730,
author = {C. Miller and C. Genovese and R. Nichol and L. Wasserman and A. Connolly and D. Reichart and A. Hopkins and J. Schneider and A. Moore},
title = {Controlling the False Discovery Rate in Astrophysical Data Analysis},
journal = {Astronomical Journal},
year = {2001},
month = {December},
volume = {122},
number = {6},
pages = {3492--3505},
}