On the statistical analysis of allelic-loss data

Michael A. Newton, Michael N. Gould, Catherine A. Reznikoff, and Jill D. Haag

Statistics in Medicine 17, 1425-1445, 1998.

Formerly Technical Report 102, Department of Biostatistics, University of Wisconsin, Madison. First issued May 1996; last revised September 1997.


This article concerns the statistical analysis of certain binary data arising in molecular studies of cancer. In allelic-loss experiments, tumor cell genomes are analyzed at informative molecular marker loci to identify deleted chromosomal regions. The resulting binary data are used to infer the locations of putative suppressor genes, genes whose function maintains the normal cell cycle characteristics. Various factors can complicate this inference, including background loss of heterozygosity, spatial (i.e., within chromosome) dependence of the binary responses, noninformativeness of markers, covariates such as protein levels or tumor histology, heterogeneity of cells within tumors, and measurement error. We focus on the first three factors, discussing methods for statistical inference that separate background loss from significant loss. The extension to other inferences is outlined, such as comparison questions and the relationship to covariates. Using characteristic features of tumorigenesis, we present a framework for the stochastic modeling of allelic-loss data, and build models within this framework; in particular, we propose a simple model having chromosome breaks at locations of a Poisson process, and preferential selection of cells with inactivated suppressor genes. We demonstrate these methods on allelic-loss data from induced rat mammary tumors and human bladder cancers.
