- 2001 Spring / 2003 Spring: Statistical Genomics
(Brian Yandell)

The focus of this course in Spring 2003 is on statistical genomic issues arising primarily in gene mapping for experimental crosses. Much attention will be on quantitative trait loci (QTL), with emphasis on practical issues of "model selection": finding the genetic architecture "best" supported by the phenotypic and genotypic data in hand. We will devote considerable attention to recent studies that use microarray data as complex phenotypic traits. Strategies for fine-mapping will be addressed along the way. The primary text is the draft of a book being written jointly with Zhao-Bang Zeng, Gary Churchill, and Karl Broman. The intended audience is primarily biologists wanting to gain a deeper understanding of concepts and strategic issues. Basic ideas of key methods will be developed with considerable attention to analysis of published data.
- 2002 Fall: Statistical Methods for Human Genetics
(Jason Fine)
- 2002 Fall: Statistical Methods in Genomics
(Bob Mau and Nicole Perna)

Statistical analysis of whole genomes. Material covered included extreme value statistics for sequence similarity searches, r-scan statistics to assess the distribution of specific sequence motifs, correspondence analysis for analyzing codon preferences, multiple comparison issues in microarray analysis, detection of recombination and horizontal transfer events from nucleotide base composition, and global phylogenetic inference made possible by multiple whole genome alignment.
- 2003 Spring: Statistical Methods for Analysis of Microarray Data
(Christina Kendziorski)

This course will provide an introduction to statistical methods and associated freeware tools developed to address questions in gene expression array studies. The course will begin with an overview of image analysis, including issues related to intensity estimation and background correction. Experimental design will then be discussed. Because of high costs, microarray experiments often have few replicates for any given experiment. Methods to maximize the amount of information obtained in a set of comparison experiments with few replicates will be reviewed, along with other considerations in experimental design such as normalization, labelling, pooling, and sample size estimation. We will then focus on exploratory tools such as hierarchical clustering methods and principal components analysis. Finally, we will consider a number of methods to estimate differential expression and identify significant differential expression across multiple conditions. The intended audience consists of graduate students, postdoctoral fellows, and researchers in statistics or molecular genetics with an interest in statistical methods used in expression array studies. Although there are no formal prerequisites, it is recommended that students at least be familiar with topics covered in an introductory statistics course (e.g. STAT 310-311, STAT 541, STAT 571-572).
- 2004 Spring: Statistical Phylogenetics
(Bret Larget)

The course will include these topics: (1) mathematical description of phylogenetic trees, (2) the estimation of phylogenetic trees from aligned DNA sequence data using maximum likelihood, parsimony, distance, and Bayesian methods (and supporting probability and statistics topics including likelihood, continuous-time Markov chains, the parametric and nonparametric bootstrap, and Markov chain Monte Carlo), (3) comparisons of statistical properties of different phylogeny estimators, (4) the comparison of the bootstrap and Bayesian posterior probabilities for assessing uncertainty in phylogeny estimation, (5) the estimation of phylogeny from genome arrangement data, and (6) additional topics as time permits. Possible additional topics include statistical tests of tree topology, model selection, and statistical models of coevolution.
- 2005 Fall: Statistical Phylogenetics: Comparative methods
(Cecile Ane)

Comparative biologists ask questions about evolutionary processes. Usually, their observational units are species, which typically do not yield random samples. This is because the sampled species share an evolutionary history, with closely related species usually being more alike than distantly related species, so the observations lack independence. In this course, we will cover the major advances in comparative methodology for both discrete and continuous data. All these methods use the genealogical history of the sampled species to overcome their non-independence. The most widely used method is based on modeling the evolution of a character as Brownian motion on a tree. A more recent model uses the Ornstein-Uhlenbeck process to account for biological selection. In the second part of the course, we will cover methods for inferring phylogenies (i.e., tree-like histories of species) from molecular data, including semi-parametric methods for estimating divergence times. As this course complements other recent courses offered on campus on molecular evolution and phylogenetic inference, I will adapt the second part of the course to the background and interests of the students. My objective for statistics students is to provide them with a sufficient background in statistical phylogenetics and comparative studies so that they can start exploring their own research questions in the area. My objective for biology students is to provide them with a deeper understanding of the statistical methods available in the area, so that they can make the best choices for their own data and, through collaborations, steer new methodological developments toward their needs. For all students, my objective is to give them a taste of fruitful cross-disciplinary work.
- 2006 Spring: Statistical Methods for Biological Sequence Analysis
(Sunduz Keles)

This course will cover sequence analysis topics from the field of computational biology. One of the aims of the course is to give a concise review of relevant background biology and an introduction to the statistical problems that have arisen recently. A major portion of the course will be dedicated to a rigorous overview of the statistical methods utilized in this field. Particular inference topics will include cross-validation with both observed and censored data, multiple hypothesis testing, mixture models, HMMs, and tree-based regression techniques. The plan is to make the topics self-contained so that a first-year-level background in statistical estimation and inference is sufficient. This corresponds to Stat 609-610 for Statistics students and Stat 571-572 for Biological Sciences students.
- 2006 Spring: Topics in high dimensional statistical inference
(Michael Newton)

A traditional model in statistics entails observations sampled from a fixed, possibly complicated, population. Inference about parameters describing this population is based on approximations in which the number of parameters does not increase with the sample size. Often the context prescribes a different model, in which parameter dimension is tied to sample size. By reviewing early and contemporary literature, we will study a range of topics related to parameter-rich statistics. In Part I, we will review the classical view, some difficulties that arise, connections between frequentist and Bayesian perspectives, and empirical Bayesian methodology. In Part II, we will study Bayesian methodology, considering both parametric and nonparametric hierarchical models, and we will review computational approaches to model fitting. Part III concerns recent advances, including new techniques for high-dimensional testing and estimation.

Brian Yandell Last modified: Fri Feb 9 12:15:31 CST 2007