Markov chain Monte Carlo for the Bayesian analysis of evolutionary trees from aligned molecular sequences

Michael A. Newton, Bob Mau, and Bret Larget.

Technical Report 983, Department of Statistics, University of Wisconsin, Madison.

First issued October 1997, revised March 1998.

Abstract:

We show how to quantify the uncertainty in a phylogenetic tree inferred from molecular sequence information. Given a stochastic model of evolution, the Bayesian solution is simply to form a posterior probability distribution over the space of phylogenies. All inferences are derived from this posterior, including, for example, tree reconstructions, credible sets of good trees, and conclusions about monophyletic groups. The challenging part is to approximate the posterior, and we do this by constructing a Markov chain having the posterior as its invariant distribution, following the approach of Mau, Newton, and Larget (1996). Our Markov chain Monte Carlo algorithm is based on small but global changes in the phylogeny, and exhibits good mixing properties empirically. We illustrate the methodology on DNA encoding mitochondrial cytochrome oxidase 1 gathered by Hafner et al. (1994) for a set of parasites and their hosts.
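The idea of a Markov chain whose invariant distribution is the posterior can be illustrated with a toy sketch. The following is not the algorithm of the report: it is a generic Metropolis sampler over a small, hypothetical set of three candidate topologies, with made-up unnormalized posterior weights standing in for likelihood times prior. The function name `metropolis` and the values in `weights` are assumptions chosen only for illustration.

```python
import random
from collections import Counter


def metropolis(weights, n_steps, seed=0):
    """Return visit frequencies of a Metropolis chain targeting `weights`.

    `weights` are positive, unnormalized target probabilities; the chain
    only ever uses their ratios, so the normalizing constant is not needed.
    """
    rng = random.Random(seed)
    k = len(weights)
    state = 0
    counts = Counter()
    for _ in range(n_steps):
        # Symmetric proposal: choose any state uniformly at random.
        proposal = rng.randrange(k)
        # Accept with probability min(1, weights[proposal] / weights[state]).
        if rng.random() * weights[state] < weights[proposal]:
            state = proposal
        counts[state] += 1
    return [counts[i] / n_steps for i in range(k)]


# Hypothetical unnormalized posterior weights for three candidate trees.
weights = [6.0, 3.0, 1.0]
freqs = metropolis(weights, 100_000)
print(freqs)
```

Under these assumptions the long-run visit frequencies should be close to the normalized weights (0.6, 0.3, 0.1). In a real phylogenetic setting the state space is far too large to enumerate, so the proposal must make local or structured changes to the current tree rather than drawing uniformly from all states.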


For the Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference on Statistics and Molecular Biology, held in Seattle, Washington, June 22-26, 1997.
