Markov Chain Monte Carlo
In class we discussed Markov Chain Monte Carlo (MCMC)
in three different contexts:
randomly selecting a two-way contingency table
from the uniform distribution of all contingency tables
with given fixed row and column sums,
estimating a mean from a normal distribution
when assuming a Bayesian uniform prior distribution on the mean,
and in the problem of reconstructing evolutionary trees
from aligned DNA sequence data.
These notes summarize the essentials of these discussions.
Generally speaking,
MCMC provides a mechanism for taking dependent samples
in situations where direct independent sampling is difficult,
if not completely intractable.
The standard situation is where it is difficult to normalize
a posterior probability distribution.
For the contingency table example,
when the row and column sums are large,
it is very difficult to count the number of tables
with the same row and column sums,
much less to order them in a formal manner which would allow
random sampling.
For the evolution problem,
normalizing the posterior distribution
on a problem with just 10 different species
would require computing a couple million separate integrals over an
18-dimensional space.
The Metropolis Algorithm offers an alternative computational
approach.
The Metropolis Algorithm
To implement this algorithm,
you need a reversible Markov chain to propose new states
from any given state,
the ability to compute the ratio of the posterior densities (probabilities)
for any pair of states,
and a random number generator.
(There are also ways to handle non-reversible Markov chains.)
Notice that if the posterior density at state x, p(x),
equals h(x) / C
where C is hard to compute,
but h(x) is computable,
the ratio of posterior densities at x and y equals p(x) / p(y) = h(x) / h(y).
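As a quick illustration of this cancellation in Python (the density here is a standard normal, chosen only for the example, with C = sqrt(2*pi) playing the role of the hard-to-compute constant):

```python
import math

# Unnormalized density h(x) for a standard normal.
def h(x):
    return math.exp(-x * x / 2.0)

# Normalized density p(x) = h(x) / C, with C = sqrt(2*pi).
def p(x):
    return h(x) / math.sqrt(2.0 * math.pi)

x, y = 0.3, 1.7
ratio_h = h(x) / h(y)
ratio_p = p(x) / p(y)
# The normalizing constant cancels, so the two ratios agree.
print(abs(ratio_h - ratio_p) < 1e-12)  # prints True
```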
Here are the basic steps of the algorithm.
- Choose an initial current state x0.
- Propose a new state x* from the Markov chain
beginning at the current state x.
- Find R, the ratio of posterior densities of x* over x,
namely R = p(x*) / p(x) = h(x*) / h(x).
- Generate a uniform random number U between 0 and 1.
- If R is greater than U,
the proposed state becomes the current state.
Otherwise, the current state is recorded an additional time
and remains the current state.
- Return to step 2.
This algorithm is repeated for a very long simulation run.
The distribution of the final state
will be approximately that of one chosen randomly from the posterior
distribution.
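The steps above can be sketched in Python for the normal-mean problem mentioned earlier. With a uniform prior on the mean and a known standard deviation, the posterior is proportional to the likelihood; the data values, standard deviation, and proposal step size below are made up for illustration:

```python
import math
import random

random.seed(1)

# Hypothetical observations from a normal with unknown mean.
data = [9.2, 10.1, 9.8, 10.5, 9.9, 10.3]
sigma = 1.0  # assumed known standard deviation

# Log of the unnormalized posterior h(mu):
# flat prior times normal likelihood (constants dropped).
def log_h(mu):
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

mu = 0.0            # step 1: choose an initial current state
samples = []
for _ in range(20000):
    mu_star = mu + random.gauss(0.0, 0.5)   # step 2: symmetric proposal
    log_r = log_h(mu_star) - log_h(mu)      # step 3: ratio R, on the log scale
    if math.log(random.random()) < log_r:   # steps 4-5: accept if R > U
        mu = mu_star
    samples.append(mu)   # record the current state either way

burn_in = samples[2000:]
print(sum(burn_in) / len(burn_in))  # close to the sample mean of the data
```

With a flat prior, the long-run average of the sampled values converges to the posterior mean, which here is simply the sample mean of the data.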
In practice,
if you are interested in a parameter value such as the proportion of
contingency tables with the same row and column sums
as a given table
with a chi-square statistic greater than that of the given table,
the long-run proportion of tables in the sequence
will converge to this number.
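One way to carry this out, sketched below in Python with a made-up table, uses a standard swap proposal (not specified in the notes, so an assumption here): pick two rows and two columns and shift one unit of count among the four cells, which preserves all row and column sums. Since the target distribution is uniform over valid tables, R = 1 and every legal move is accepted; moves that would create a negative entry are rejected, and the current table is recorded again.

```python
import random

random.seed(2)

# A small hypothetical observed table.
observed = [[5, 3, 2],
            [1, 4, 6]]

def chi_square(table):
    # Pearson chi-square against independence, using the table's own margins.
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

obs_stat = chi_square(observed)
table = [row[:] for row in observed]  # initial state: the observed table
nrow, ncol = len(table), len(table[0])
count = 0
steps = 50000
for _ in range(steps):
    # Propose: choose two rows and two columns, add the pattern
    # +1 -1 / -1 +1 to the four chosen cells; margins are unchanged.
    i1, i2 = random.sample(range(nrow), 2)
    j1, j2 = random.sample(range(ncol), 2)
    if table[i1][j2] > 0 and table[i2][j1] > 0:
        # Legal move (no entry goes negative), so accept it: R = 1.
        table[i1][j1] += 1
        table[i2][j2] += 1
        table[i1][j2] -= 1
        table[i2][j1] -= 1
    if chi_square(table) > obs_stat:
        count += 1
print(count / steps)  # long-run proportion exceeding the observed statistic
```

The printed proportion estimates the exact conditional p-value of the chi-square test, without ever counting or enumerating the tables.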
Last modified: May 2, 1997
Bret Larget,
larget@mathcs.duq.edu