Improving MCMC
by
Metropolis coupling of different trajectories
Dirk Husmeier
Biomathematics and Statistics Scotland (BioSS)
SCRI, Dundee DD2 5DA, United Kingdom
March 2002
Just as maximum likelihood searches can get trapped
in local maxima, a single MCMC trajectory
of finite length might not
correctly measure the posterior probability landscape
of the parameters of interest.
The simulation studies described below show how combining
several trajectories by Metropolis coupling can dramatically
improve the results of an MCMC simulation.
Just as maximum likelihood searches can get trapped
in local maxima, a single MCMC trajectory
of finite length might not
correctly measure the posterior probability landscape
of the parameters of interest.
This especially happens when deep
valleys in the posterior probability
landscape separate isolated probability peaks,
from which it is thus difficult to escape.
A possible solution to this problem is to
run several MCMC simulations, started from different
initializations, and thereby to create different MCMC
trajectories.
This will (hopefully) explore different regions of the
parameter space, and, by reducing the susceptibility
to local entrapment, should improve the overall exploration
of the posterior probability landscape.
Assume you have simulated K
trajectories.
To correctly combine the sampled parameters
along these trajectories,
we use a method similar to
MCMCMC (Metropolis-coupled Markov chain Monte
Carlo, first suggested by
Geyer and applied
to phylogenetics by
Huelsenbeck and Ronquist),
which uses Metropolis-coupling
of the trajectories, but does not apply the
annealing scheme of the original MCMCMC algorithm.
To get started, randomly select a trajectory.
Then proceed as follows. Assume that
at time step t you are on trajectory
k_t, where you have just sampled parameters
q[t,k_t].
Next, you propose
a new trajectory k_new from the
remaining K-1
trajectories (sampled from a uniform distribution). The
decision of whether to accept the move,
k_(t+1)=k_new,
or to reject it, k_(t+1)=k_t,
follows the standard
Metropolis criterion, using the unnormalized posterior
probabilities
P(D|q[t,k_t])P(q[t,k_t]) .
To improve the prediction with
BARCE,
we have implemented the algorithm in the MATLAB program
PredictorEnsemble.m.
To find out more about the options, type
help PredictorEnsemble
at the MATLAB prompt.
To use the program, type
P_averaged= PredictorEnsemble(P,L)
-
The output,
P_averaged, is an
N-by-3 matrix of average posterior
probabilities (averaged over the Metropolis-coupled
MCMC trajectories), where the columns represent the
three possible tree topologies,
and the rows represent the sites in the DNA sequence
alignment (N is the total length of the
alignment).
-
P
is an
N-by-3-by-K
array of the averaged
probabilities from the K
individual MCMC trajectories, represented by the
third dimension of the array.
This matrix is created from the
BARCE output file
topol_prob.out.
For example,
if you run two simulations and keep the results
in the directories simu1 and
simu2, you have to
create the matrix as follows (in MATLAB):
cd simu1
load topol_prob.out;
P(:,:,1)=topol_prob;
cd ..
cd simu2
load topol_prob.out;
P(:,:,2)=topol_prob;
cd ..
-
L
is an N-by-K array
of unnormalized log posterior
probabilities (that is, log likelihood plus log prior),
where the columns represent different MCMC trajectories,
and the rows represent time steps in the MCMC simulations.
The unnormalized log posterior probabilities
are obtained from BARCE
output file
lpdfile.out.
For example,
if you run two simulations and keep the results
in the directories simu1 and
simu2, you have to
create the matrix as follows (in MATLAB):
cd simu1
load lpdfile.out;
L(:,1)=lpdfile;
cd ..
cd simu2
load lpdfile.out;
L(:,2)=lpdfile;
cd ..
Two simulation studies have been carried out:
The simulation studies discussed above show
how Metropolis coupling of trajectories
can improve the results of MCMC.
In the first study, the
individual MCMC simulations led to
different and contradicting predictions.
Two of the simulations got trapped in meta-stable
regions of the parameter space, but this is difficult
to tell from an
analysis of the
MCMC trajectories.
A naive averaging is suboptimal because it mixes
the converged trajectory with those
entrapped in the meta-stable domains.
However, by using Metropolis-coupling,
those trajectories belonging to the
meta-stable domains are nearly entirely suppressed.
Consequently, the proposed scheme
effectively selects a converged
MCMC trajectory among a set of alternative, unconverged
ones. In the second study,
both individual MCMC trajectories
got trapped in subregions of the parameter space,
where they only captured a partial aspect of
the recombination scenario (either certainty of recombination,
or no recombination at all).
This typically happens when two high-probability
domains in parameter space are separated by
a deep, low-probability valley, which an individual
trajectory may find difficult to cross.
The combination of different trajectories thus
allows the exploration of a larger region of the
parameter space, as a result of which
we get a more reliable
measure of the posterior probability landscape.
Back to the BARCE manual
Back to my homepage.