DBmcmc
Inferring Dynamic Bayesian Networks with MCMC

© Copyright 2003

Dirk Husmeier
Biomathematics and Statistics Scotland (BioSS)
JCMB, The King's Building, Edinburgh EH9 3JZ, United Kingdom


A dynamic Bayesian network with complete observation is modelled via a static Bayesian network. The layer of nodes is duplicated, representing observations at two consecutive time steps. Edges are only permitted between these two layers. Parameter tying is used, meaning that the parameters of the conditional probabilities are the same at all times.
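The two-layer construction can be sketched in a few lines of MATLAB. The snippet below is an illustration only and is not taken from DBmcmc: the variable names (nNodes, inter, dag) and the example edges are assumptions chosen for this sketch. It unrolls an N-by-N interslice connectivity matrix into the adjacency matrix of the equivalent static network with 2N nodes.

```matlab
% Hypothetical example: 3 nodes, interslice edges 1->2, 2->3 and 3->3.
nNodes = 3;
inter = zeros(nNodes);      % inter(i,j) = 1: node i at time t -> node j at time t+1
inter(1,2) = 1;
inter(2,3) = 1;
inter(3,3) = 1;             % a node may influence itself at the next time step

% Static network over 2N nodes: nodes 1..N form slice t, nodes N+1..2N slice t+1.
dag = zeros(2*nNodes);
dag(1:nNodes, nNodes+1:2*nNodes) = inter;   % edges only run from layer 1 to layer 2

% No edge points into the first layer, so the unrolled graph cannot contain a cycle.
assert(all(all(dag(:, 1:nNodes) == 0)));
disp(dag);
```

Because the same interslice matrix is reused for every pair of consecutive time points, a single N-by-N matrix is enough to describe the structure at all times.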



Preliminaries

DBmcmc is a free software package written in MATLAB for inferring dynamic Bayesian networks with MCMC. The programs invoke functions of the Bayes Net Toolbox (BNT) written by Kevin Murphy, so you need to download his software package first. To download the version of BNT that was used for the present software implementation, click here. Note that DBmcmc was developed solely as a model exploration tool; it is provided without warranty and without guarantee of maintenance or support. The copyright holder is not liable for any damages which may result in any manner from the use of this software.

Also note that the version of the Bayes Net Toolbox used for DBmcmc does not work properly for versions of MATLAB more recent than 6.1. If you use a more recent version of MATLAB, you won't be able to install the C functions that come with the Bayes Net Toolbox. This bug might have been fixed in more recent versions of the Bayes Net Toolbox, but DBmcmc has only been tested with the older version.


Download the software

To download the software, click here. This will give you a gzipped TAR file called DBmcmc.tar.gz. To extract this file under UNIX, proceed as follows:

gunzip DBmcmc.tar.gz
tar xvf DBmcmc.tar

You can now delete the TAR file:

rm DBmcmc.tar

Make sure you add the directory with the programs to your MATLAB path. You can do this from the MATLAB prompt as follows:

DBmcmc_HOME='XXX';
addpath(DBmcmc_HOME);

where XXX is the name of the directory in which you keep the DBmcmc programs.


Program files

The software package contains several functions, called DBmcmc_XXXX.m (where XXXX stands for different names). To get information about a function, type help followed by the function name (without the extension ".m") at the MATLAB prompt:

help DBmcmc_XXXX


Getting started

First, add the directory in which you keep the BNT programs to your MATLAB path. To do this, go to the directory in which you keep the BNT software, and type

add_BNT_to_path

(assuming you have edited the file add_BNT_to_path.m so as to replace the default path name by your own). Next, install the C programs for BNT by giving the command

installC

at the MATLAB prompt. This works for MATLAB version 6.1, but not for more recent versions of MATLAB, unless the problem has since been fixed in a version of BNT more recent than the one I was using when writing the programs. Finally, add the directory in which you keep the DBmcmc programs to the MATLAB path. To automate this, edit the file add_path.m in that directory, replacing the default path name with the name of the directory in which you keep the DBmcmc programs.
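Once both paths are set, a quick check can confirm that MATLAB finds the programs. The sketch below is not part of DBmcmc; it only assumes that mk_bnet (a function of the Bayes Net Toolbox) and DBmcmc_Application (the main program described further down) should be visible on the path.

```matlab
% Sanity check of the path set-up (illustrative; not part of DBmcmc).
required = {'mk_bnet', 'DBmcmc_Application'};
found = false(size(required));
for k = 1:numel(required)
    % exist(...,'file') returns 2 for M-files, 3 for MEX-files, 6 for P-files.
    found(k) = ismember(exist(required{k}, 'file'), [2 3 6]);
    if found(k)
        fprintf('%-20s found\n', required{k});
    else
        fprintf('%-20s NOT found, check your path\n', required{k});
    end
end
```

If either name is reported as not found, repeat the corresponding addpath step before going on.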


Setting options and parameters

You need to specify several options for the MCMC simulation. You can set these options interactively at run time. Alternatively, you can set them in advance of the MCMC simulation by calling the function DBmcmc_SetParMCMC.m, in which case the parameters are written to the MAT-file mcmcPAR.mat. DBmcmc_SetParMCMC.m prompts you for each parameter in turn. The last two options specify the prior on network structures: a sharp cut-off on the maximum fan-in to a node, and the parameter of an exponential distribution for the number of nodes with non-zero fan-out.

To see how the MCMC parameters have been set, load and display the file mcmcPAR.mat:

load mcmcPAR
mcmcPAR

This may give, for example:

seed: 11
nBurnIn: 50000
nSample: 50000
nDelta: 100
maxFanIn: 3
EmaxNodeFanOut: 0
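How the two prior options enter the computation is not spelled out here. The sketch below shows one plausible reading and is an assumption throughout: the variable lambda, its value, the example network and the exact functional form are all hypothetical, and DBmcmc may define its prior differently (in particular, how a setting of 0 for EmaxNodeFanOut is interpreted is not stated).

```matlab
% Hypothetical sketch of an unnormalised log prior over interslice structures.
% inter(i,j) = 1 means an edge from node i (time t) to node j (time t+1).
inter = [0 1 0 0;
         0 0 1 0;
         1 0 0 1;
         0 0 0 0];
maxFanIn = 3;      % sharp cut-off on the number of parents per node
lambda   = 0.5;    % assumed rate of the exponential penalty (illustrative value)

fanIn   = sum(inter, 1);           % parents per node (column sums)
fanOut  = sum(inter, 2);           % children per node (row sums)
nActive = sum(fanOut > 0);         % number of nodes with non-zero fan-out

if any(fanIn > maxFanIn)
    logPrior = -Inf;               % structure excluded by the fan-in cut-off
else
    logPrior = -lambda * nActive;  % exponential decay in the number of active nodes
end
fprintf('nActive = %d, logPrior = %g\n', nActive, logPrior);
```

Under this reading, the fan-in cut-off removes structures from the search space altogether, while the exponential term only makes networks with many regulating nodes less probable.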


Run the program

To run the program, call function DBmcmc_Application.m. This program takes as a mandatory argument a training set in the form of a matrix, where rows are genes and columns are time points. The data must either be binary (0,1) or trinomial (-1,0,1). There are several optional parameters that can be passed to this function. Type

help DBmcmc_Application

at the MATLAB prompt for further information.
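Before starting a long simulation, it can be worth checking that the training set has the expected form. The following sketch is not part of DBmcmc; the matrix is made up, and the variable names are chosen for illustration. It verifies that the data contain only the permitted values and reports the dimensions.

```matlab
% Illustrative check of a training set before calling DBmcmc_Application.
% The matrix below is made up: 3 genes (rows) observed at 5 time points (columns).
data = [ 1  0 -1  0  1;
         0  0  1  1 -1;
        -1  1  0  0  0];

vals = unique(data(:))';
isBinary    = all(ismember(vals, [0 1]));
isTrinomial = all(ismember(vals, [-1 0 1]));

if ~(isBinary || isTrinomial)
    error('Data must be binary (0,1) or trinomial (-1,0,1).');
end
[nGenes, nTimePoints] = size(data);
fprintf('%d genes, %d time points, binary = %d\n', nGenes, nTimePoints, isBinary);
```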

To create a training set for testing the inference procedure, use the program for generating synthetic data described in the Example section.

Assuming you have already generated the file mcmcPAR.mat, as described above, you can run the program as follows:

load mcmcPAR
load data_example
DBmcmc_Application('data',data_example)

The results are written to the file Results.mat. Note that an existing file with the same name will be overwritten.
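To avoid losing the results of an earlier run, you can copy an existing Results.mat before calling the program again. This is a minimal sketch using standard MATLAB functions only; the naming scheme for the backup file is an arbitrary choice made for this example.

```matlab
% Keep a copy of an earlier Results.mat before a new run replaces it.
madeBackup = false;
if exist('Results.mat', 'file') == 2
    % Time stamp: year, month, day, hour, minute, second.
    backupName = ['Results_' datestr(now, 'yyyymmdd_HHMMSS') '.mat'];
    copyfile('Results.mat', backupName);
    madeBackup = true;
    fprintf('Existing results copied to %s\n', backupName);
end
```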


Output

The results of the MCMC simulation can be plotted with function DBmcmc_PlotResults.m. You can use this function in different ways; type

help DBmcmc_PlotResults

for further information. The simplest application is

load Results.mat
DBmcmc_PlotResults(Results)

which generates three plots. The first two plots monitor the convergence of the Markov chain over time (where time is actually the number of MCMC steps). The third plot shows the predicted interslice connectivity matrix, plotting all edges with a posterior probability greater than 0.5. If you want to change the threshold and plot, say, only edges with posterior probability greater than 0.7, type:

DBmcmc_PlotResults(Results,'threshold',0.7)

If you want to try different thresholds without replotting the MCMC convergence monitoring graphs each time (as this may be time-consuming), type:

DBmcmc_PlotResults(Results,'threshold',0.7,'flag_onlyDAG',1)
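The effect of the threshold can be illustrated independently of the plotting function. The sketch below uses a made-up matrix of posterior edge probabilities; the variable names are assumptions for this example, and the internal layout of the Results structure is not described here.

```matlab
% Made-up posterior edge probabilities for 3 nodes; postProb(i,j) is the
% probability of an edge from node i (time t) to node j (time t+1).
postProb = [0.05 0.92 0.55;
            0.10 0.30 0.75;
            0.65 0.02 0.48];

edges05 = postProb > 0.5;   % edges kept at the default threshold of 0.5
edges07 = postProb > 0.7;   % edges kept at the stricter threshold of 0.7

fprintf('Edges kept at 0.5: %d, at 0.7: %d\n', sum(edges05(:)), sum(edges07(:)));
```

In this example, raising the threshold from 0.5 to 0.7 reduces the number of predicted edges from four to two.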


Example

Use the MATLAB program SyntheticTimeSeriesFriedYeast.m to generate synthetic data from a known network, which you can draw with the MATLAB function DrawSyntheticModel.m.
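The general idea of generating test data from a known network can be sketched as follows. This is not SyntheticTimeSeriesFriedYeast.m: the network, the update rule (a node switches on when at least one parent was on, with random flips) and all parameter values are assumptions made purely for illustration.

```matlab
% Hypothetical generator of a binary time series from a known interslice network.
rand('state', 11);                 % fix the random seed (legacy syntax, also valid in old MATLAB)
inter = [0 1 0;                    % inter(i,j) = 1: node i at t -> node j at t+1
         0 0 1;
         1 0 0];
nNodes = size(inter, 1);
nTimePoints = 20;
noise = 0.1;                       % probability of flipping a node's value

data = zeros(nNodes, nTimePoints); % rows are genes, columns are time points
data(:, 1) = rand(nNodes, 1) > 0.5;
for t = 2:nTimePoints
    % Number of active parents of each node at the previous time point.
    drive = (inter' * data(:, t-1)) > 0;
    flip  = rand(nNodes, 1) < noise;
    data(:, t) = xor(drive, flip); % noisy copy of the deterministic update
end
disp(data);
```

The resulting matrix has the form expected for a training set (rows are genes, columns are time points, values are binary), so the recovered edges can be compared with the known matrix inter.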
