Application of Bayesian networks and MCMC in computational molecular biology

In my talk, I will present two applications of Bayesian networks and Markov chain Monte Carlo (MCMC) in computational molecular biology: the reverse engineering of biochemical networks, and the detection of recombination in DNA sequence alignments.

Part 1: Reverse engineering of biochemical networks
Robust and adaptable metabolic processes in cells depend on the operation of complex regulatory biochemical networks. Elucidating the structure and functioning of such networks is becoming a prime goal of molecular biology. Recently developed high-throughput experimental techniques, such as gene expression profiling with microarrays, provide detailed information on the molecular processes in cells. What is now required is the development of statistical and computational schemes to infer the underlying signal transduction pathways and biochemical networks from these data. This inference problem is particularly hard because interactions between hundreds of genes have to be learned from very sparse data, typically only a few dozen time points during a cell cycle. The objective of the first part of my talk is to compare the performance of two inference methods, mutual information relevance networks and dynamic Bayesian networks, in a simulation study. First, gene expression data are simulated from a realistic biological network involving DNAs, mRNAs, inactive protein monomers, and active protein dimers. Then, interaction networks are inferred from these data in a reverse engineering approach, using either pairwise mutual information scores (relevance networks) or Bayesian learning with Markov chain Monte Carlo (dynamic Bayesian networks).
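As a rough illustration of the relevance-network idea mentioned above, the following Python sketch scores every pair of genes by the mutual information between their discretized expression profiles and keeps the pairs that exceed a threshold. It is a minimal sketch under stated assumptions, not the procedure used in the study: the equal-frequency binning, the number of bins, the threshold, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def discretize(profile, n_bins=3):
    """Map a 1-D expression profile to bin labels 0..n_bins-1.

    Assumption: equal-frequency (quantile) binning; the abstract does not
    say how expression levels are discretized.
    """
    interior = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    edges = np.quantile(profile, interior)
    return np.searchsorted(edges, profile, side="right")


def mutual_information(a, b, n_bins=3):
    """Plug-in estimate of the mutual information (in nats) of two label vectors."""
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (a, b), 1.0)           # joint histogram of label pairs
    joint /= joint.sum()                     # normalize to a joint distribution
    pa = joint.sum(axis=1, keepdims=True)    # marginal of a, shape (n_bins, 1)
    pb = joint.sum(axis=0, keepdims=True)    # marginal of b, shape (1, n_bins)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def relevance_network(data, threshold, n_bins=3):
    """Score all gene pairs and keep those above a (hypothetical) threshold.

    data: array of shape (n_genes, n_timepoints).
    Returns the symmetric score matrix and a list of undirected edges (i, j), i < j.
    """
    n_genes = data.shape[0]
    labels = [discretize(row, n_bins) for row in data]
    mi = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            mi[i, j] = mi[j, i] = mutual_information(labels[i], labels[j], n_bins)
    edges = [(i, j) for i in range(n_genes)
             for j in range(i + 1, n_genes) if mi[i, j] > threshold]
    return mi, edges
```

Calling `relevance_network(data, threshold)` on an array with one row per gene and one column per time point returns the pairwise scores together with the retained edges. Because the scores are symmetric and pairwise, such a network is undirected and cannot by itself separate direct from indirect interactions, which is one reason a comparison with a model-based approach such as dynamic Bayesian networks is of interest.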

Part 2: Detection of recombination in DNA sequence alignments
Sporadic recombination is a process by which certain bacteria and viruses exchange DNA or RNA subsequences, leading to so-called mosaic strains. The discovery of a surprisingly high frequency of such mosaic strains in HIV suggests that recombination between HIV genomes can occur in vivo and generate new biologically active viruses. Phylogenetic analysis of various bacterial genera suggests that recombination is an important, and previously underestimated, source of genetic diversification, through which new strains with undesirable biological traits (such as multiple antibiotic resistance) can arise. In this second part of my talk, I will start with a brief recapitulation of molecular phylogenetics, taking a Bayesian network approach. I will then discuss how phylogenetic methods can be combined with MCMC to detect recombination in DNA sequence alignments. Finally, I will discuss an application of this scheme to a DNA sequence alignment of 10 strains of hepatitis B virus.


Last update: 12 November 2003.