Tutorial A:
Introduction to phylogenetics
The first part of the tutorial discusses the reconstruction of the
evolutionary history of a group of species, depicted in a so-called phylogenetic tree, from a DNA sequence alignment.
Besides being of fundamental importance in itself - aiming to
estimate, for instance, the ancestry of the human race or to
infer the whole tree of life - this methodology has recently
become of practical relevance in epidemiology
and forensic science. The tutorial will start with a
brief discussion of the shortcomings of the 'classical'
clustering methods, which will then be contrasted with the
newer probabilistic approach. Based on a concrete model of the
evolutionary process in terms of a homogeneous Markov chain, a
phylogenetic tree can be interpreted as a probabilistic
generative model that allows the calculation of the likelihood
of the observed DNA sequence alignment. The practical
computation draws on well-established algorithms for directed
acyclic graphs, which pass 'messages' from the external nodes
along the branches and inner nodes down to the root. This, in
principle, enables the optimization of both the parameters and
the model, that is, the branch lengths and the tree topology,
in a maximum likelihood sense. The tutorial also discusses the
statistical significance of the results and contrasts the two
predominant methods for estimating it: bootstrapping and the
Bayesian approach with Markov chain Monte Carlo.
The methods described will be used in the second part
of the tutorial.
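As a rough illustration of the likelihood computation described above, the following sketch passes messages from the leaves up to the root for each alignment column. It is a minimal example under stated assumptions, not the tutorial's own code: the Jukes-Cantor substitution model, the uniform root distribution, the nested-list tree encoding, and all function names are choices made here for brevity.

```python
import math

BASES = "ACGT"

def jc69_transition(t):
    """Transition matrix of the Jukes-Cantor model (an assumed, simplest-case
    homogeneous Markov chain) for a branch of length t, measured in expected
    substitutions per site."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def partial_likelihood(node, site):
    """Message sent upward by a node: for each base x, the probability of the
    leaves observed below the node given that the node is in state x.

    A leaf is a taxon name (str); an inner node is a list of
    (child, branch_length) pairs. This encoding is hypothetical."""
    if isinstance(node, str):
        observed = site[node]
        return [1.0 if base == observed else 0.0 for base in BASES]
    message = [1.0] * 4
    for child, branch_length in node:
        p = jc69_transition(branch_length)
        child_message = partial_likelihood(child, site)
        for x in range(4):
            message[x] *= sum(p[x][y] * child_message[y] for y in range(4))
    return message

def log_likelihood(tree, alignment):
    """Log-likelihood of an alignment (dict: taxon -> sequence), treating the
    columns as independent and using a uniform distribution at the root."""
    n_sites = len(next(iter(alignment.values())))
    total = 0.0
    for k in range(n_sites):
        site = {taxon: seq[k] for taxon, seq in alignment.items()}
        root_message = partial_likelihood(tree, site)
        total += math.log(sum(0.25 * m for m in root_message))
    return total

# Hypothetical four-taxon tree and toy alignment.
tree = [([("human", 0.1), ("chimp", 0.1)], 0.05),
        ([("mouse", 0.3), ("rat", 0.3)], 0.05)]
alignment = {"human": "ACGT", "chimp": "ACGT", "mouse": "ACGA", "rat": "ATGA"}
print(log_likelihood(tree, alignment))
```

Maximum likelihood estimation would then search over branch lengths and topologies for the tree that maximizes this value; that search is not shown here.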
Tutorial B:
Detecting recombination in DNA sequence alignments
The recent emergence of multi-resistant pathogens
has led to an increased interest in recombination
as an important, and previously underestimated, source of genetic
diversification in bacteria and viruses. In the second part of
the tutorial, I will describe a statistical method for detecting recombination in multiple DNA sequence alignments.
This approach is based on the combination of two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph
(hidden Markov model) representing interactions between different
sites in the DNA sequence alignment.
I will compare three different parameter estimation techniques,
and will discuss the results obtained on various
synthetic and real-world DNA sequence alignments.
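To make the combination of the two graphs more concrete, the sketch below runs the standard forward algorithm of a hidden Markov model along the alignment. It assumes, for illustration only, that the hidden state at each site is one of a few candidate tree topologies and that the per-site likelihood of each topology has already been computed from the taxon graph. The function name, the uniform initial distribution, and the single `stay_prob` parameter are hypothetical simplifications, not the method presented in the tutorial.

```python
import math

def forward_log_likelihood(emissions, stay_prob):
    """Scaled forward algorithm for a hidden Markov model along the alignment.

    emissions[t][k] is the likelihood of alignment column t under candidate
    topology k (assumed to be precomputed from the phylogenetic tree).
    stay_prob is the probability of keeping the same topology from one site
    to the next; the remaining probability is split evenly over the other
    topologies. Requires at least two candidate topologies."""
    n_states = len(emissions[0])
    switch_prob = (1.0 - stay_prob) / (n_states - 1)
    # Uniform initial distribution over topologies (an assumption).
    alpha = [e / n_states for e in emissions[0]]
    log_lik = 0.0
    for t in range(len(emissions)):
        if t > 0:
            total = sum(alpha)
            alpha = [
                emissions[t][k]
                * (stay_prob * alpha[k] + switch_prob * (total - alpha[k]))
                for k in range(n_states)
            ]
        # Rescale at every site to avoid numerical underflow.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Made-up per-site likelihoods for two candidate topologies.
emissions = [[0.2, 0.4], [0.1, 0.3], [0.3, 0.05]]
print(forward_log_likelihood(emissions, stay_prob=0.9))
```

A change of the most probable topology along the alignment would then indicate a candidate recombination breakpoint; decoding the state sequence and estimating the parameters are separate steps not shown here.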
Lecture:
Interpreting microarray data and modelling genetic
regulatory interactions with Bayesian networks
Molecular pathways consisting of interacting proteins underlie the major functions of living cells. A central goal of molecular biology is therefore to understand the regulatory mechanism that governs protein synthesis and activity.
While traditional methods in molecular biology could only report the expression
levels of single genes, microarrays measure the abundance of thousands of mRNA targets simultaneously. This provides rich new data for understanding gene expression and regulation.
In my talk I will start with a concise yet self-contained introduction to
probabilistic modelling with Bayesian networks. I will then show how
these models can be applied to the analysis of microarray experiments
to infer gene regulatory interactions: groups of genes that
correlation analysis alone would merely cluster together can be organized into clear
functional subnetworks. These subnetworks provide a much richer context for
regulatory and functional analysis and assist us in understanding the roles of genes and in assigning them putative novel functions.
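As a toy illustration of what a Bayesian network adds over correlation analysis, the sketch below encodes a hypothetical three-gene network in which gene A regulates genes B and C. The structure, the binary on/off expression states, and all probabilities are invented for this example. The joint distribution factorizes into one conditional probability per gene, and a posterior is obtained by enumeration.

```python
from itertools import product

# Hypothetical network: A is a regulator of B and C (B <- A -> C).
PARENTS = {"A": (), "B": ("A",), "C": ("A",)}

# P(gene is "on" | parent states); all numbers are made up.
CPT = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.2, (1,): 0.7},
}

def joint(assignment):
    """Joint probability as the product of each gene's conditional
    probability given its parents (the Bayesian network factorization)."""
    prob = 1.0
    for gene, parents in PARENTS.items():
        p_on = CPT[gene][tuple(assignment[p] for p in parents)]
        prob *= p_on if assignment[gene] == 1 else 1.0 - p_on
    return prob

def posterior(query, evidence):
    """P(query gene is on | evidence), summing the joint over unobserved genes."""
    hidden = [g for g in PARENTS if g != query and g not in evidence]
    weights = [0.0, 0.0]
    for value in (0, 1):
        for combo in product((0, 1), repeat=len(hidden)):
            assignment = dict(evidence, **dict(zip(hidden, combo)))
            assignment[query] = value
            weights[value] += joint(assignment)
    return weights[1] / (weights[0] + weights[1])

# B and C are correlated only through their common regulator A.
print(posterior("A", {"B": 1, "C": 1}))
print(posterior("B", {"C": 1}))
print(posterior("B", {"C": 1, "A": 1}))
```

In this toy network, observing C changes the belief about B only as long as A is unobserved. Once A is known, C carries no further information about B, which is the kind of structure a clustering of correlated genes cannot express.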
Last update: 24 July 2002.