How do the results obtained with BARCE depend on the prior and on the update scheme for the nucleotide substitution parameters?

Dirk Husmeier and Grainne McGuire
Biomathematics and Statistics Scotland (BioSS)
SCRI, Dundee DD2 5DA, United Kingdom
March 2002

Dependence of the results on the prior
Dependence of the results on the parameter update scheme

Dependence of the results on the prior

To test how sensitive the performance is to changes in the prior distribution of the recombination parameter lambda, we repeated the MCMC simulation on one of the data sets (mosaic structure A, tree height 0.2) with two further prior distributions, in which the shape parameter of the beta distribution was set to 0.7 and 0.5 rather than 0.9. This is done by changing the following line in the model settings menu from the first setting shown below to the second or the third:

Difficulty of changing trees D 0.9

Difficulty of changing trees D 0.7

Difficulty of changing trees D 0.5

Otherwise, the settings were left unchanged.

Results

Prior mean of lambda   Sensitivity (%)   Specificity (%)   Relative entropy   Average log likelihood plus log prior
0.5                    94.5              99.8              0.081              -3489
0.7                    94.5              99.8              0.076              -3488
0.9                    94.5              99.8              0.074              -3487

Changing the prior has a small impact on the predicted probability distribution, as indicated by the small changes in the relative entropy (column 4) and in the average (unnormalized) log posterior (column 5). However, this does not affect the classification performance: sensitivity and specificity are identical for all three priors. Differences are also hardly noticeable in a visual inspection of the site-dependent probabilities of the topologies.
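The exact definitions of the three performance measures are not restated on this page. The following Python sketch shows one common way such measures can be computed from per-site output. It assumes that a site counts as recombinant when its true topology differs from a baseline topology, and that relative entropy is the mean Kullback-Leibler divergence between the true (one-hot) topology indicator and the predicted posterior. The function names, the baseline convention and the toy data are all hypothetical and are not taken from BARCE.

```python
import math

def sensitivity_specificity(true_labels, predicted_labels, baseline):
    """Return (sensitivity, specificity) in percent.

    Assumed definitions (not taken from BARCE): a site is 'recombinant'
    when its true topology differs from `baseline`. Sensitivity is the share
    of recombinant sites assigned their true topology; specificity is the
    share of non-recombinant sites assigned the baseline topology.
    """
    tp = fn = tn = fp = 0
    for true, pred in zip(true_labels, predicted_labels):
        if true != baseline:
            if pred == true:
                tp += 1
            else:
                fn += 1
        else:
            if pred == baseline:
                tn += 1
            else:
                fp += 1
    sens = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    spec = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

def relative_entropy(true_labels, posteriors, eps=1e-12):
    """Mean KL divergence between the one-hot truth and the posterior.

    For a one-hot target this reduces to -log p(true topology) per site.
    """
    total = 0.0
    for true, post in zip(true_labels, posteriors):
        total += -math.log(max(post[true], eps))
    return total / len(true_labels)

# Toy data, made up for illustration: 4 sites, 2 topologies.
true_topology = [0, 0, 1, 1]
posterior = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
# Classify each site by its most probable topology.
predicted = [max(range(len(p)), key=p.__getitem__) for p in posterior]

print(sensitivity_specificity(true_topology, predicted, baseline=0))
print(relative_entropy(true_topology, posterior))
```

Under these assumed definitions, sensitivity and specificity depend only on which topology is most probable at each site, whereas the relative entropy reacts to any shift in the posterior probabilities. This is consistent with the pattern in the table, where the classification scores stay constant while the relative entropy changes slightly.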


Dependence of the results on the parameter update scheme

How sensitive is the performance to changes in the update scheme for the evolutionary parameters? To test this, we repeated the simulations with the F84 model, using the same sequence alignment as before (mosaic structure A, tree height 0.1). However, rather than sampling the nucleotide substitution parameters from the posterior distribution with MCMC, we kept some or all of them constant (at the initially estimated values). Three simulations were carried out:
  1. All parameters are updated with MCMC.
  2. Only the transition-transversion ratio is updated with MCMC, while the nucleotide frequencies are kept fixed.
  3. Both the transition-transversion ratio and the nucleotide frequencies are kept fixed.
This is effected with the following settings in the run settings menu:

1) Nucleotide frequencies and transition-transversion ratio adapted:

Update stationary frequencies in MCMC algorithm U YES
Update transition-transversion ratio in MCMC algorithm A YES

2) Nucleotide frequencies fixed, transition-transversion ratio adapted:

Update stationary frequencies in MCMC algorithm U NO
Update transition-transversion ratio in MCMC algorithm A YES

3) Nucleotide frequencies and transition-transversion ratio fixed:

Update stationary frequencies in MCMC algorithm U NO
Update transition-transversion ratio in MCMC algorithm A NO

Otherwise, the settings were left unchanged. Note that the initial parameters were always estimated from the data (see model submenu):

Estimate initial character frequencies from data E YES
Estimate transition/transversion ratio from data R YES

Results

Nucleotide frequencies   Transition-transversion ratio   Sensitivity (%)   Specificity (%)   Relative entropy   Average log likelihood plus log prior
adapted                  adapted                         94.5              99.8              0.074              -3487
fixed                    adapted                         93.8              99.8              0.079              -3489
fixed                    fixed                           93.8              99.8              0.082              -3492

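Not updating the parameters of the nucleotide substitution model leads to a slight deterioration in the results: the sensitivity drops from 94.5 to 93.8, the relative entropy increases from 0.074 to 0.082, and the average log likelihood plus log prior decreases from -3487 to -3492, while the specificity stays at 99.8.

However, these differences are only marginal, and are hardly noticeable from a visual inspection of the site-dependent probability for the topologies.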
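To make the size of these differences explicit, the figures from the results table can be compared against the run in which all parameters are updated. The following is a minimal Python sketch; the dictionary layout and the function name are illustrative only and are not part of BARCE.

```python
# Figures copied from the results table:
# (nucleotide frequencies, transition-transversion ratio) -> measures.
results = {
    ("adapted", "adapted"): {
        "sensitivity": 94.5, "specificity": 99.8,
        "relative_entropy": 0.074, "log_posterior": -3487,
    },
    ("fixed", "adapted"): {
        "sensitivity": 93.8, "specificity": 99.8,
        "relative_entropy": 0.079, "log_posterior": -3489,
    },
    ("fixed", "fixed"): {
        "sensitivity": 93.8, "specificity": 99.8,
        "relative_entropy": 0.082, "log_posterior": -3492,
    },
}

def differences_from_baseline(results, baseline=("adapted", "adapted")):
    """Return each scheme's measures minus those of the baseline scheme."""
    base = results[baseline]
    return {
        scheme: {name: round(value - base[name], 3)
                 for name, value in measures.items()}
        for scheme, measures in results.items()
        if scheme != baseline
    }

for scheme, delta in differences_from_baseline(results).items():
    print(scheme, delta)
```

Rounding to three decimals only removes floating-point noise from the subtraction; it does not change the tabulated precision.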
