function [substitutionMatrix,rateMatrix]=... Barge_NucSubstMatrix(mu,ttratio,PeqNuc,useExpectedTransversions) % Computes the nucleotide substitution matrix for HKY85 % with branch lengths integrated out. % Dirk Husmeier, October 2005 % % Modified to use expected number of mutations, % Various corrections in the comments % Wolfgang Lehrach, May 06 % % The HKY model of nucleotide substitution % has the following form: % exp(t*A)= sum_{i=1}^4 exp(t*lambda_i) u_i*v_i' % where t is physical time, lambda_i are the eigenvalues % of the rate matrix A, and u_i, v_i are given in % HKY85: J. Mol. Evol. 22: 160-174. % % When useExpectedTransversions is enabled: % Define the branch length as beta*t, where % beta is the transversion rate. The transition-transversion % ratio is ttratio= alpha/beta. This gives for % the w-dependent mutation probability matrix: % MutMatrix= sum_{i=1}^4 exp(-w*lambda~_i) u_i*v_i' % where lambda~_i = -lambda_i/beta. % %I have modified equation 5a in HKY85 accordingly; % equations 5b-c remain unchanged. % According to Suchard et al. (2003), JASA 98, 427-437, % I have placed an exponential prior on the branch lengths: % P(w)= 1/mu exp(-w/mu) % The branch lengths can then be integrated out. % Since int exp(-w*lambda~_i) P(w) dw= 1/(mu*lambda~i + 1), % this gives: % MutMatrix= sum_{i=1}^4 1/(mu*lambda~i + 1) u_i*v_i' % Note % The vectors u_i and v_i are taken from equations 5b-c in HKY. % The eigenvalues lambda_i are taken from equation 5a in HKY. % These eigenvalues are divided by beta to obtain % lambda_i --> lambda~_i. % The reason for this operation is that the prior % is defined over the branch lengths w=beta*t rather % than physical time t. % % When useExpectedTransversions is not enabled, % for the definition of the rescaling to get expected number of % mutations, see report. % % INPUT % - mu % Hyperparameter of the prior on the branch lengths % (scale parameter) % - ttratio % Transition-transversion ratio % - PeqNuc % Equilibrium probabilities of the nucleotides: % A: PeqNuc(1), C: PeqNuc(2), G: PeqNuc(3), T: 1-sum(PeqNuc) % - useExpectedTransversions % If 1, use the old characterisation of the branch length % % OUTPUT % - substitutionMatrix % Nucleotide substitution matrix, with branch lengths % integrated out. Note that this matrix depends on the % scale hyperparameter mu. % substitutionMatrix(i,j) is the probability of a % mutation from nucleotide i into nucleotide j: i->j. % This follows the notation in HKY85. % The matrix is the transpose of the matrix in my book. % % INVOCATION % substitutionMatrix=Barge_NucSubstMatrix(mu,ttratio,pA,pC,pG) if(~isscalar(mu)) mu error('mu non-scalar'); end pA= PeqNuc(1); pC= PeqNuc(2); pG= PeqNuc(3); pT= 1-pA-pC-pG; if pT<=0 error('Wrong equilibrium frequencies for nucleotides') end pY= pT+pC; pR= pA+pG; % HKY85, equation (5a); modified, as discussed above % (dividing the original expressions by 4*beta). % Also, I have changed the sign lambda(1)= 0; lambda(2)= 1; lambda(3)= pY + pR*ttratio; lambda(4)= pY*ttratio + pR; if(~useExpectedTransversions) Z = 1/(2*ttratio*(pC*pT+pA*pG)+2*(pA+pG)*(pC+pT)); lambda = lambda * Z; end % HKY85, equation (5b) v(:,1)= [pT; pC; pA; pG]; v(:,2)= [pR*pT; pR*pC; -pY*pA; -pY*pG]; v(:,3)= [0; 0; 1; -1]; v(:,4)= [1; -1; 0; 0]; % HKY85, equation (5c) u(:,1)= [1; 1; 1; 1]; u(:,2)= [1/pY; 1/pY; -1/pR; -1/pR]; u(:,3)= [0; 0; pG/pR; -pA/pR]; u(:,4)= [pC/pY; -pT/pY; 0; 0]; % In HKY, the order of the nucleotides is T,C,A,G. % I transform this to the alphabetic order A,C,G,T. x= v; v(1,:)= x(3,:); v(3,:)= x(4,:); v(4,:)= x(1,:); clear x; x= u; u(1,:)= x(3,:); u(3,:)= x(4,:); u(4,:)= x(1,:); % Computing the nucleotide substitution matrix substitutionMatrix= zeros(4); for i=1:4 factor= 1/(mu*lambda(i)+1); newTerm= factor*u(:,i)*v(:,i)'; substitutionMatrix= substitutionMatrix+newTerm; end % Ensure rounding errors don't cause nonsensical values. substitutionMatrix(find(substitutionMatrix < 0)) = 0; substitutionMatrix(find(substitutionMatrix > 1)) = 1; if(~all(all(substitutionMatrix >= 0 & substitutionMatrix <= 1))) mu ttratio PeqNuc substitutionMatrix error('Calculated substitution matrix is not a valid probability matrix'); end if(nargout > 1) rateMatrix= zeros(4); for i=1:4 rateMatrix= rateMatrix-lambda(i)*u(:,i)*v(:,i)'; end end