A latent Gaussian model for compositional data with zeros (with publisher's corrigendum)

Abstract
Compositional data record the relative proportions of different components within a mixture, and arise frequently in many fields. Standard statistical techniques for the analysis of such data assume the absence of proportions which are genuinely zero. However, real data can contain a substantial number of zero values. We present a latent Gaussian model for the analysis of compositional data containing zero values, based on the assumption that the data arise from a (deterministic) Euclidean projection of a multivariate Gaussian random variable onto the unit simplex. We propose an iterative algorithm to simulate values from this model, and apply the model to data on the proportions of fat, protein and carbohydrate in different groups of food products. Finally, because evaluation of the likelihood involves the calculation of difficult integrals if the number of components is more than three, we present a Gibbs sampling algorithm that can be used to draw inferences about the parameters of the model when the number of components is arbitrarily large.
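
To illustrate how a projected Gaussian can produce exact zeros, the following is a minimal sketch in Python. It is not the iterative algorithm proposed in the paper: it substitutes the well-known sort-based method for Euclidean projection onto the unit simplex. The function names `project_to_simplex` and `simulate_compositions`, and the mean and covariance used in any call, are hypothetical choices made for illustration only.

```python
import numpy as np


def project_to_simplex(y):
    """Euclidean projection of y onto {x : x >= 0, sum(x) = 1}.

    Uses the standard sort-based method (an assumption of this sketch,
    not the paper's iterative algorithm).
    """
    y = np.asarray(y, dtype=float)
    u = np.sort(y)[::-1]                 # components in decreasing order
    css = np.cumsum(u) - 1.0             # cumulative sums minus the target total
    k = np.arange(1, y.size + 1)
    # Largest index whose component stays positive after shifting
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)           # common shift applied to all components
    return np.maximum(y - tau, 0.0)      # components below the shift become exactly zero


def simulate_compositions(mean, cov, n, seed=0):
    """Draw n latent Gaussian vectors and project each onto the unit simplex."""
    rng = np.random.default_rng(seed)
    latent = rng.multivariate_normal(mean, cov, size=n)
    return np.array([project_to_simplex(row) for row in latent])
```

Under these assumptions, a call such as `simulate_compositions([0.5, 0.4, 0.1], 0.05 * np.eye(3), 1000)` returns rows that are non-negative and sum to one, and some rows may contain components that are exactly zero because the latent vector fell outside the simplex.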
Year
2008
Category
Refereed journal
Output Tags
SG 2006-2011 P4 Human Health - Miscellaneous