Weighting of parts in compositional data analysis: advances and applications

In practice, it is often sensible to give different weights to the variables involved in a multivariate data analysis, and the same holds for compositional data as multivariate observations carrying relative information. Weights can be applied to better accommodate differences in the quality of the measurements, the occurrence of zeros and missing values, or more generally to highlight specific features of compositional parts. The characterisation of compositional data as elements of a Bayes space, which is a natural generalisation of the ordinary Aitchison geometry, enables the definition of a formal framework to implement weighting schemes for the parts of a composition. This is formally achieved by replacing the common uniform reference measure in the Bayes space with an alternative one via the well-known chain rule. Weighted centred logratio (clr) coefficients and isometric logratio (ilr) coordinates then allow compositions to be expressed in real space equipped with (unweighted) Euclidean geometry. The resulting clr coefficients and ilr coordinates are invariant to the scale of the original compositions, but the actual scale of the weights matters. In this work, these formal developments are presented and used to introduce a general approach for weighting parts in compositional data analysis. The practical use is demonstrated on simulated and real-world data sets in the context of the earth sciences.
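The two scale properties mentioned above can be illustrated numerically. The sketch below is not taken from the paper. It assumes one common form of clr coefficients under a weighting of the parts: the log of the density of the composition with respect to the weights, centred by its weighted mean, then multiplied by the square root of each weight so that the ordinary Euclidean geometry applies. The function name `weighted_clr` and the example vectors `x` and `p` are hypothetical, and the exact definition used by the authors may differ.

```python
import numpy as np


def weighted_clr(x, p):
    """Illustrative clr coefficients of composition x under positive weights p.

    Assumed definition (not confirmed by the abstract):
        z_i = sqrt(p_i) * ( ln(x_i / p_i) - sum_j p_j ln(x_j / p_j) / sum_j p_j )
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(p, dtype=float)
    if x.shape != p.shape or np.any(x <= 0) or np.any(p <= 0):
        raise ValueError("x and p must be strictly positive and of equal length")
    log_density = np.log(x / p)                    # log density of x w.r.t. the weights
    centre = np.sum(p * log_density) / np.sum(p)   # weighted mean of the log density
    return np.sqrt(p) * (log_density - centre)


# Hypothetical three-part composition and weights, for illustration only.
x = np.array([0.2, 0.3, 0.5])
p = np.array([1.0, 2.0, 0.5])
z = weighted_clr(x, p)

# Rescaling the composition leaves the coefficients unchanged.
same_after_rescaling_x = np.allclose(weighted_clr(10 * x, p), z)

# Rescaling the weights by 4 rescales the coefficients by sqrt(4) = 2.
doubled_after_rescaling_p = np.allclose(weighted_clr(x, 4 * p), 2 * z)
```

Under this assumed definition, unit weights recover the ordinary clr coefficients, which is a quick sanity check on any chosen weighting.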
Refereed journal