Robust regression with compositional covariates including cellwise outliers

Publication Name
Book of Abstracts of the 8th International Workshop on Compositional Data Analysis (CoDaWork2019)
Publisher
Universitat Politecnica de Catalunya-BarcelonaTECH
ISBN
978-84-947240-1-5
Abstract
Multivariate data are commonly arranged as a rectangular matrix with observations or cases in the rows and variables in the columns. Ordinary robust estimators are designed to deal with rowwise outliers, assuming that most observations are free of contamination. However, this approach may lead to a significant loss of information when outliers, not necessarily many, occur at the individual cell level but affect a large fraction of observations (Alqallaf et al., 2009). Additional problems arise when compositional data are involved. In this case, the relevant information for statistical analysis is contained in the ratios between parts of the composition (Pawlowsky-Glahn et al., 2015; Filzmoser et al., 2018). Cellwise contamination of such data therefore easily propagates through the ratios and distorts the results. In this contribution, a robust regression estimation method is proposed for compositional and real-valued explanatory variables from data matrices that include outliers in their cells (cellwise), and possibly also entire outlying observations (rowwise). Cellwise outliers are first filtered and then imputed by robust estimates. Afterwards, robust compositional regression is performed to obtain the model parameters. Imputation uncertainty is reflected in the regression coefficient estimates via a multiple imputation (MI) scheme (Rubin and Schenker, 1986). Simulations show that the proposed procedure generally outperforms traditional rowwise-only robust regression estimators such as the MM-estimator (Maronna et al., 2006), as well as other recently proposed cellwise robust regression methods. An application to bio-environmental data shows that the new proposal helps to draw conclusions that are more consistent with scientific knowledge than those obtained from the common robust MM regression.
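The propagation of cellwise contamination through log-ratios can be illustrated with a minimal sketch. It assumes the centred log-ratio (clr) representation, which is one common choice; the abstract does not state which log-ratio coordinates the method uses. The function name `clr` and the four-part composition are hypothetical and serve only as an illustration.

```python
import math


def clr(parts):
    """Centred log-ratio coordinates of a strictly positive composition.

    Assumption: clr is used here for illustration only; the abstract does
    not name a specific log-ratio representation.
    """
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical four-part composition (illustrative values, not real data).
clean = [0.40, 0.30, 0.20, 0.10]

# Contaminate a single cell: inflate the first part by a factor of 10.
# The result is not re-closed to sum to 1; clr is scale invariant, so
# closure would not change the coordinates.
contaminated = list(clean)
contaminated[0] *= 10.0

# Difference between contaminated and clean clr coordinates.
shift = [c - k for c, k in zip(clr(contaminated), clr(clean))]

# Count how many coordinates moved because of the single outlying cell.
n_changed = sum(abs(s) > 1e-12 for s in shift)

print("clr shift per part:", [round(s, 4) for s in shift])
print("coordinates affected:", n_changed, "of", len(clean))
```

A single outlying cell shifts every clr coordinate, because each coordinate is measured relative to the geometric mean of all parts. This is one way to see why cellwise outliers in a composition are harder to isolate than in ordinary real-valued data.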
Year
2019
Category
Book Chapter