Document details for 'Robust regression with compositional covariates including cellwise outliers'

Authors Štefelová, N., Alfons, A., Palarea Albaladejo, J., Filzmoser, P. and Hron, K.
Publication details In "Book of Abstracts of the 8th International Workshop on Compositional Data Analysis (CoDaWork2019)", 79. Eds. M.I. Ortego. Universitat Politecnica de Catalunya-BarcelonaTECH, Barcelona, Spain.
Publisher details Universitat Politecnica de Catalunya-BarcelonaTECH, Barcelona, Spain
Abstract Multivariate data are commonly arranged as a rectangular matrix with observations or cases in the rows and variables in the columns. Ordinary robust estimators are designed to deal with rowwise outliers, assuming that most observations are free of contamination. However, this approach may lead to a significant loss of information in situations where outliers, not necessarily many, occur at the individual cell level but affect a large fraction of observations (Alqallaf et al., 2009). Moreover, additional problems arise when data of compositional nature are involved. In this case, the relevant information for statistical analysis is contained in the ratios between parts of the composition (Pawlowsky-Glahn et al., 2015; Filzmoser et al., 2018). Cellwise contamination then easily propagates through these ratios and distorts the results. In this contribution, a robust regression estimation method is proposed for compositional and real-valued explanatory variables from data matrices including outliers in their cells (cellwise), and possibly also entire outlying observations (rowwise). Cellwise outliers are first filtered and then imputed by robust estimates. Afterwards, robust compositional regression is performed to obtain the model parameters. Imputation uncertainty is reflected in the regression coefficient estimates via a multiple imputation (MI) scheme (Rubin and Schenker, 1986). Simulations show that the proposed procedure generally outperforms traditional rowwise-only robust regression estimators like the MM-estimator (Maronna et al., 2006), as well as other recently proposed cellwise robust regression methods. An application to bio-environmental data reveals that the new proposal helps to draw conclusions that are more consistent with scientific knowledge than those from the common robust MM regression.
ISBN 978-84-947240-1-5
Last updated 2019-08-06
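
Illustrative note on the multiple imputation step mentioned in the abstract: the sketch below shows how coefficient estimates from several imputed data sets could be combined so that imputation uncertainty enters the final variance. It assumes the standard pooling rules usually attributed to Rubin; the abstract does not state which pooling formula the authors use, so this is not a reproduction of their procedure. The function name `pool_rubin` and all numeric values are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one regression coefficient across m imputed data sets.

    estimates: the m point estimates of the coefficient
    variances: the m squared standard errors of those estimates
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b      # total variance, inflated by imputation uncertainty
    return q_bar, t


# Hypothetical values for a single coefficient from m = 3 imputed data sets.
estimates = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]
pooled, total_var = pool_rubin(estimates, variances)
```

The between-imputation term grows when the imputed data sets disagree, which is how uncertainty about the imputed cells would widen the standard error of a coefficient under this assumed scheme.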

Unless explicitly stated otherwise, all material is copyright © Biomathematics and Statistics Scotland.