Document details for 'Weighted pivot coordinates for PLS-based marker discovery in high-throughput compositional data'

Authors Štefelová, N., Palarea Albaladejo, J. and Hron, K.
Publication details Statistical Analysis and Data Mining: The ASA Data Science Journal 14, 315-330. Wiley.
Publisher details Wiley
Keywords Compositional data, high-throughput data, log-ratio analysis, marker discovery, PLS regression
Abstract High-throughput data representing large mixtures of chemical or biological signals are ordinarily produced in the molecular sciences. Given a number of samples, partial least squares (PLS) regression is a well-established statistical method to investigate associations between them and any continuous response variables of interest. However, technical artifacts generally make the raw signals not directly comparable between samples. Thus, data normalization is required before any meaningful scientific information can be drawn. This often allows the processed signals to be characterized as compositional data where the relevant information is contained in the pairwise log-ratios between the components of the mixture. The (log-ratio) pivot coordinate approach facilitates the aggregation into single variables of the pairwise log-ratios of a component to all the remaining components. This simplifies interpretability and the investigation of their relative importance but, particularly in a high-dimensional context, the aggregated log-ratios can easily mix up information from different underlying processes. In this context, we propose a weighting strategy for the construction of pivot coordinates for PLS regression which draws on the correlation between response variable and pairwise log-ratios. Using real and simulated data sets, we demonstrate that this proposal enhances the discovery of biological markers in high-throughput compositional data.
Last updated 2021-07-05

Unless explicitly stated otherwise, all material is copyright © Biomathematics and Statistics Scotland.

Biomathematics and Statistics Scotland (BioSS) is formally part of The James Hutton Institute (JHI), a registered Scottish charity No. SC041796 and a company limited by guarantee No. SC374831. Registered Office: JHI, Invergowrie, Dundee, DD2 5DA, Scotland