Weighting the domain of probability densities in functional data analysis

Publisher

Springer

Abstract

In functional data analysis, some regions of the domain of the functions can be of more interest than others owing to the quality of measurement, the relative scale of the domain, or some external reason (e.g., the interest of stakeholders). Weighting the domain is of particular interest for probability density functions (PDFs), as derived from distributional data, which often aggregate measurements of different quality or are affected by scale effects. A weighting scheme can be embedded into the underlying sample space of PDFs when they are considered as continuous compositions, applying the theory of Bayes spaces. The origin of a Bayes space is determined by a given reference measure, which can easily be changed through the well‐known chain rule. This work provides a formal framework for defining weights through a reference measure and uses it to develop a weighting scheme on the bounded domain of distributional data. The impact on statistical analysis is illustrated through an application to functional principal component analysis of income distribution data. Moreover, a novel centered log‐ratio transformation is proposed to map a weighted Bayes space into an unweighted L2 space, enabling the use of most tools developed in functional data analysis (e.g., clustering, regression analysis) while accounting for the weighting scheme. The potential of our proposal is shown in a real case study using Italian income data.

Year

2020