Document details for 'Dealing with distances and transformations for fuzzy C-means clustering of compositional data'

Authors Palarea Albaladejo, J., Martin-Fernandez, J.A. and Soto, J.A.
Publication details Journal of Classification 29(2), 144-169. Springer Verlag.
Publisher details Springer Verlag
Keywords Fuzzy clustering; FCM; compositional data; closed data; simplex space; Aitchison distance
Abstract Clustering techniques are based upon a dissimilarity or distance measure between objects and clusters. This paper focuses on the simplex space, whose elements (compositions) are subject to non-negativity and constant-sum constraints. Any data analysis involving compositions should fulfill two main principles: scale invariance and subcompositional coherence. Among fuzzy clustering methods, the FCM algorithm is broadly applied in a variety of fields, but it is not well-behaved when dealing with compositions. Here, the adequacy of different dissimilarities in the simplex, together with the behavior of the common log-ratio transformations, is discussed on the basis of compositional principles. As a result, a well-founded strategy for FCM clustering of compositions is suggested. Theoretical findings are accompanied by numerical evidence, and a detailed account of our proposal is provided. Finally, a case study is illustrated using a nutritional data set known in the clustering literature.
Last updated 2015-09-23
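
The scale-invariance principle mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names and the example compositions are assumptions chosen for illustration. It uses the standard centered log-ratio (clr) transformation and computes the Aitchison distance as the Euclidean distance between clr-transformed compositions, then compares it with the plain Euclidean distance when one composition is rescaled.

```python
import math


def clr(x):
    """Centered log-ratio transform of a composition with strictly positive parts."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))


def euclidean_distance(x, y):
    """Plain Euclidean distance on the raw parts, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical three-part compositions (illustrative values only).
x = [0.2, 0.3, 0.5]
y = [0.1, 0.6, 0.3]

# The same composition as x expressed in different units (parts sum to 10).
x_scaled = [10 * v for v in x]

d_aitchison = aitchison_distance(x, y)
d_aitchison_scaled = aitchison_distance(x_scaled, y)
d_euclidean = euclidean_distance(x, y)
d_euclidean_scaled = euclidean_distance(x_scaled, y)

# Rescaling leaves the Aitchison distance unchanged but alters the Euclidean one.
print(math.isclose(d_aitchison, d_aitchison_scaled))
print(math.isclose(d_euclidean, d_euclidean_scaled))
```

Under these assumptions, the first comparison holds and the second does not, which is one way to see why a distance applied to raw compositions can behave poorly in a clustering algorithm such as FCM.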
