The relative performance of AIC, AICc and BIC in the presence of unobserved heterogeneity

Abstract
Model selection is difficult. Even in the apparently straightforward case of choosing between standard linear regression models, there does not yet appear to be consensus in the statistical ecology literature as to the right approach. We review recent work on model selection in ecology and then focus on one aspect in particular: the use of the Akaike Information Criterion (AIC) or its small-sample correction, AICc. We create a novel framework for simulation studies and use it to study model selection on simulated data sets with a range of properties, which differ in their degree of unobserved heterogeneity. We use the results of the simulation study to suggest an approach to model selection based on ideas from information criteria but requiring simulation. We find that the relative predictive performance of model selection by different information criteria depends heavily on the degree of unobserved heterogeneity between data sets. When heterogeneity is small, AIC or AICc is likely to perform well; when heterogeneity is large, the Bayesian Information Criterion (BIC) will often perform better, owing to its stronger penalty on model complexity. We conclude that the choice of information criterion (or, more broadly, the strength of the likelihood penalty) should ideally be based on hypothesized properties, or properties estimated from previous data, of the population of data sets from which a given data set could have arisen. Relying on a single form of information criterion is unlikely to be universally successful.
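The abstract's contrast between the criteria rests on their standard penalty terms: AIC penalises each parameter by 2, while BIC penalises by ln(n), which is larger for any n above about 7. The sketch below (not taken from the paper; the function name and example values are illustrative) computes the three criteria from a model's maximised log-likelihood, parameter count k, and sample size n, using the textbook definitions.

```python
import math

def information_criteria(log_lik: float, k: int, n: int) -> dict:
    """Standard definitions of AIC, AICc, and BIC for a fitted model.

    log_lik -- maximised log-likelihood of the model
    k       -- number of estimated parameters
    n       -- sample size (AICc requires n > k + 1)
    """
    aic = 2 * k - 2 * log_lik
    # Small-sample correction; converges to AIC as n grows.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    # BIC's per-parameter penalty ln(n) exceeds AIC's 2 once n > e^2 (~7.4).
    bic = k * math.log(n) - 2 * log_lik
    return {"AIC": aic, "AICc": aicc, "BIC": bic}

# Two hypothetical candidate models fitted to the same n = 50 observations:
# the richer model (k = 6) fits slightly better but pays a larger penalty,
# especially under BIC.
print(information_criteria(log_lik=-120.0, k=3, n=50))
print(information_criteria(log_lik=-118.5, k=6, n=50))
```

In each case the model with the lower criterion value is preferred; the paper's point is that which penalty strength yields the best predictive performance depends on the unobserved heterogeneity among data sets.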
Year
2016
Category
Refereed journal
Output Tags
Species Distribution Modelling
Theme 3 - Land Use
WP3.4 - Resilience of Scotland's biodiversity to change