Finite-size effects in Bayesian model selection and generalization

Abstract
We show that, in supervised learning from a supplied data set, Bayesian model selection based on the evidence does not optimise generalisation performance, even for a learnable linear problem. This is demonstrated by examining the finite-size effects in hyperparameter assignment from the evidence procedure and the resultant generalisation performance. Our approach demonstrates the weakness of average-case and asymptotic analyses. Using simulations, we corroborate our analytic results and examine an alternative model selection criterion, namely cross-validation. This numerical study shows that the cross-validation hyperparameter estimates correlate more strongly with optimal performance than those of the evidence. However, we show that, for a sufficiently large input dimension, the evidence procedure could provide a reliable alternative to the more computationally expensive cross-validation.
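To make the comparison concrete, the following is a minimal sketch, not taken from the paper, of the two hyperparameter-selection procedures the abstract contrasts on a learnable linear (teacher-student) problem: maximising the evidence (marginal likelihood) over the prior precision versus minimising leave-one-out cross-validation error. The noise precision `beta` is assumed known here, and all function names (`log_evidence`, `loo_error`, `gen_error`) and parameter values are illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

d, N, sigma = 10, 20, 0.5           # input dim, training set size, noise std (assumed values)
beta = 1.0 / sigma**2               # noise precision, assumed known

# Learnable linear teacher and a finite training set drawn from it.
w_teacher = rng.standard_normal(d)
X = rng.standard_normal((N, d))
y = X @ w_teacher + sigma * rng.standard_normal(N)

def posterior_mean(alpha):
    """Posterior-mean weights for Gaussian prior precision alpha (ridge solution)."""
    A = alpha * np.eye(d) + beta * X.T @ X
    return np.linalg.solve(A, beta * X.T @ y)

def log_evidence(alpha):
    """Log marginal likelihood of the training data as a function of alpha."""
    A = alpha * np.eye(d) + beta * X.T @ X
    m = np.linalg.solve(A, beta * X.T @ y)
    E = 0.5 * beta * np.sum((y - X @ m) ** 2) + 0.5 * alpha * m @ m
    _, logdetA = np.linalg.slogdet(A)
    return (0.5 * d * np.log(alpha) + 0.5 * N * np.log(beta)
            - E - 0.5 * logdetA - 0.5 * N * np.log(2 * np.pi))

def loo_error(alpha):
    """Leave-one-out squared prediction error for the same estimator."""
    err = 0.0
    for i in range(N):
        mask = np.arange(N) != i
        A = alpha * np.eye(d) + beta * X[mask].T @ X[mask]
        m = np.linalg.solve(A, beta * X[mask].T @ y[mask])
        err += (y[i] - X[i] @ m) ** 2
    return err / N

def gen_error(alpha):
    """True generalisation error: for unit-variance inputs this is ||w - w*||^2."""
    m = posterior_mean(alpha)
    return np.sum((m - w_teacher) ** 2)

alphas = np.logspace(-3, 2, 200)
alpha_ev = alphas[np.argmax([log_evidence(a) for a in alphas])]
alpha_cv = alphas[np.argmin([loo_error(a) for a in alphas])]

print(f"evidence alpha={alpha_ev:.3f}  gen error={gen_error(alpha_ev):.4f}")
print(f"LOO-CV   alpha={alpha_cv:.3f}  gen error={gen_error(alpha_cv):.4f}")
```

Averaging such runs over many random training sets, under these assumptions, is one way to probe the finite-size fluctuations the abstract describes: on any single data set the evidence-maximising alpha and the cross-validation alpha generally differ, and neither need coincide with the alpha that minimises the true generalisation error.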
Year
1996
Category
Refereed journal