A LASSO-penalized BIC for mixture model selection
The efficacy of family-based approaches to mixture model-based clustering and
classification depends on the selection of parsimonious models. Current wisdom
suggests the Bayesian information criterion (BIC) for mixture model selection.
However, the BIC has well-known limitations: it tends to overestimate the
number of components and, in higher dimensions, to underestimate it, often
drastically. While the former problem can sometimes be addressed by merging
components, the latter is impossible to mitigate in clustering and
classification applications.
In this paper, a LASSO-penalized BIC (LPBIC) is introduced to overcome this
problem. The approach is illustrated through applications of extensions of
mixtures of factor analyzers, where the LPBIC is used to select both the number
of components and the number of latent factors. The LPBIC is shown to match or
outperform the BIC in several situations.