Journal article

Serial and parallel implementations of model-based clustering via parsimonious Gaussian mixture models

Abstract

Model-based clustering using a family of Gaussian mixture models with a parsimonious, factor-analysis-like covariance structure is described, and an efficient algorithm for its implementation is presented. This algorithm uses the alternating expectation-conditional maximization (AECM) variant of the expectation-maximization (EM) algorithm. Two central issues around the implementation of this family of models, namely model selection and convergence criteria, are discussed. These issues also have implications for other model-based clustering techniques and for the implementation of techniques like the EM algorithm in general. The Bayesian information criterion (BIC) is used for model selection, and Aitken’s acceleration, which is shown to outperform the lack-of-progress criterion, is used to determine convergence. A brief introduction to parallel computing is then given, followed by a parallel implementation of the algorithm within the master–slave paradigm. A simulation study is then carried out to confirm the effectiveness of this parallelization. The resulting software is applied to two datasets to demonstrate its effectiveness when compared to existing software.
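The two convergence criteria named in the abstract can be contrasted with a minimal sketch. This is not the authors' software: the function names, the tolerance, and the exact form of the stopping rule (comparing the Aitken estimate of the limiting log-likelihood with the latest value) are assumptions chosen for illustration, and the log-likelihood sequence is synthetic.

```python
def lack_of_progress_converged(loglik, tol):
    """Stop when the latest increase in log-likelihood is below tol."""
    if len(loglik) < 2:
        return False
    return loglik[-1] - loglik[-2] < tol


def aitken_limit(loglik):
    """Aitken estimate of the limiting log-likelihood from the last three values.

    Returns None when the estimate is undefined (fewer than three values,
    no change between iterations, or an acceleration factor >= 1).
    """
    if len(loglik) < 3:
        return None
    l_prev, l_curr, l_next = loglik[-3], loglik[-2], loglik[-1]
    denom = l_curr - l_prev
    if denom == 0:
        return None
    a = (l_next - l_curr) / denom  # Aitken acceleration factor
    if a >= 1:
        return None
    return l_curr + (l_next - l_curr) / (1 - a)


def aitken_converged(loglik, tol):
    """Assumed stopping rule: estimated limit is within tol of the latest value."""
    l_inf = aitken_limit(loglik)
    if l_inf is None:
        return False
    return abs(l_inf - loglik[-1]) < tol


# Synthetic, slowly converging sequence with true limit -100 (illustration only).
seq = [-100 - 10 * 0.99 ** k for k in range(600)]
tol = 0.05

early = seq[:100]
print(lack_of_progress_converged(early, tol))  # small steps, yet far from the limit
print(aitken_converged(early, tol))
print(aitken_limit(early))
```

On a slowly converging sequence like this one, the step size falls below the tolerance long before the log-likelihood is near its limit, so a lack-of-progress rule can stop early, whereas the Aitken-based rule keeps iterating until the estimated limit is close.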

Authors

McNicholas PD; Murphy TB; McDaid AF; Frost D

Journal

Computational Statistics & Data Analysis, Vol. 54, No. 3, pp. 711–723

Publisher

Elsevier

Publication Date

March 1, 2010

DOI

10.1016/j.csda.2009.02.011

ISSN

0167-9473
