A Pseudo-EM Algorithm for Clustering Incomplete Longitudinal Data

Abstract

A method for clustering incomplete longitudinal data, and gene expression time course data in particular, is presented. Specifically, an existing method that utilizes mixtures of multivariate Gaussian distributions with modified Cholesky-decomposed covariance structure is extended to accommodate incomplete data. Parameter estimation is carried out in a fashion that is similar to an expectation-maximization algorithm. We focus on the particular application of clustering incomplete gene expression time course data. In this application, our approach gives good clustering performance when compared to the results when there is no missing data. Possible extensions of this work are also suggested.
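To illustrate the modified Cholesky-decomposed covariance structure mentioned in the abstract, the sketch below factors a covariance matrix Σ so that T Σ Tᵀ = D, with T unit lower triangular and D diagonal. It is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names (`modified_cholesky`, `matmul`, `transpose`) and the example matrix are hypothetical, and the code covers neither the mixture model, the handling of missing values, nor the pseudo-EM updates.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def modified_cholesky(sigma):
    """Return (T, d) such that T * sigma * T^T = diag(d).

    T is unit lower triangular and d holds the diagonal entries of D.
    sigma is assumed to be symmetric positive definite.
    """
    p = len(sigma)
    # Step 1: LDL^T factorization, sigma = L diag(d) L^T, L unit lower triangular.
    L = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    d = [0.0] * p
    for j in range(p):
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] <= 0.0:
            raise ValueError("sigma must be symmetric positive definite")
        for i in range(j + 1, p):
            s = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - s) / d[j]
    # Step 2: T = L^{-1}, computed column by column via forward substitution.
    T = [[0.0] * p for _ in range(p)]
    for c in range(p):
        for r in range(p):
            identity_rc = 1.0 if r == c else 0.0
            T[r][c] = identity_rc - sum(L[r][k] * T[k][c] for k in range(r))
    return T, d


# Hypothetical example: a 3 x 3 covariance matrix for three time points.
sigma = [
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.50],
    [0.25, 0.50, 1.00],
]
T, d = modified_cholesky(sigma)
check = matmul(matmul(T, sigma), transpose(T))  # should be close to diag(d)
```

For longitudinal data, the below-diagonal entries of T can be read as negated autoregressive coefficients relating each time point to earlier ones, and the entries of D as the corresponding innovation variances, which is one reason this decomposition suits time course data.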

Authors

Shaikh M; McNicholas PD; Desmond AF

Journal

The International Journal of Biostatistics, Vol. 6, No. 1, Article 8

Publisher

De Gruyter

Publication Date

April 13, 2010

DOI

10.2202/1557-4679.1223

ISSN

2194-573X

Associated Experts

Paul McNicholas

Professor, Faculty of Science
