Time-aligned Latent Dirichlet Allocation for Longitudinal Microbiome Data
Overview
Abstract
Microbial data are dynamic, driven by interactions among taxa and with experimental factors. These inter-dependencies have led to an increased emphasis on longitudinal studies of microbial data. Emphasizing the temporal dynamics of microbial communities, rather than of individual taxa, provides valuable insights into the functionality of taxa. For identifying microbial communities, the probabilistic Latent Dirichlet Allocation (LDA) topic model has gained popularity (Sankaran & Holmes, 2019). This model is well suited to multivariate, high-dimensional, and sparse data, and it accommodates mixed membership in clusters. This thesis introduces time-aligned LDA, an extension of LDA for longitudinal microbiome data. Drawing inspiration from the work of Wang et al. (2021), our study aims to capture and analyze temporal changes in microbial communities. The proposed time-aligned LDA method was applied to gut microbial specimens obtained, during pregnancy and at delivery, from pregnant women enrolled in the Be Healthy in Pregnancy (BHIP) study. We then compared it with the traditional LDA approach using the hold-out specimen technique. Using time-aligned LDA alongside a mixed model, our findings indicate no discernible changes in microbial communities between treatment groups. Notably, time-aligned LDA exhibited enhanced sensitivity, identifying a greater number of microbial communities with significant temporal dynamics than the standard LDA.