Defying the Circadian Rhythm: Clustering Participant Telemetry in the UK Biobank Data
Abstract
The UK Biobank dataset follows over 500,000 volunteers and contains a diverse
set of information related to societal outcomes. Among this vast collection, a
large quantity of telemetry collected from wrist-worn accelerometers provides a
snapshot of participant activity. Using this data, we analyse a population of
shift workers, who are subject to disrupted circadian rhythms, with a mixture
model-based approach to reveal the protective effects of physical activity on
survival outcomes. In this paper, we develop a scalable, standardized, and
unique methodology that efficiently clusters a vast quantity of participant
telemetry. Building upon the work of Doherty et al. (2017), we introduce a
standardized, low-dimensional feature for clustering purposes. Participants are
clustered using a matrix variate mixture model-based approach. Once clustered,
survival analysis is performed to demonstrate distinct lifetime outcomes for
individuals within each cluster. In summary, we process, cluster, and analyse a
subset of UK Biobank participants to show the protective effects of physical
activity in circadian-disrupted individuals.