Detecting British Columbia Coastal Rainfall Patterns by Clustering Gaussian Processes
Abstract
Functional data analysis is a statistical framework where data are assumed to
follow some functional form. This method of analysis is commonly applied to
time series data, where time, measured continuously or in discrete intervals,
serves as the location for a function's value. Gaussian processes are a
generalization of the multivariate normal distribution to function space and,
in this paper, they are used to shed light on coastal rainfall patterns in
British Columbia (BC). Specifically, this work addresses the question of how
one should carry out an exploratory cluster analysis of the BC coastal rainfall
data, or of any similar data. An approach is developed for clustering
multiple processes observed on a comparable interval, based on the similarity
of their underlying covariance kernels. This approach provides interesting
insights into the BC data, and these insights can be framed in terms of El
Ni\~{n}o and La Ni\~{n}a; however, the result is not simply one cluster
representing El Ni\~{n}o years and another for La Ni\~{n}a years. From one
perspective, the results show that clustering annual rainfall can potentially
be used to identify extreme weather patterns.