RAND Statistics Seminar Series

Clustering Functional Data: Methods and Applications

Presented by Catherine Sugar, USC Marshall School of Business
Thursday, December 14, 2006
RAND Corporation, Santa Monica, CA

Methods for functional data analysis (FDA), in which measurements for subjects consist of curves or trajectories rather than finite-dimensional points, are becoming increasingly important in many fields. In this talk I present a flexible model-based procedure for clustering functional data. The techniques can be applied to all types of curve data but are particularly useful when individuals are observed at a sparse set of time points. In addition to producing final cluster assignments, the procedure generates predictions and confidence intervals for missing portions of curves. The resulting models can be assessed visually via low-dimensional representations of the curves, and the regions of greatest separation between clusters can be determined using a discriminant function. The basic model can be extended to handle multiple functional and finite-dimensional covariates and can be applied to standard finite-dimensional clustering problems involving missing data.

I will illustrate the techniques using a variety of applications to medical and business problems.
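
As a rough illustration of the general idea of clustering sparsely observed curves, the Python sketch below fits each curve with a small ridge-regularized polynomial basis, evaluates the fit on a common time grid (which also gives a crude prediction for the unobserved portions), and groups the fitted curves with k-means. This is a simplified filter-then-cluster stand-in, not the model-based procedure presented in the talk, and it produces no confidence intervals. The toy data, the polynomial basis, the ridge penalty, and all function names are assumptions made for this example.

```python
import numpy as np

def basis(t, degree=2):
    """Polynomial basis on [0, 1]; a simple stand-in for a spline basis."""
    return np.vander(np.asarray(t, dtype=float), degree + 1, increasing=True)

def fit_curve(times, values, grid, degree=2, ridge=1e-2):
    """Ridge least-squares fit of one sparse curve, evaluated on a common grid."""
    B = basis(times, degree)
    coef = np.linalg.solve(B.T @ B + ridge * np.eye(degree + 1),
                           B.T @ np.asarray(values, dtype=float))
    return basis(grid, degree) @ coef

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's algorithm with farthest-point initialization."""
    centers = [X[0].copy()]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen center.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)].copy())
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def simulate(n_per_group=20, n_obs=5, noise=0.05, seed=0):
    """Hypothetical toy data: two groups of curves, each seen at a few random times."""
    rng = np.random.default_rng(seed)
    curves, truth = [], []
    for g, f in enumerate([lambda t: 1.0 + t, lambda t: -1.0 - t]):
        for _ in range(n_per_group):
            t = np.sort(rng.uniform(0.0, 1.0, size=n_obs))
            y = f(t) + rng.normal(0.0, noise, size=n_obs)
            curves.append((t, y))
            truth.append(g)
    return curves, np.array(truth)

def cluster_sparse_curves(curves, k=2, grid_size=25):
    """Fit every curve on a shared grid, then cluster the fitted curves."""
    grid = np.linspace(0.0, 1.0, grid_size)
    fitted = np.array([fit_curve(t, y, grid) for t, y in curves])
    labels, _ = kmeans(fitted, k)
    return labels, fitted

curves, truth = simulate()
labels, fitted = cluster_sparse_curves(curves)
# Cluster labels are arbitrary, so score against both labelings.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with simulated groups: {agreement:.2f}")
```

Fitting each curve separately, as this sketch does, can become unstable when a curve has very few observations. A model-based approach of the kind described in the abstract instead estimates the curves and the clusters jointly, so that each curve can borrow strength from the others.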

