Date of Award

6-2022

Degree Name

Doctor of Philosophy

Department

Statistics

First Advisor

Hyun Bin Kang, Ph.D.

Second Advisor

Kevin H Lee, Ph.D.

Third Advisor

Joshua Naranjo, Ph.D.

Fourth Advisor

Sangwoo Lee, Ph.D.

Keywords

Alcoholism, EEG, expectation-maximization, functional data analysis, functional glasso, mixture of functional graphical models

Abstract

With the development of data collection technologies that use powerful monitoring devices and computational tools, many scientific fields now obtain more detailed and more complexly structured data, e.g., functional data. This makes it increasingly challenging to extract information from large, complex data sets. Making use of these data to gain insight into complex phenomena requires characterizing the relationships among a large number of functional variables. Functional data analysis (FDA) is a rapidly developing area of statistics for data that can be naturally viewed as smooth curves or functions. It changes the frame of the data, so that the fundamental statistical unit is a function or curve rather than a vector of measurements. Graphical models have been widely used to explicitly capture the statistical relationships between the variables of interest in the form of a graph. The central task in these models is to infer significant conditional dependencies or independencies from high-dimensional data. In the current literature, it is common to assume that the high-dimensional data come from a homogeneous source and follow a parametric or semi-parametric graphical model. However, in real-world contexts the observed data often come from different sources and may have heterogeneous dependencies across the whole population. In such cases, a single functional graphical model is no longer adequate for the data. Because finite mixture models offer powerful statistical techniques for identifying subpopulations with certain commonalities within an overall population drawn from heterogeneous sources, one solution to this issue is to apply mixture analysis techniques to functional graphical models.

As part of this effort, a functional graphical model is developed to extract the conditional dependence structure among random functions. In this dissertation, we propose the mixture of functional graphical models (MFGM), which detects heterogeneous subgroups of the population and estimates the conditional dependencies in each subgroup. We also introduce an estimation method for MFGM based on an iterative Expectation-Maximization (EM) algorithm.

The performance of our algorithm, in terms of overall clustering accuracy and the accuracy of the estimated conditional dependence structures in the heterogeneous subgroups, is demonstrated through simulation studies. Our MFGM algorithm outperforms two potential competing algorithms: the alternating direction method of multipliers (ADMM) algorithm provided by the R fgm package, which assumes partial separability in its functional Gaussian graphical models, and the mixggm algorithm, which implements a mixture of Gaussian graphical models in the multivariate vector setting. An application to a high-dimensional electroencephalography (EEG) dataset taken from an alcoholism study is also discussed. The results of the real data analysis corroborate the performance of our MFGM algorithm.

Our work, motivated by the problem of inferring heterogeneous conditional dependencies in high-dimensional data, may greatly extend the methodology and applicability of high-dimensional graphical models and provide a novel strategy for complex functional data analysis.

Access Setting

Dissertation-Open Access
