Date of Award

4-2025

Degree Name

Doctor of Philosophy

Department

Statistics

First Advisor

Kevin H. Lee, Ph.D.

Second Advisor

Hyun Bin Kang, Ph.D.

Third Advisor

Yingying Zhang, Ph.D.

Fourth Advisor

Geumchan Hwang, Ph.D.

Abstract

Statistical applications in fields such as bioinformatics, genomics, speech processing, image processing, and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. This formalism offers a framework for capturing complex dependencies among random variables and for building large-scale models of high-dimensional data. High-dimensional data are often assumed to come from a single population and to follow a parametric or semiparametric graphical model. In real-world applications, however, the observed data more commonly come from multiple subpopulations and therefore exhibit heterogeneous dependencies across the overall population. Hence, it is crucial to detect such subpopulations within an overall population and to estimate the heterogeneous dependencies in their graph structures. Although there is a rich literature on modeling continuous data with Gaussian graphical models, little has been done on modeling discrete data, especially binary-valued data with Ising graphical models. Furthermore, most existing work on graph structure estimation assumes that the relational structure is fixed over time, which may rarely be the case. This study is therefore concerned with estimating time-varying graphical structures. In particular, the focus is on learning a nonparametric finite mixture of Ising graphical models in order to estimate functional parameters and to recover the heterogeneous functional edge structures of the graphs, with the aim of exploring the heterogeneous dependencies in high-dimensional data. The performance of the proposed method is demonstrated through simulation studies and a real-world application.

Access Setting

Dissertation-Open Access
