Advancing the Use of Machine Learning Tools for Genome Data in Predicting Bipolar Disorder

Date of Award


Degree Name

Doctor of Philosophy

Department


Industrial and Entrepreneurial Engineering and Engineering Management

First Advisor

Dr. Timothy Greene

Second Advisor

Dr. Steven Butt

Third Advisor

Dr. James Burns

Fourth Advisor

Dr. Mark Kanzawa

Keywords


Machine learning tools, gene expression, bipolar disorder, diagnostic biomarker, messenger RNA, synthetic dataset

Abstract


In healthcare, machine learning models are increasingly incorporated into diagnostic decision-making through clinical decision support models that use genomic datasets. These datasets capture wide-scale genomic variation that can be used to predict bipolar disorder. Although integrating genomic data into machine learning has created opportunities to predict bipolar disorder accurately, the accuracy of machine learning models depends critically on the availability of high-quality training datasets with sufficiently large sample sizes. A reasonable amount of research has addressed the prediction of bipolar disorder, but the available genome sample sizes are small. In addition, most single nucleotide polymorphisms (SNPs) discovered via genome-wide association studies (GWAS) have small effects on disorders and thus may not be suitable for improving health care through genetic testing. Three challenges identified from an extensive literature review, together with their proposed solutions, are presented as Research Efforts I, II, and III. These research efforts focus on a) generating a synthetic dataset that can be used to train machine learning models without the need for the real dataset, b) developing a model that selects genes related to bipolar disorder in order to precisely classify bipolar disorder patients, and evaluating the performance of different modeling approaches for predicting bipolar disorder, and c) developing models that precisely classify bipolar disorder patients. The goal of this study is to overcome these challenges by developing an ensemble model for clinical decision support.


Fifth Advisor

Dr. Will Fitzgerald

Access Setting

Dissertation-Abstract Only

Restricted to Campus until

