Date of Award


Degree Name

Doctor of Philosophy


Computer Science

First Advisor

Dr. Steven Carr

Second Advisor

Dr. Alvis Fong

Third Advisor

Dr. Elise deDoncker

Fourth Advisor

Dr. Zijiang James Yang


Predicting, locality, patterns, source code, machine learning


On modern computer systems, the performance of an application depends on its locality. Most existing locality measurements based on compiler static analysis target regular array references in loop nests. Such measurements have limited applicability when the loop bounds are unknown at compile time, when the control flow is dynamic, or when index arrays or pointer operations are used. In addition, compiler static analysis cannot adapt to changes in program input.

Training-based locality analysis predicts how data reuse changes across program inputs in order to provide run-time information. This analysis quantifies the number of unique memory locations accessed between two references to the same location (reuse distance) as a function of input-data size. Training-based locality analysis can predict the locality of an application from the size of its input data; however, it is costly in time and space because it requires instrumenting several runs of each application.
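To make the reuse-distance definition above concrete, the following is a minimal Python sketch that computes it for a short access trace. The function name `reuse_distances`, the trace format (a list of location labels), and the use of `None` for first-time accesses are assumptions made for illustration; they are not taken from the dissertation, and the quadratic scan is chosen for clarity rather than efficiency.

```python
def reuse_distances(trace):
    """Return the reuse distance of every access in `trace`.

    The reuse distance of an access is the number of unique locations
    touched strictly between it and the previous access to the same
    location. A first-time access has no previous reference, so it is
    reported as None (an illustrative convention, not the author's).
    """
    last_seen = {}   # location -> index of its most recent access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Unique locations accessed between the two references.
            between = set(trace[last_seen[addr] + 1 : i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances


if __name__ == "__main__":
    # Hypothetical trace of memory locations, labelled by name.
    trace = ["a", "b", "c", "a", "b", "b"]
    print(reuse_distances(trace))  # [None, None, None, 2, 2, 0]
```

In this trace the second access to `a` has distance 2 because `b` and `c` are touched in between, while the final access to `b` has distance 0 because it immediately follows the previous one. Repeating such a measurement over traces from inputs of different sizes gives the kind of data from which reuse distance can be expressed as a function of input size.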

In this dissertation, we study the prediction of memory usage by an application using machine learning. We combine source-code analysis with training-based locality analysis to construct a supervised learning model parameterized only by source-code properties. This model is the first to predict the upper bound of data reuse change (locality-pattern complexity) at compile time for loop nests in programs, without the need to instrument and run the program. We find stable behavior in the locality pattern (locality phase) corresponding to loops in the programs. We also introduce a phase detection approach, which builds procedure and loop trees to represent the source code of the program. The phase marker algorithm searches the annotated tree to identify and mark the start of each unique stable behavior in locality (phase-transition points). Our evaluation of the model on small and large programs shows that it efficiently predicts how memory usage grows as a function of the input size. We also evaluate our phase marker and find that it identifies the locality phase-transition points accurately.

Our work represents a significant step toward an accurate static memory usage predictor that provides dynamic memory allocation for Virtual Machines (VMs) in cloud data centers, in order to increase virtual memory resource utilization and guarantee end-user Quality of Service (QoS).


Fifth advisor: Dr. Steven Durbin

Access Setting

Dissertation-Open Access