Date of Award
Doctor of Philosophy
Dr. Ala Al-Fuqaha
Dr. Li Yang
Dr. Bilal Khan
Dr. Ammar Rayes
Network fault management has been a vibrant research area in computer networks because of the immediate benefits it can deliver to network operators and service providers. Unfortunately, most existing fault management systems target a specific domain. Ideally, the collateral damage caused by network failures could be mitigated if those failures were predicted before they occurred. Homeostasis could then be maintained more easily by taking corrective measures to avoid imminent failures.
Here, we set out to design an automated system capable of predicting imminent system failures from the recent trajectory of measurable system characteristics. Our approach to anomalous behavior mining and fault prediction is based on extracting syntactic and descriptive information about frequent behaviors that precede a failure. Our main goal is to identify general behavior characterizations for each parameter and to find the frequency of each. In the proposed approach, a parameter's behavior, represented as a time series signal, is divided into a number of segments, each of which is characterized by its local trend only. That is, within each segment we find the extreme fluctuation values, i.e., the minimum and maximum values, which together are called a Crest-Trough (C-T) pair. A sequence of C-T pairs is produced for each segment, and we develop algorithms to find the frequency of each C-T sequence in optimal time and space complexity. The extraction is carried out at multiple timescales to determine trends within and across multiple network parameters. Given the collected general forms of behavior, another behavior is considered a match if it has the same major trends in each corresponding segment and those trends occur in the same sequential order.
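The segmentation and counting steps described above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes fixed-length segments, one C-T pair per segment, and a simple quantization of values into bins so that similar trends compare equal. The names `ct_pairs`, `ct_sequence_counts`, and `multiscale_counts`, and the parameters `seg_len`, `step`, and `k`, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Assumption: a C-T pair is stored as (quantized crest, quantized trough).
CTPair = Tuple[int, int]


def ct_pairs(signal: Sequence[float], seg_len: int, step: float) -> List[CTPair]:
    """Split the signal into fixed-length segments and keep only each
    segment's maximum (crest) and minimum (trough), quantized into bins
    of width `step`. A trailing partial segment is ignored."""
    pairs: List[CTPair] = []
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        seg = signal[start:start + seg_len]
        pairs.append((int(max(seg) // step), int(min(seg) // step)))
    return pairs


def ct_sequence_counts(pairs: Sequence[CTPair], k: int) -> Counter:
    """Count how often each run of k consecutive C-T pairs occurs."""
    return Counter(tuple(pairs[i:i + k]) for i in range(len(pairs) - k + 1))


def multiscale_counts(
    signal: Sequence[float], seg_lens: Sequence[int], step: float, k: int
) -> Dict[int, Counter]:
    """Repeat the extraction at several segment lengths (timescales)."""
    return {n: ct_sequence_counts(ct_pairs(signal, n, step), k) for n in seg_lens}


# Hypothetical parameter trace that oscillates with a repeating pattern.
signal = [1, 5, 2, 6, 1, 5, 2, 6, 1, 5, 2, 6]
counts = multiscale_counts(signal, seg_lens=[2, 4], step=1.0, k=2)
```

Under these assumptions, two behaviors match when they produce the same sequence of quantized C-T pairs in the same order. Each timescale is processed with work proportional to the number of segments times `k`, which is one simple way to keep the cost low on streaming data.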
Our approach offers several advantages over prior approaches. By operating at multiple timescales simultaneously, the new system achieves robustness against unreliable, redundant, incomplete, and contradictory information. The algorithms employed have low time complexity, making the system scalable and feasible in real-time environments. Anomalous behaviors identified by the system can be stored efficiently with low space complexity, so the system can operate with minimal resource requirements even when processing high-rate streams of network parameter values.
Restricted to Campus until
Abed, Jondi Hesham, "Developing an Efficient Failure Prediction Model in Support of a Distributed Operating System for Autonomic Networks" (2015). Dissertations. 731.