Date of Award


Degree Name

Doctor of Philosophy


Department

Computer Science

First Advisor

Dr. Dionysios I. Kountanis

Second Advisor

Dr. Ala Al-Fuqaha

Third Advisor

Dr. Leszek Lilien

Fourth Advisor

Dr. Liang Dong


Keywords

reinforcement learning, multiple-agent systems, cognitive radio networks, Markov games


Abstract

The objective of reinforcement learning in multiple-agent systems is to find an efficient learning method that lets the agents behave optimally. Finding a Nash equilibrium has become the common learning target for optimality. However, finding a Nash equilibrium is a PPAD (Polynomial Parity Arguments on Directed graphs)-complete problem, and conventional methods can find a Nash equilibrium only for some special types of Markov games.

This dissertation proposes a new reinforcement learning algorithm that improves search efficiency and effectiveness in multiple-agent systems. The algorithm is based on the definition of Nash equilibrium and exploits the greedy and rational nature of the agents. When the agents adjust their behavior strategies according to certain rules driven by feedback, the strategies display special patterns that are closely related to the Nash equilibrium. Using the properties of these patterns, the agents can find their Nash equilibrium strategies even though no agent has information about the other agents.
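The Nash equilibrium condition that the algorithm builds on can be illustrated in a small two-player matrix game. The following is only a minimal sketch of the definition itself (no agent can gain by deviating unilaterally), not the learning algorithm proposed in the dissertation. The function name `pure_nash_equilibria` and the prisoner's dilemma payoffs are illustrative assumptions, and the check covers pure strategies only.

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all pure-strategy Nash equilibria (row, col) of a two-player matrix game.

    payoff_a[r][c] is the row player's payoff and payoff_b[r][c] is the
    column player's payoff when the row player picks r and the column player picks c.
    """
    n_rows = len(payoff_a)
    n_cols = len(payoff_a[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # The row player cannot gain by switching rows while the column stays fixed.
        row_ok = all(payoff_a[r][c] >= payoff_a[r2][c] for r2 in range(n_rows))
        # The column player cannot gain by switching columns while the row stays fixed.
        col_ok = all(payoff_b[r][c] >= payoff_b[r][c2] for c2 in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical prisoner's dilemma payoffs (action 0 = cooperate, 1 = defect).
PD_A = [[-1, -3], [0, -2]]
PD_B = [[-1, 0], [-3, -2]]

print(pure_nash_equilibria(PD_A, PD_B))
```

This exhaustive check needs both payoff matrices, which is exactly the information the agents in the dissertation's setting are assumed not to have. It is meant only to make the target condition concrete.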

The new reinforcement learning algorithm can be applied in many areas, as long as the target problem can be mapped to a Markov game. We apply the learning algorithm to solve the spectrum sharing problem in cognitive radio networks.

This research makes several contributions. First, the proposed reinforcement learning algorithm for multiple-agent systems does not require the agents to have information about other agents. Second, the algorithm is more efficient than other similar learning algorithms. Third, the algorithm effectively finds a Nash equilibrium. Fourth, the algorithm attempts to address the learning problem in multiple-state Markov games and is expected to extend to scalable Markov games. Finally, the algorithm can be applied in many areas: we have applied it to find a solution to the spectrum sharing problem in the cognitive radio network model presented in the dissertation.

Access Setting

Dissertation-Open Access