Publication Data
Theoretical Advantages of Lenient Learners in Multiagent Systems
Abstract: This paper analyzes the dynamics of multiple reinforcement
learning agents from an Evolutionary Game Theoretic (EGT) perspective. We provide a
Replicator Dynamics model for traditional multiagent Q-learning, and we then extend
these differential equations to account for lenient learners: agents that forgive
possible mistakes of their teammates that resulted in lower rewards. We use this
extended formal model to visualize the basins of attraction of both traditional and
lenient multiagent Q-learners in two benchmark coordination problems. The results
indicate that lenience provides learners with more accurate estimates of the utility
of their actions, resulting in a higher likelihood of convergence to the globally optimal
solution. In addition, our research supports the strength of EGT as a backbone for
multiagent reinforcement learning.
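
The idea of lenience under replicator dynamics can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: lenience is modeled as each agent scoring an action by the expected maximum of n rewards, with teammate actions drawn independently from the teammate's current policy; the plain two-population replicator equation is used, without the exploration terms a Q-learning model would carry; and the Climb coordination game serves as a stand-in benchmark. The names lenient_utilities and simulate are hypothetical.

```python
import numpy as np

# Climb coordination game (assumed benchmark; both agents receive the same payoff).
CLIMB = np.array([[11.0, -30.0, 0.0],
                  [-30.0, 7.0, 6.0],
                  [0.0, 0.0, 5.0]])


def lenient_utilities(payoff, teammate, n):
    """Expected maximum of n rewards per action (assumed lenience model).

    Teammate actions are drawn i.i.d. from `teammate`. With n = 1 this
    reduces to the ordinary expected payoff, payoff @ teammate.
    """
    utils = np.empty(payoff.shape[0])
    for i, row in enumerate(payoff):
        order = np.argsort(row)                   # rewards in ascending order
        cdf_n = np.cumsum(teammate[order]) ** n   # P(max of n draws <= reward)
        prob_max = np.diff(cdf_n, prepend=0.0)    # P(max of n draws == reward)
        utils[i] = row[order] @ prob_max
    return utils


def simulate(payoff, x, y, n, dt=0.01, steps=2000):
    """Euler integration of two-population replicator dynamics."""
    x, y = x.copy(), y.copy()
    for _ in range(steps):
        ux = lenient_utilities(payoff, y, n)      # row agent's utilities
        uy = lenient_utilities(payoff.T, x, n)    # column agent's utilities
        x = x + dt * x * (ux - x @ ux)
        y = y + dt * y * (uy - y @ uy)
        # Guard against numerical drift off the probability simplex.
        x = np.clip(x, 0.0, None)
        x /= x.sum()
        y = np.clip(y, 0.0, None)
        y /= y.sum()
    return x, y


uniform = np.ones(3) / 3
x_plain, y_plain = simulate(CLIMB, uniform, uniform, n=1)   # no lenience
x_len, y_len = simulate(CLIMB, uniform, uniform, n=10)      # lenient agents
print("n=1: ", x_plain.round(3), y_plain.round(3))
print("n=10:", x_len.round(3), y_len.round(3))
```

Under these assumptions, at a uniform starting policy the non-lenient utilities (n = 1) rank the safe third action highest, because the -30 miscoordination penalties drag down the averages of the other two actions. With n = 10 the first action, which contains the optimal joint payoff of 11, ranks highest. Sweeping the starting policies over a grid and recording where each run ends would give a rough picture of the basins of attraction.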
