Abstract
Neural Population Learning (NeuPL) represents diverse strategies in symmetric zero-sum games using a shared conditional neural network. We propose NeuPL-JPSRO, which extends this idea to n-player general-sum games and leverages the convergence guarantees of Joint Policy-Space Response Oracles (JPSRO). We show empirically that NeuPL-JPSRO converges to a Coarse Correlated Equilibrium (CCE) in several OpenSpiel games, as verified by analytical game solvers. We then deploy NeuPL-JPSRO to complex domains where JPSRO with independent RL becomes computationally impractical. We demonstrate how our approach enables adaptive coordination with co-players and transfer learning of skills in larger games. Our work shows that equilibrium-convergent population learning can be implemented at scale and in generality, paving the way towards solving real-world games between heterogeneous players with mixed motives.
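The core representational idea, a single shared network that realises many strategies by conditioning on a strategy identity, can be illustrated with a minimal sketch. This is an assumption-laden toy (random, untrained weights; one-hot conditioning; all dimension names are hypothetical), not the paper's architecture:

```python
import numpy as np

# Toy sketch: one set of shared weights serves every strategy in the
# population; the strategy identity enters as a one-hot vector concatenated
# to the observation. All sizes below are illustrative assumptions.
rng = np.random.default_rng(0)

OBS_DIM, N_STRATEGIES, N_ACTIONS, HIDDEN = 4, 3, 2, 8

# Shared parameters (random initialisation, not trained).
W1 = rng.normal(size=(OBS_DIM + N_STRATEGIES, HIDDEN)) * 0.1
W2 = rng.normal(size=(HIDDEN, N_ACTIONS)) * 0.1

def conditional_policy(obs, strategy_id):
    """Action probabilities for strategy `strategy_id` given observation `obs`."""
    one_hot = np.eye(N_STRATEGIES)[strategy_id]
    x = np.concatenate([obs, one_hot])      # condition on strategy identity
    h = np.tanh(x @ W1)                     # shared hidden layer
    logits = h @ W2
    e = np.exp(logits - logits.max())       # stable softmax
    return e / e.sum()

obs = rng.normal(size=OBS_DIM)
# Same observation, same weights, different strategy ids -> different policies.
probs = [conditional_policy(obs, k) for k in range(N_STRATEGIES)]
```

In this form, adding a strategy to the population costs no new parameters, which is one intuition behind scaling population learning with a shared conditional network.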
Authors
Siqi Liu, Luke Marris, Marc Lanctot, Georgios Piliouras, Joel Leibo, Nicolas Heess
Venue
AAMAS 2024