Near-Minimax-Optimal Distributional RL with a Generative Model

Published: 12 February 2024

Abstract

We propose a new algorithm for model-based distributional reinforcement learning (RL), and prove that it is minimax-optimal for approximating return distributions with a generative model (up to logarithmic factors), resolving an open question of Zhang et al. (2023). Our analysis provides new theoretical results on categorical approaches to distributional RL, and also introduces a new distributional Bellman equation, the stochastic categorical CDF Bellman equation, which we expect to be of independent interest. We also provide an experimental study comparing several model-based distributional RL algorithms, with several takeaways for practitioners.

Authors

Mark Rowland, Kevin Li, Remi Munos, Clare Lyle, Yunhao Tang, Will Dabney

Venue

arXiv
