Exploration at Scale using Epistemic Neural Networks

Published: 1 February 2024

Abstract

We present evidence of substantial benefit to efficient exploration in gathering human feedback to improve large language models. In our experiments, an agent sequentially generates queries while fitting a reward model to the feedback received. Our best-performing agent generates queries using double Thompson sampling, with uncertainty represented by an epistemic neural network. Our results demonstrate that efficient exploration enables high levels of performance with far fewer queries. Further, both uncertainty estimation and the choice of exploration scheme play critical roles.

Authors

Vikranth Dwaracherla, Seyed Mohammad Asghari, Botao Hao, Benjamin Van Roy

Venue

arXiv

Gemini

Gemma

Generative models

Experiments

Projects

Publications

News

AI for biology

AI for climate and sustainability

AI for mathematics and computer science

AI for physics and chemistry

AI transparency

News

Careers

Milestones

Education

Responsibility

The Podcast

Exploration at Scale using Epistemic Neural Networks

Abstract

Authors

Venue

Exploration at Scale using Epistemic Neural Networks

Share

Abstract

Authors

Venue