Abstract
We revisit dynamic evaluation, the idea of adapting the parameters of a language model online, via gradient descent, on a given sequence of test tokens. While it is generally known that adapting the parameters at test time improves overall predictive performance, we pay particular attention to the speed of adaptation (in terms of sample efficiency) and to the computational overhead of gradient computation and parameter updates.
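To make the idea concrete, the following is a minimal sketch of dynamic evaluation, not the authors' implementation: the model takes a gradient step on each chunk of test tokens after scoring it, so later tokens are predicted with parameters already adapted to the earlier part of the sequence. The function name, optimizer choice, chunk size, and learning rate are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of dynamic evaluation (illustrative; not the paper's setup).
import torch
import torch.nn.functional as F


def dynamic_evaluation(model, token_ids, chunk_size=32, lr=1e-4):
    """Return the average log-loss on `token_ids` while adapting `model` online.

    `token_ids` is a 1-D LongTensor of test tokens; `model` maps a (1, T)
    batch of token ids to (1, T, vocab) logits.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss, total_tokens = 0.0, 0

    for start in range(0, token_ids.numel() - 1, chunk_size):
        chunk = token_ids[start:start + chunk_size + 1]  # inputs plus next-token targets
        inputs, targets = chunk[:-1].unsqueeze(0), chunk[1:].unsqueeze(0)

        logits = model(inputs)
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))

        # Accumulate the loss *before* the update: each chunk is scored with
        # parameters adapted only on the preceding chunks.
        total_loss += loss.item() * targets.numel()
        total_tokens += targets.numel()

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()  # online parameter update on the test sequence

    return total_loss / max(total_tokens, 1)
```

The chunk-wise update is one common way to trade off adaptation speed against the per-token cost of gradient computation; smaller chunks adapt faster but incur more frequent updates.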
Authors
Amal Rannen-Triki, Jörg Bornschein, Alexandre Galashov, Razvan Pascanu, Michalis Titsias, Marcus Hutter, Andras Gyorgy, Yee Whye Teh
Venue
Workshop on Distribution Shifts (DistShift), NeurIPS 2023