Gemma Scope

A set of interpretability tools built to help researchers understand the inner workings of Gemma 2.

Examine the behavior of individual model layers while the model processes requests, to help address critical concerns including hallucinations, biases, and manipulation.


Capabilities

Gemma Scope provides researchers with a suite of sparse autoencoders. Think of these as microscopes that let you zoom in on dense, compressed activations and expand them into larger, sparser, more interpretable forms.
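
To make the microscope analogy concrete, here is a minimal sketch of what a sparse autoencoder does to one activation vector. It is not Gemma Scope's code or API. The sizes, the random weights, the threshold value, and the names encode and decode are all assumptions chosen for illustration. The thresholded gate is written in the style of a JumpReLU activation. A real sparse autoencoder would use trained weights, so that the reconstruction stays close to the input.

```python
import random


def encode(x, w_enc, b_enc, threshold):
    """Expand a dense activation into a wider, mostly-zero feature vector."""
    features = []
    for j in range(len(b_enc)):
        pre = sum(x[i] * w_enc[i][j] for i in range(len(x))) + b_enc[j]
        # Thresholded gate: keep the value only if it clears the threshold.
        features.append(pre if pre > threshold[j] else 0.0)
    return features


def decode(features, w_dec, b_dec):
    """Map the sparse features back to the original activation width."""
    return [
        sum(features[j] * w_dec[j][i] for j in range(len(features))) + b_dec[i]
        for i in range(len(b_dec))
    ]


# Hypothetical toy sizes: an 8-dimensional activation expanded to 32 features.
rng = random.Random(0)
d_model, d_sae = 8, 32

# Random, untrained weights, for illustration only.
w_enc = [[rng.gauss(0, 1) for _ in range(d_sae)] for _ in range(d_model)]
w_dec = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(d_sae)]
b_enc = [0.0] * d_sae
b_dec = [0.0] * d_model
threshold = [1.0] * d_sae  # assumed value; learned per feature in practice

x = [rng.gauss(0, 1) for _ in range(d_model)]  # stand-in for a layer activation
features = encode(x, w_enc, b_enc, threshold)
reconstruction = decode(features, w_dec, b_dec)

active = sum(1 for f in features if f != 0.0)
print(f"{active} of {d_sae} features are active")
```

The feature vector is wider than the input, but only some entries are nonzero. In a trained sparse autoencoder, those few active features are the ones a researcher would inspect.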

Perform mechanistic interpretability research

Evaluate the precise behavior of Gemma 2 models with layer-level analysis.

Debug model behavior

Pinpoint the source of specific model issues (such as biases and hallucinations) by examining layer-specific representations.


Download Gemma Scope