Gemma 3

A family of lightweight models with multimodal understanding and unparalleled multilingual capabilities for more intelligent applications.

Gemma 3 is the most capable model that can run on a single GPU or TPU. It runs efficiently on workstations, laptops, and even smartphones, so developers can build responsible AI applications at scale.


Capabilities

Handle complex tasks

Gemma 3's 128K-token context window lets your applications process and understand vast amounts of information, enabling more sophisticated AI features.

Multilingual communication

Unparalleled multilingual capabilities let you communicate effortlessly across countries and cultures. Develop applications that reach a global audience, with support for over 140 languages.

Multimodal understanding

Easily build applications that analyze images, text, and video, opening up new possibilities for interactive, intelligent experiences.


Model variants

270M

Compact model designed for both task-specific fine-tuning and strong instruction-following.

1B

Lightweight text model, ideal for small applications.

4B

Balanced for performance and flexibility, with multimodal support.

12B

Strong language capabilities, designed for complex tasks.

27B

Enhanced understanding, great for sophisticated applications.



Gemma Quantization-Aware Training (QAT)

Gemma QAT dramatically reduces memory requirements while maintaining high quality. This lets you run powerful models like Gemma 3 27B locally on consumer-grade GPUs like an NVIDIA RTX 3090.
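
To see why lower-precision weights matter for a card like the RTX 3090, here is a rough back-of-the-envelope sketch. It assumes a nominal 27 billion parameters, counts weight storage only (no KV cache, activations, or runtime overhead), and uses 4-bit weights as an illustrative quantized precision; the text above does not state which precision QAT targets. The function name `weight_memory_gb` is made up for this example.

```python
# Rough weight-memory estimate: parameters * bits per weight / 8 bytes.
# Assumption: weights only; KV cache, activations, and runtime overhead
# are ignored, so real memory use will be higher.

GB = 1e9  # decimal gigabytes


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Return the approximate memory needed for model weights alone, in GB."""
    return num_params * bits_per_weight / 8 / GB


PARAMS_27B = 27e9        # nominal parameter count (assumption)
RTX_3090_VRAM_GB = 24    # an NVIDIA RTX 3090 has 24 GB of VRAM

# Compare a few weight precisions; 4-bit is an assumed quantized format.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    need = weight_memory_gb(PARAMS_27B, bits)
    verdict = "fits" if need < RTX_3090_VRAM_GB else "does not fit"
    print(f"{label}: {need:.1f} GB of weights, {verdict} in {RTX_3090_VRAM_GB} GB")
```

Under these assumptions, 16-bit weights need about 54 GB and 4-bit weights about 13.5 GB, which is the difference between needing multiple accelerators and fitting on one consumer card with room left for the context.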