Gemini Diffusion

Our state-of-the-art, experimental text diffusion model

Large language models are the foundation of generative AI today. We’re using a technique called diffusion to explore a new kind of language model that gives users greater control, creativity, and speed in text generation.

What is a diffusion model?

Traditional autoregressive language models generate text one word – or token – at a time. This sequential process can be slow, and can limit the quality and coherence of the output.

Diffusion models work differently. Instead of predicting text directly, they learn to generate outputs by refining noise, step by step. This means they can iterate on a solution very quickly and error-correct during the generation process. This helps them excel at tasks like editing, including in the context of math and code.
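The refine-from-noise loop described above can be sketched in a few lines. Gemini Diffusion’s actual architecture has not been published, so this is only a toy illustration of one common approach to discrete text diffusion: start from a fully masked (“noised”) sequence and fill in all positions in parallel over several refinement steps. The `denoise` function here is a hypothetical stand-in for the learned denoiser.

```python
import random

MASK = "_"
VOCAB = ["the", "cat", "sat", "on", "mat"]  # toy vocabulary for illustration

def denoise(tokens, step, total_steps, rng):
    """Stand-in denoiser: commit a growing fraction of positions each step.

    A real diffusion model would predict all positions jointly and could
    also revise earlier guesses; here we just fill masked slots randomly.
    """
    keep = int(len(tokens) * (step + 1) / total_steps)
    out = list(tokens)
    for i in range(len(out)):
        if out[i] == MASK and i < keep:
            out[i] = rng.choice(VOCAB)  # real model: pick the predicted token
    return out

def generate(length=5, steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * length  # "pure noise": every position masked
    for step in range(steps):
        tokens = denoise(tokens, step, steps, rng)  # parallel refinement pass
    return tokens

print(generate())  # all positions resolved after the final step
```

Because each pass touches the whole sequence at once, the number of model calls is the number of refinement steps rather than the number of tokens, which is the intuition behind the speed claims below.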

Capabilities

  • Rapid response

    Generates content significantly faster than even our fastest model to date.

  • More coherent text

    Generates entire blocks of tokens at once, meaning it responds more coherently to a user’s prompt than autoregressive models.

  • Iterative refinement

    Corrects errors during generation for more consistent outputs.

Benchmarks

Gemini Diffusion’s external benchmark performance is comparable to that of much larger models, while also being faster.

Benchmark                                 Gemini Diffusion   Gemini 2.0 Flash-Lite
Code          LiveCodeBench (v6)          30.9%              28.5%
Code          BigCodeBench                45.4%              45.8%
Code          LBPP (v2)                   56.8%              56.0%
Code          SWE-Bench Verified*         22.9%              28.5%
Code          HumanEval                   89.6%              90.2%
Code          MBPP                        76.0%              75.8%
Science       GPQA Diamond                40.4%              56.5%
Mathematics   AIME 2025                   23.3%              20.0%
Reasoning     BIG-Bench Extra Hard        15.0%              21.0%
Multilingual  Global MMLU (Lite)          69.1%              79.0%

Gemini Diffusion speed

Sampling speed (excluding overhead): 1,479 tokens/sec
Overhead: 0.84 sec

Try Gemini Diffusion

Gemini Diffusion is currently available as an experimental demo to help develop and refine future models. If you're interested in getting access to the demo, please sign up to the waitlist.