See how Gemini 2.5 Pro Experimental uses its reasoning capabilities to create interactive simulations and do advanced coding.
Make an interactive animation
See how Gemini 2.5 Pro Experimental uses its reasoning capabilities to create an interactive animation of “cosmic fish” with a simple prompt.
Create your own dinosaur game
Watch Gemini 2.5 Pro Experimental create an endless runner game using executable code from a single-line prompt.
Code a fractal visualization
See how Gemini 2.5 Pro Experimental creates a simulation of intricate fractal patterns to explore the Mandelbrot set.
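The fractal demo rests on the classic escape-time algorithm: each point c in the complex plane is iterated as z → z² + c and shaded by how quickly |z| escapes beyond 2. The sketch below is only illustrative of that idea; the resolution, iteration cap, and ASCII shading are assumptions, not code from the video.

```javascript
// Escape-time Mandelbrot sketch (illustrative only; resolution, iteration cap,
// and ASCII shading are assumptions, not code from the demo).
const WIDTH = 80, HEIGHT = 30, MAX_ITER = 60;
const SHADES = " .:-=+*#%@";  // light to dark, indexed by escape speed

const rows = [];
for (let py = 0; py < HEIGHT; py++) {
  let row = "";
  for (let px = 0; px < WIDTH; px++) {
    // Map the character cell to a point c in the complex plane around the set.
    const cr = -2.5 + (px / WIDTH) * 3.5;
    const ci = -1.25 + (py / HEIGHT) * 2.5;
    // Iterate z -> z^2 + c until |z| > 2 or we hit the iteration cap.
    let zr = 0, zi = 0, iter = 0;
    while (zr * zr + zi * zi <= 4 && iter < MAX_ITER) {
      const next = zr * zr - zi * zi + cr;
      zi = 2 * zr * zi + ci;
      zr = next;
      iter++;
    }
    // Points that never escape are in the set; others are shaded by escape speed.
    row += SHADES[Math.floor((iter / MAX_ITER) * (SHADES.length - 1))];
  }
  rows.push(row);
}
console.log(rows.join("\n"));
```

A canvas renderer would map the same escape counts to colors instead of characters to get a zoomable, interactive visualization.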
Plot interactive economic data
Watch Gemini 2.5 Pro Experimental use its reasoning capabilities to create an interactive bubble chart to visualize economic and health indicators over time.
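Underneath the interactivity, a Gapminder-style bubble chart is mostly a matter of mapping each data row to screen coordinates and a bubble size. A rough sketch of that mapping is below; the sample rows are made up and the scale bounds are hypothetical, not the demo's actual data or code.

```javascript
// Core mapping behind a Gapminder-style bubble chart. The rows below are
// made-up sample data and the scale bounds are hypothetical, not the demo's.
const rows = [
  { country: "A", income: 1200,  lifeExp: 62, population: 30e6 },
  { country: "B", income: 15000, lifeExp: 74, population: 120e6 },
  { country: "C", income: 55000, lifeExp: 83, population: 10e6 },
];

const width = 800, height = 500, maxRadius = 40;
const incomeDomain = [500, 100000];   // x axis, log scale
const lifeExpDomain = [40, 90];       // y axis, linear scale

// Income varies over orders of magnitude, so the x position uses a log scale.
const toX = income =>
  (Math.log10(income) - Math.log10(incomeDomain[0])) /
  (Math.log10(incomeDomain[1]) - Math.log10(incomeDomain[0])) * width;

// Screen y grows downward, so higher life expectancy maps to a smaller y.
const toY = lifeExp =>
  height - (lifeExp - lifeExpDomain[0]) / (lifeExpDomain[1] - lifeExpDomain[0]) * height;

// Bubble *area* (not radius) should scale with population, hence the sqrt.
const maxPop = Math.max(...rows.map(r => r.population));
const toRadius = population => Math.sqrt(population / maxPop) * maxRadius;

for (const r of rows) {
  console.log(r.country, toX(r.income).toFixed(0), toY(r.lifeExp).toFixed(0), toRadius(r.population).toFixed(1));
}
```

Animating the chart over time then comes down to re-running this mapping per year and tweening bubbles between positions.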
Animate complex behavior
See how Gemini 2.5 Pro Experimental creates an interactive JavaScript animation of colorful boids inside a spinning hexagon.
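Boids flocking reduces to three local steering rules: separation, alignment, and cohesion. Below is a stripped-down sketch of one simulation step; the constants are arbitrary and, for simplicity, the area wraps around a square instead of bouncing boids off a spinning hexagon as the demo does.

```javascript
// Stripped-down boids update (an illustrative sketch; the demo's hexagon
// container, rendering, and tuning constants are not reproduced here).
// Each boid steers by three classic rules: separation, alignment, cohesion.
const SIZE = 400, N = 30;
const boids = Array.from({ length: N }, () => ({
  x: Math.random() * SIZE, y: Math.random() * SIZE,
  vx: Math.random() * 2 - 1, vy: Math.random() * 2 - 1,
}));

function step() {
  for (const b of boids) {
    let sepX = 0, sepY = 0;                 // separation: push away from very close boids
    let avgVX = 0, avgVY = 0;               // alignment: average neighbor velocity
    let avgX = 0, avgY = 0, neighbors = 0;  // cohesion: local center of mass
    for (const o of boids) {
      if (o === b) continue;
      const dx = o.x - b.x, dy = o.y - b.y;
      const d = Math.hypot(dx, dy);
      if (d > 0 && d < 20) { sepX -= dx / d; sepY -= dy / d; }
      if (d > 0 && d < 60) {
        avgVX += o.vx; avgVY += o.vy;
        avgX += o.x; avgY += o.y;
        neighbors++;
      }
    }
    if (neighbors > 0) {
      b.vx += (avgVX / neighbors - b.vx) * 0.05;   // steer toward neighbors' heading
      b.vy += (avgVY / neighbors - b.vy) * 0.05;
      b.vx += (avgX / neighbors - b.x) * 0.001;    // drift toward the local center
      b.vy += (avgY / neighbors - b.y) * 0.001;
    }
    b.vx += sepX * 0.05; b.vy += sepY * 0.05;
    // Clamp speed, then wrap around a square area (the demo bounces off a hexagon instead).
    const speed = Math.hypot(b.vx, b.vy), max = 2.5;
    if (speed > max) { b.vx = (b.vx / speed) * max; b.vy = (b.vy / speed) * max; }
    b.x = (b.x + b.vx + SIZE) % SIZE;
    b.y = (b.y + b.vy + SIZE) % SIZE;
  }
}

for (let i = 0; i < 100; i++) step();
console.log(boids.slice(0, 3)); // a few boid states after 100 steps
```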
Code particle simulations
Watch Gemini 2.5 Pro Experimental use its reasoning capabilities to create an interactive simulation of a reflection nebula.
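A basic particle system gives a feel for the mechanics a simulation like this builds on: each particle drifts, fades over its lifetime, and respawns when it dies. Everything in the sketch below (particle count, speeds, fade rate) is an illustrative assumption rather than the demo's code.

```javascript
// Bare-bones particle system (illustrative; the nebula demo's rendering,
// color blending, and forces are not reproduced here). Each particle drifts,
// fades over its lifetime, and respawns near the center when it dies.
function makeParticle() {
  const angle = Math.random() * Math.PI * 2;
  return {
    x: 200 + Math.cos(angle) * Math.random() * 10,
    y: 200 + Math.sin(angle) * Math.random() * 10,
    vx: (Math.random() - 0.5) * 0.4,
    vy: (Math.random() - 0.5) * 0.4,
    life: 1.0,   // 1 = just born, 0 = fully faded
  };
}

const particles = Array.from({ length: 500 }, makeParticle);

function update(dt) {
  for (let i = 0; i < particles.length; i++) {
    const p = particles[i];
    p.x += p.vx * dt;
    p.y += p.vy * dt;
    p.life -= 0.002 * dt;   // a renderer would map this to alpha for the glow fade
    if (p.life <= 0) particles[i] = makeParticle();  // recycle dead particles
  }
}

for (let frame = 0; frame < 600; frame++) update(1);
console.log(particles.filter(p => p.life > 0.5).length, "particles still bright");
```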
Performance
Gemini 2.5 is state-of-the-art across a wide range of benchmarks.
Benchmarks
Gemini 2.5 Pro demonstrates significantly improved performance across a wide range of benchmarks.
| Capability | Benchmark | Gemini 2.5 Pro Experimental (03-25) | OpenAI o3-mini High | OpenAI GPT-4.5 | Claude 3.7 Sonnet (64k extended thinking) | Grok 3 Beta (extended thinking) | DeepSeek R1 |
|---|---|---|---|---|---|---|---|
| Reasoning & knowledge | Humanity's Last Exam (no tools) | 18.8% | 14.0%* | 6.4% | 8.9% | — | 8.6%* |
| Science | GPQA diamond, single attempt (pass@1) | 84.0% | 79.7% | 71.4% | 78.2% | 80.2% | 71.5% |
| Science | GPQA diamond, multiple attempts | — | — | — | 84.8% | 84.6% | — |
| Mathematics | AIME 2025, single attempt (pass@1) | 86.7% | 86.5% | — | 49.5% | 77.3% | 70.0% |
| Mathematics | AIME 2025, multiple attempts | — | — | — | — | 93.3% | — |
| Mathematics | AIME 2024, single attempt (pass@1) | 92.0% | 87.3% | 36.7% | 61.3% | 83.9% | 79.8% |
| Mathematics | AIME 2024, multiple attempts | — | — | — | 80.0% | 93.3% | — |
| Code generation | LiveCodeBench v5, single attempt (pass@1) | 70.4% | 74.1% | — | — | 70.6% | 64.3% |
| Code generation | LiveCodeBench v5, multiple attempts | — | — | — | — | 79.4% | — |
| Code editing | Aider Polyglot | 74.0% whole / 68.6% diff | 60.4% diff | 44.9% diff | 64.9% diff | — | 56.9% diff |
| Agentic coding | SWE-bench Verified | 63.8% | 49.3% | 38.0% | 70.3% | — | 49.2% |
| Factuality | SimpleQA | 52.9% | 13.8% | 62.5% | — | 43.6% | 30.1% |
| Visual reasoning | MMMU, single attempt (pass@1) | 81.7% | no MM support | 74.4% | 75.0% | 76.0% | no MM support |
| Visual reasoning | MMMU, multiple attempts | — | no MM support | — | — | 78.0% | no MM support |
| Image understanding | Vibe-Eval (Reka) | 69.4% | no MM support | — | — | — | no MM support |
| Long context | MRCR, 128k | 91.5% | 36.3% | 48.8% | — | — | — |
| Long context | MRCR, 1M | 83.1% | — | — | — | — | — |
| Multilingual performance | Global MMLU (Lite) | 89.8% | — | — | — | — | — |
Building responsibly in the agentic era
As we develop these new technologies, we recognize the responsibility they entail, and we aim to prioritize safety and security in all our efforts.