Gemini 2.5 models are capable of reasoning through their thoughts before responding, resulting in enhanced performance and improved accuracy.

Hands-on with Gemini 2.5

See how Gemini 2.5 uses its reasoning capabilities to create interactive simulations and do advanced coding.

Adaptive and budgeted thinking

Adaptive controls and adjustable thinking budgets allow you to balance performance and cost.

  • Calibrated

    The model explores diverse thinking strategies, leading to more accurate and relevant outputs.

  • Controllable

    Developers have fine-grained control over the model's thinking process, allowing them to manage resource usage.

  • Adaptive

    When no thinking budget is set, the model assesses the complexity of a task and calibrates the amount of thinking accordingly.
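As a rough illustration of the "Controllable" and "Adaptive" points above, the sketch below builds a request body that either sets a thinking budget or leaves it unset so the model decides for itself. The helper name `build_request` is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are an assumption based on the Gemini API's JSON request shape; check them against the current API reference before relying on them. Nothing is sent over the network.

```python
import json
from typing import Any, Dict, Optional


def build_request(prompt: str, thinking_budget: Optional[int] = None) -> Dict[str, Any]:
    """Build a JSON-style request body (hypothetical helper, no network call).

    If thinking_budget is None, no thinking config is included, which
    corresponds to the adaptive behaviour described above.
    """
    body: Dict[str, Any] = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_budget is not None:
        # Assumed field names; verify against the API reference.
        body["generationConfig"] = {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        }
    return body


if __name__ == "__main__":
    # Adaptive: no budget set.
    print(json.dumps(build_request("Summarize this paragraph."), indent=2))
    # Budgeted: cap thinking at 1024 tokens.
    print(json.dumps(build_request("Prove the lemma.", thinking_budget=1024), indent=2))
```

A smaller budget trades some reasoning depth for lower cost and latency; leaving it unset hands that decision to the model.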

Performance

Gemini 2.5 is state-of-the-art across a wide range of benchmarks.


Benchmarks

In addition to its strong performance on academic benchmarks, Gemini 2.5 tops the popular coding leaderboard WebDev Arena.

| Benchmark | Gemini 2.5 Flash-Lite Preview 06-17 (Non-thinking) | Gemini 2.5 Flash-Lite Preview 06-17 (Thinking) | Gemini 2.5 Flash (Non-thinking) | Gemini 2.5 Flash (Thinking) | Gemini 2.5 Pro (Thinking) |
|---|---|---|---|---|---|
| Input price, $/1M tokens (no caching) | $0.10 | $0.10 | $0.30 | $0.30 | $1.25 ($2.50 > 200k tokens) |
| Output price, $/1M tokens | $0.40 | $0.40 | $2.50 | $2.50 | $10.00 ($15.00 > 200k tokens) |
| Reasoning & knowledge: Humanity's Last Exam (no tools) | 5.1% | 6.9% | 8.4% | 11.0% | 21.6% |
| Science: GPQA diamond | 64.6% | 66.7% | 78.3% | 82.8% | 86.4% |
| Mathematics: AIME 2025 | 49.8% | 63.1% | 61.6% | 72.0% | 88.0% |
| Code generation: LiveCodeBench (UI: 1/1/2025-5/1/2025) | 33.7% | 34.3% | 41.1% | 55.4% | 69.0% |
| Code editing: Aider Polyglot | 26.7% | 27.1% | 44.0% | 56.7% | 82.2% |
| Agentic coding: SWE-bench Verified, single attempt | 31.6% | 27.6% | 50.0% | 48.9% | 59.6% |
| Agentic coding: SWE-bench Verified, multiple attempts | 42.6% | 44.9% | 60.0% | 60.3% | 67.2% |
| Factuality: SimpleQA | 10.7% | 13.0% | 25.8% | 26.9% | 54.0% |
| Factuality: FACTS grounding | 84.1% | 86.8% | 83.4% | 85.3% | 87.8% |
| Visual reasoning: MMMU | 72.9% | 72.9% | 76.9% | 79.7% | 82.0% |
| Image understanding: Vibe-Eval (Reka) | 51.3% | 57.5% | 66.2% | 65.4% | 67.2% |
| Long context: MRCR v2 (8-needle), 128k (average) | 16.6% | 30.6% | 34.1% | 54.3% | 58.0% |
| Long context: MRCR v2 (8-needle), 1M (pointwise) | 4.1% | 5.4% | 16.8% | 21.0% | 16.4% |
| Multilingual performance: Global MMLU (Lite) | 81.1% | 84.5% | 85.8% | 88.4% | 89.2% |
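To make the pricing rows concrete, here is a minimal sketch that estimates the cost of a single request from the listed per-million-token prices. The function and dictionary names are hypothetical. It assumes that the "> 200k tokens" rate for 2.5 Pro applies to both input and output whenever the prompt exceeds 200,000 tokens, and that any thinking tokens are already counted in `output_tokens`; the table itself does not spell out either detail.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold taken from the "> 200k tokens" note in the pricing rows.
LONG_PROMPT_THRESHOLD = 200_000


@dataclass(frozen=True)
class Price:
    input_per_m: float                         # $ per 1M input tokens
    output_per_m: float                        # $ per 1M output tokens
    long_input_per_m: Optional[float] = None   # rate above the threshold, if any
    long_output_per_m: Optional[float] = None


# Prices copied from the table; the keys are illustrative labels.
PRICES = {
    "flash-lite": Price(0.10, 0.40),
    "flash": Price(0.30, 2.50),
    "pro": Price(1.25, 10.00, 2.50, 15.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in dollars."""
    p = PRICES[model]
    # Assumption: the higher tier is triggered by prompt length alone.
    long_prompt = (
        p.long_input_per_m is not None and input_tokens > LONG_PROMPT_THRESHOLD
    )
    in_rate = p.long_input_per_m if long_prompt else p.input_per_m
    out_rate = p.long_output_per_m if long_prompt else p.output_per_m
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for name in PRICES:
        print(name, round(estimate_cost(name, 100_000, 10_000), 4))
```

Under these assumptions, a 100,000-token prompt with 10,000 output tokens on 2.5 Pro works out to 0.125 + 0.10 = $0.225.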

Building responsibly in the agentic era

As we develop these new technologies, we recognize the responsibility they entail, and we aim to prioritize safety and security in all our efforts.


For developers

Gemini’s advanced thinking, native multimodality and massive context window empower developers to build next-generation experiences.


Developer ecosystem

Build with cutting-edge generative AI models and tools to make AI helpful for everyone.

Accessing our latest AI models

We want developers to gain access to our models as quickly as possible. We’re making these available through Google AI Studio.
