Introducing our most intelligent model yet, with state-of-the-art reasoning to help you learn, build, and plan anything.

Models

Whether you're completing everyday tasks or solving complex problems, discover the right model for what you need.

Gemini 3.1 Deep Think

Pushes the boundaries of intelligence, delivering a significant upgrade to Gemini 3.1's specialized reasoning mode to help you solve the most complex technical problems.

Gemini 3.1 Deep Think mode can better help tackle real-world problems that require rigor, breakthrough creativity and intelligence. Available for Google AI Ultra subscribers.

Gemini 1 introduced native multimodality and long context to help AI understand the world. Gemini 2 added thinking, reasoning and tool use to create a foundation for agents.

Now, Gemini 3 brings these capabilities together – so you can bring any idea to life.


Performance

Gemini 3 is state-of-the-art across a wide range of benchmarks

Our most intelligent model yet sets a new bar for AI model performance

Benchmark | Notes | Gemini 3.1 Pro Thinking (High) | Gemini 3 Pro Thinking (High) | Sonnet 4.6 Thinking (Max) | Opus 4.6 Thinking (Max) | GPT-5.2 Thinking (xhigh) | GPT-5.3-Codex Thinking (xhigh)
Humanity's Last Exam: Academic reasoning (full set, text + MM) | No tools | 44.4% | 37.5% | 33.2% | 40.0% | 34.5%
Humanity's Last Exam | Search (blocklist) + Code | 51.4% | 45.8% | 49.0% | 53.1% | 45.5%
ARC-AGI-2: Abstract reasoning puzzles | ARC Prize Verified | 77.1% | 31.1% | 58.3% | 68.8% | 52.9%
GPQA Diamond: Scientific knowledge | No tools | 94.3% | 91.9% | 89.9% | 91.3% | 92.4%
Terminal-Bench 2.0: Agentic terminal coding | Terminus-2 harness | 68.5% | 56.9% | 59.1% | 65.4% | 54.0% | 64.7%
Terminal-Bench 2.0 | Other best self-reported harness | 62.2% (Codex) | 77.3% (Codex)
SWE-Bench Verified: Agentic coding | Single attempt | 80.6% | 76.2% | 79.6% | 80.8% | 80.0%
SWE-Bench Pro (Public): Diverse agentic coding tasks | Single attempt | 54.2% | 43.3% | 55.6% | 56.8%
LiveCodeBench Pro: Competitive coding problems from Codeforces, ICPC, and IOI | Elo | 2887 | 2439 | 2393
SciCode: Scientific research coding | | 59% | 56% | 47% | 52% | 52%
APEX-Agents: Long horizon professional tasks | | 33.5% | 18.4% | 29.8% | 23.0%
GDPval-AA Elo: Expert tasks | | 1317 | 1195 | 1633 | 1606 | 1462
τ2-bench: Agentic and tool use | Retail | 90.8% | 85.3% | 91.7% | 91.9% | 82.0%
τ2-bench | Telecom | 99.3% | 98.0% | 97.9% | 99.3% | 98.7%
MCP Atlas: Multi-step workflows using MCP | | 69.2% | 54.1% | 61.3% | 59.5% | 60.6%
BrowseComp: Agentic search | Search + Python + Browse | 85.9% | 59.2% | 74.7% | 84.0% | 65.8%
MMMU-Pro: Multimodal understanding and reasoning | No tools | 80.5% | 81.0% | 74.5% | 73.9% | 79.5%
MMMLU: Multilingual Q&A | | 92.6% | 91.8% | 89.3% | 91.1% | 89.6%
MRCR v2 (8-needle): Long context performance | 128k (average) | 84.9% | 77.0% | 84.9% | 84.0% | 83.8%
MRCR v2 (8-needle) | 1M (pointwise) | 26.3% | 26.3% | Not supported | Not supported | Not supported

Safety

Building with responsibility at the core

As we develop these new technologies, we recognize the responsibility this entails, and we aim to prioritize safety and security in all our efforts.


For developers

Build with cutting-edge generative AI models and tools to make AI helpful for everyone

Gemini’s advanced thinking, native multimodality and massive context window empower developers to build next-generation experiences.

