Gemini 3 Pro

Best for agentic and vibe coding to bring creative concepts to life

Learn, plan, and build like never before with Gemini 3 Pro’s incredible reasoning powers

Our most intelligent model yet

Partner with a pro

With state-of-the-art reasoning and multimodal capabilities

Learn anything

Understand complex topics in a way that makes sense for you – with clear, concise, and helpful responses

Build anything

Bring your ideas to life – from sketches and prompts to interactive tools and experiences

Plan anything

Delegate tasks and multi-step projects to get things done faster than ever before

Get started

Build with Gemini 3
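A minimal sketch of getting started through the google-genai Python SDK. This is an illustrative assumption, not official quickstart code: it assumes the SDK is installed (`pip install google-genai`), that a `GEMINI_API_KEY` environment variable is set, and that the Preview model id is `gemini-3-pro-preview` (Preview ids may change).

```python
# Sketch: calling Gemini 3 Pro via the google-genai Python SDK.
# Assumptions: google-genai is installed and GEMINI_API_KEY is set;
# "gemini-3-pro-preview" is the assumed Preview model id.
import os

MODEL_ID = "gemini-3-pro-preview"

def ask(prompt: str) -> str:
    """Send a single text prompt and return the model's text reply."""
    from google import genai  # imported lazily so the sketch loads without the SDK
    client = genai.Client()   # reads GEMINI_API_KEY from the environment
    response = client.models.generate_content(model=MODEL_ID, contents=prompt)
    return response.text

# Only attempt a live call when a key is actually configured.
if __name__ == "__main__" and os.environ.get("GEMINI_API_KEY"):
    print(ask("Plan a three-step study schedule for linear algebra."))
```

The same request shape accepts images, video, audio, and PDFs in `contents`, matching the input modalities listed under Model information below.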

| Capability | Benchmark | Notes | Gemini 3 Pro | Gemini 2.5 Pro | Claude Sonnet 4.5 | GPT-5.1 |
|---|---|---|---|---|---|---|
| Academic reasoning | Humanity's Last Exam | No tools | 37.5% | 21.6% | 13.7% | 26.5% |
| | | With search and code execution | 45.8% | | | |
| Visual reasoning puzzles | ARC-AGI-2 | ARC Prize Verified | 31.1% | 4.9% | 13.6% | 17.6% |
| Scientific knowledge | GPQA Diamond | No tools | 91.9% | 86.4% | 83.4% | 88.1% |
| Mathematics | AIME 2025 | No tools | 95.0% | 88.0% | 87.0% | 94.0% |
| | | With code execution | 100.0% | | | 100.0% |
| Challenging math contest problems | MathArena Apex | | 23.4% | 0.5% | 1.6% | 1.0% |
| Multimodal understanding and reasoning | MMMU-Pro | | 81.0% | 68.0% | 68.0% | 76.0% |
| Screen understanding | ScreenSpot-Pro | | 72.7% | 11.4% | 36.2% | 3.5% |
| Information synthesis from complex charts | CharXiv Reasoning | | 81.4% | 69.6% | 68.5% | 69.5% |
| OCR | OmniDocBench 1.5 | Overall edit distance, lower is better | 0.115 | 0.145 | 0.145 | 0.147 |
| Knowledge acquisition from videos | Video-MMMU | | 87.6% | 83.6% | 77.8% | 80.4% |
| Competitive coding problems | LiveCodeBench Pro | Elo rating, higher is better | 2,439 | 1,775 | 1,418 | 2,243 |
| Agentic terminal coding | Terminal-Bench 2.0 | Terminus-2 agent | 54.2% | 32.6% | 42.8% | 47.6% |
| Agentic coding | SWE-Bench Verified | Single attempt | 76.2% | 59.6% | 77.2% | 76.3% |
| Agentic tool use | τ2-bench | | 85.4% | 54.9% | 84.7% | 80.2% |
| Long-horizon agentic tasks | Vending-Bench 2 | Net worth (mean), higher is better | $5,478.16 | $573.64 | $3,838.74 | $1,473.43 |
| Held-out internal grounding, parametric, multimodal, and search retrieval benchmarks | FACTS Benchmark Suite | | 70.5% | 63.4% | 50.4% | 50.8% |
| Parametric knowledge | SimpleQA Verified | | 72.1% | 54.5% | 29.3% | 34.9% |
| Multilingual Q&A | MMMLU | | 91.8% | 89.5% | 89.1% | 91.0% |
| Commonsense reasoning across 100 languages and cultures | Global PIQA | | 93.4% | 91.5% | 90.1% | 90.9% |
| Long-context performance | MRCR v2 (8-needle) | 128k (average) | 77.0% | 58.0% | 47.1% | 61.6% |
| | | 1M (pointwise) | 26.3% | 16.4% | Not supported | Not supported |

For details on our evaluation methodology, please see deepmind.google/models/evals-methodology/gemini-3-pro.


Model information

Name
Gemini 3 Pro
Status
Preview
Input
  • Text
  • Image
  • Video
  • Audio
  • PDF
Output
  • Text
Input tokens
1M
Output tokens
64k
Knowledge cutoff
January 2025
Tool use
  • Function calling
  • Structured output
  • Search as a tool
  • Code execution
Best for
  • Agentic
  • Advanced coding
  • Long context understanding
  • Multimodal understanding
  • Algorithmic development
Availability
  • Gemini App
  • Google Cloud / Vertex AI
  • Google AI Studio
  • Gemini API
  • Google AI Mode
  • Google Antigravity
Documentation
View developer docs
Model card
View model card