Our latest Gemini 3 model helps you bring any idea to life, faster.
Gemini 3 Flash is our most impressive model for agentic workflows.
Here are just a few ways you can use Gemini 3 Flash’s multimodal, frontier-level reasoning capabilities at speed.
See how Gemini 3 Flash outperforms Gemini 2.5 Pro on complex coding tasks. It generates richer, functional visualizations faster and with greater token efficiency.
Gemini 3 Flash can handle a large number of function calls reliably. Watch it reason across 100 ingredients and 100 tools simultaneously to successfully sequence complex tasks in near real-time.
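For developers who want to try this pattern themselves, here is a minimal function-calling sketch using the google-genai Python SDK. The model ID and the single illustrative tool are placeholders rather than the demo’s actual setup; in the demo the model reasons over roughly 100 declarations like this at once.

```python
from google import genai
from google.genai import types

client = genai.Client()  # assumes the API key is set in the environment

# One illustrative tool; the demo uses ~100 declarations like this.
chop_ingredient = types.FunctionDeclaration(
    name="chop_ingredient",
    description="Chop a named ingredient into pieces of a given size.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={
            "ingredient": types.Schema(type=types.Type.STRING),
            "size_mm": types.Schema(type=types.Type.NUMBER),
        },
        required=["ingredient"],
    ),
)

response = client.models.generate_content(
    model="gemini-3-flash-preview",  # placeholder model ID; check the model list for the current name
    contents="Plan the prep sequence for a dish that uses every ingredient in the pantry.",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[chop_ingredient])],
    ),
)

# Print whichever tool calls the model decided to make, in order.
for part in response.candidates[0].content.parts:
    if part.function_call:
        print(part.function_call.name, dict(part.function_call.args))
```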
In this slingshot game, Gemini 3 Flash delivers near real-time strategic guidance by simultaneously analyzing the video and hand-tracking inputs. It handles complex geometric calculations and velocity estimation to enable responsive live assistance.
Enable rapid iteration and A/B testing with low-latency code generation. Gemini 3 Flash evolves design elements, like this loading spinner, based on near real-time user feedback.
Automate tedious, multi-step processes by using Gemini 3 Flash to merge and clean messy data sources. 3 Flash’s multimodal and complex reasoning capabilities can transform unstructured data into organized databases.
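As a rough sketch of how you might wire this up, the snippet below uses the google-genai SDK’s structured-output support to turn free-form text into typed records. The Customer schema, file name, and model ID are assumptions for illustration only.

```python
from pydantic import BaseModel

from google import genai
from google.genai import types


class Customer(BaseModel):
    """Hypothetical target schema for the cleaned-up records."""
    name: str
    email: str
    last_order_date: str


client = genai.Client()  # assumes the API key is set in the environment

with open("messy_export.txt", encoding="utf-8") as f:
    raw_export = f.read()

response = client.models.generate_content(
    model="gemini-3-flash-preview",  # placeholder model ID
    contents="Extract every customer record from this export:\n\n" + raw_export,
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=list[Customer],
    ),
)

# The SDK parses the JSON response into the requested schema.
customers: list[Customer] = response.parsed
print(customers[:3])
```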
Generate new UIs instantly with Gemini 3 Flash, explore multiple creative variations, and interact with 3 Flash in near real-time to converge on the best UI outcome, all with one click.
Leverage Gemini 3 Flash’s visual recognition and reasoning capabilities to add contextual UI on top of generated images. 3 Flash can describe the content of each image in a compelling, interactive way.
Quickly build fun, useful apps from scratch using your voice without prior coding knowledge.
Upload an audio recording of yourself explaining a difficult concept and Gemini will identify knowledge gaps, create a custom quiz, and provide instant assessments and explanations for each question.
Run your small business with the help of Gemini 3 Flash and turn hours of work into minutes. Just upload your customer feedback and it’ll get to work analyzing the data, drafting the launch email, and coding a branded landing page.
Turn your videos into insights with Gemini 3 Flash. Simply upload a clip and ask Gemini, "How can I do this better?" It analyzes the visual details to give you a step-by-step breakdown. Perfect for sports, music, art, or any project where you need a second pair of eyes.
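If you want to reproduce this flow with the API, a minimal sketch with the google-genai SDK might look like the following. The file name and model ID are placeholders, and longer videos are typically uploaded via the Files API rather than sent inline.

```python
from google import genai
from google.genai import types

client = genai.Client()  # assumes the API key is set in the environment

# Send a short clip inline alongside the question.
with open("practice_run.mp4", "rb") as f:
    video_bytes = f.read()

response = client.models.generate_content(
    model="gemini-3-flash-preview",  # placeholder model ID
    contents=[
        types.Part.from_bytes(data=video_bytes, mime_type="video/mp4"),
        "How can I do this better? Give me a step-by-step breakdown.",
    ],
)

print(response.text)
```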
In our JetBrains AI Chat and Junie agentic-coding evaluation, Gemini 3 Flash delivered quality close to Gemini 3 Pro, while offering significantly lower inference latency and cost. In a quota-constrained production setup, it consistently stays within per-customer credit budgets, allowing complex multi-step agents to remain fast, predictable, and scalable.
Gemini 3 Flash is a great option for teams who want to quickly test and iterate on product ideas in Figma Make. The model can rapidly and reliably create prototypes while maintaining attention to detail and responding to specific design direction.
For the first time, Gemini 3 Flash combines speed and affordability with enough capability to power the core loop of a coding agent. We were impressed by its tool usage performance, as well as its strong design and coding skills.
Our engineers have found that Gemini 3 Flash works well with Debug Mode in Cursor. Flash is fast and accurate at investigating issues and finding the root cause of bugs.
Gemini 3 Flash gives us a powerful new frontier model to fuel Workday’s AI-first strategy. From delivering sharper inference in our customer-facing applications to unlocking greater efficiency in our own operations and development, it provides the performance boost to continue to innovate rapidly.
Integrating Gemini 3 Flash into Agentforce is another step forward in our commitment to bring the best AI to our customers and deploy intelligent agents faster than ever. By pairing Google’s latest model capabilities with the power of Agentforce, we’re unlocking high-quality reasoning, stronger responses, and rapid iteration all inside the tools our customers already use.
At Bridgewater, we require models capable of reasoning over vast, unstructured multimodal datasets without sacrificing conceptual understanding. Gemini 3 Flash is the first to deliver Pro-class depth at the speed and scale our workflows demand. Its long-context performance on complex problems is exceptional.
Gemini 3 Flash is a major step above other models in its speed class when it comes to instruction following and intelligence. It's immediately become our go-to for latency-sensitive experiences in Devin, and we're excited to roll it out to more use cases.
Gemini 3 Flash shows a relative improvement of 15% in overall accuracy compared to Gemini 2.5 Flash, delivering breakthrough precision on our hardest extraction tasks like handwriting, long-form contracts, and complex financial data. This is a significant jump in performance, and we're excited to continue collaborating to bring this specialist-level reasoning to Box AI users.
Astrocade is using Gemini 3 Flash in our agentic game creation engine to power coding and planning. The speed of the 3 Flash model allows us to generate full game-level plans from a single prompt while still delivering fast responses to our users.
Gemini 3 Flash has allowed Latitude to deliver high-quality outputs at low cost for many complex tasks in our next-generation AI game engine, work that was previously only possible with pro-level models like Sonnet 4.5.
Presentations.AI is using Gemini 3 Flash to enhance our intelligent slide-generation agents, and we’re consistently impressed by the pro-level quality at lightning-fast speeds. With previous pro-sized models, there were many things we simply couldn’t attempt because of the speed vs. quality tradeoff. With Gemini 3 Flash, we’re finally able to explore those workflows.
Resemble AI is using Gemini 3 Flash to transform DETECT-3B Omni’s detection outputs into actionable threat intelligence. Beyond providing explainability, Gemini 3 Flash can correlate detections with historical patterns, identify likely manipulation techniques, and flag whether similar content has appeared previously. This, alongside Gemini 3 Flash’s speed advantage, makes it seamless for users to query the deepfake detection results and ask for details.
HubX uses Gemini 3 Flash to power our agentic book summarization and image prompt enrichment. The model’s speed enables real-time workflows that have increased our summarization efficiency by 20% and improved image editing response times by 50%—all while reducing costs.
Gemini 3 Flash has achieved a meaningful step up in reasoning, improving over 7% on Harvey’s BigLaw Bench from its predecessor, Gemini 2.5 Flash. These quality improvements, combined with Flash's low latency, are impactful for high-volume legal tasks such as extracting defined terms and cross-references from contracts.
Gemini 3 Flash remains the best fit for Warp’s Suggested Code Diffs, where low latency and cost efficiency are hard constraints. With this release, it resolves a broader set of common command-line errors while staying fast and economical. In our internal evaluations, we’ve seen an 8% lift in fix accuracy.
The improvements in the latest Gemini 3 Flash model are impressive. Even without specific optimization, we saw an immediate 10% baseline improvement on agentic coding tasks, including complex user-driven queries.
| Benchmark | Notes | Gemini 3 Flash Thinking | Gemini 3 Pro Thinking | Gemini 2.5 Flash Thinking | Gemini 2.5 Pro Thinking | Claude Sonnet 4.5 Thinking | GPT-5.2 Extra high | Grok 4.1 Fast Reasoning |
|---|---|---|---|---|---|---|---|---|
| Input price | $/1M tokens | $0.50 | $2.00 ($4.00 > 200k tokens) | $0.30 | $1.25 ($2.50 > 200k tokens) | $3.00 ($6.00 > 200k tokens) | $1.75 | $0.20 |
| Output price | $/1M tokens | $3.00 | $12.00 ($18.00 > 200k tokens) | $2.50 | $10.00 ($15.00 > 200k tokens) | $15.00 ($22.50 > 200k tokens) | $14.00 | $0.50 |
| Academic reasoning (full set, text + MM) Humanity's Last Exam | No tools | 33.7% | 37.5% | 11.0% | 21.6% | 13.7% | 34.5% | 17.6% |
| | With search and code execution | 43.5% | 45.8% | — | — | — | 45.5% | — |
| Visual reasoning puzzles ARC-AGI-2 | ARC Prize Verified | 33.6% | 31.1% | 2.5% | 4.9% | 13.6% | 52.9% | — |
| Scientific knowledge GPQA Diamond | No tools | 90.4% | 91.9% | 82.8% | 86.4% | 83.4% | 92.4% | 84.3% |
| Mathematics AIME 2025 | No tools | 95.2% | 95.0% | 72.0% | 88.0% | 87.0% | 100% | 91.9% |
| | With code execution | 99.7% | 100% | 75.7% | — | 100% | — | — |
| Multimodal understanding and reasoning MMMU-Pro | | 81.2% | 81.0% | 66.7% | 68.0% | 68.0% | 79.5% | 63.0% |
| Screen understanding ScreenSpot-Pro | No tools unless specified | 69.1% | 72.7% | 3.9% | 11.4% | 36.2% | 86.3% with Python | — |
| Information synthesis from complex charts CharXiv Reasoning | No tools | 80.3% | 81.4% | 63.7% | 69.6% | 68.5% | 82.1% | — |
| OCR OmniDocBench 1.5 | Overall Edit Distance, lower is better | 0.121 | 0.115 | 0.154 | 0.145 | 0.145 | 0.143 | — |
| Knowledge acquisition from videos Video-MMMU | | 86.9% | 87.6% | 79.2% | 83.6% | 77.8% | 85.9% | — |
| Competitive coding problems from Codeforces, ICPC, and IOI LiveCodeBench Pro | Elo Rating, higher is better | 2316 | 2439 | 1143 | 1775 | 1418 | 2393 | — |
| Agentic terminal coding Terminal-Bench 2.0 | Terminus-2 harness | 47.6% | 54.2% | 16.9% | 32.6% | 42.8% | — | — |
| Agentic coding SWE-bench Verified | Single attempt | 78.0% | 76.2% | 60.4% | 59.6% | 77.2% | 80.0% | 50.6% |
| Agentic tool use τ2-bench | | 90.2% | 90.7% | 79.5% | 77.8% | 87.2% | — | — |
| Long horizon real-world software tasks Toolathlon | | 49.4% | 36.4% | 3.7% | 10.5% | 38.9% | 46.3% | — |
| Multi-step workflows using MCP MCP Atlas | | 57.4% | 54.1% | 3.4% | 8.8% | 43.8% | 60.6% | — |
| Agentic long term coherence Vending-Bench 2 | Net worth (mean), higher is better | $3,635 | $5,478 | $549 | $574 | $3,839 | $3,952 | $1,107 |
| Factuality benchmark across grounding, parametric, search, and MM FACTS Benchmark Suite | | 61.9% | 70.5% | 50.4% | 63.4% | 48.9% | 61.4% | 42.1% |
| Parametric knowledge SimpleQA Verified | | 68.7% | 72.1% | 28.1% | 54.5% | 29.3% | 38.0% | 19.5% |
| Multilingual Q&A MMMLU | | 91.8% | 91.8% | 86.6% | 89.5% | 89.1% | 89.6% | 86.8% |
| Commonsense reasoning across 100 Languages and Cultures Global PIQA | | 92.8% | 93.4% | 90.2% | 91.5% | 90.1% | 91.2% | 85.6% |
| Long context performance MRCR v2 (8-needle) | 128k (average) | 67.2% | 77.0% | 54.3% | 58.0% | 47.1% | 81.9% | 54.6% |
| | 1M (pointwise) | 22.1% | 26.3% | 21.0% | 16.4% | not supported | not supported | 6.1% |
For details on our evaluation methodology, please see deepmind.google/models/evals-methodology/gemini-3-flash