
Gemini Nano

Gemini 2.0

Built for the agentic era

Unlock a new era of agentic experiences with our most capable AI model yet.

Introducing Gemini 2.0 Flash Experimental

Our workhorse model with low latency and enhanced performance. 2.0 Flash Experimental introduces improved capabilities such as native tool use, and for the first time Gemini can natively create images and generate speech.

Native image generation

Create or edit images and seamlessly blend them with text.

Native text-to-speech

Easily steer Gemini’s speaking style to match any mood.

Native tool use

Build agents that use Google Search, code execution and more.

One step closer to a universal AI assistant

Gemini 2.0 unlocks new possibilities for AI agents: intelligent systems that can use memory, reasoning, and planning to complete tasks for you, all under your supervision.

  • Taking action

    Agents can follow instructions and take helpful actions under your supervision.

  • Tool use

    Agents can search for information, look up reviews, translate and more.

  • Real-time streaming

    Agents respond seamlessly to live audio and video input.

Agents using multimodal understanding

A research prototype exploring future capabilities of a universal AI assistant.

Learn about Project Astra

Agents that can help you accomplish complex tasks

A research prototype exploring the future of human-agent interaction, starting with your browser.

Learn about Project Mariner

Agents in other domains

Agents for developers

A coding agent capable of fixing bugs, editing and validating code, and managing tasks under a developer’s supervision.

Learn about Jules

Gemini 2.0 for Games

Agents that can help you navigate the virtual world of video games.

Hands-on

Discover what’s possible with Gemini 2.0’s next-generation capabilities.

Download starter apps

Spatial understanding

Ask Gemini to give you the locations of objects, text, and more.

Launch applet
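To use object locations in an application, the returned coordinates usually have to be mapped onto the image's pixel grid. The sketch below shows that conversion only. It assumes boxes arrive as `[ymin, xmin, ymax, xmax]` on a 0 to 1000 grid, which is not stated on this page; `to_pixels` and its parameters are hypothetical names for illustration.

```python
from typing import Sequence, Tuple

def to_pixels(
    box: Sequence[float], width: int, height: int, scale: float = 1000.0
) -> Tuple[int, int, int, int]:
    """Convert [ymin, xmin, ymax, xmax] on a 0..scale grid (assumed format)
    to pixel coordinates (left, top, right, bottom)."""
    if len(box) != 4:
        raise ValueError("expected four coordinates")
    ymin, xmin, ymax, xmax = box
    if not (0 <= ymin <= ymax <= scale and 0 <= xmin <= xmax <= scale):
        raise ValueError("box is not ordered or lies outside the grid")
    left = round(xmin / scale * width)
    top = round(ymin / scale * height)
    right = round(xmax / scale * width)
    bottom = round(ymax / scale * height)
    return left, top, right, bottom

# Made-up box on a 640x480 image.
print(to_pixels([250, 100, 750, 500], width=640, height=480))
```

Keeping the grid size as a parameter makes it easy to adapt if the actual response uses a different normalization.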

Video understanding

Outline key moments, or summarize a video as a paragraph or even a haiku.

Launch applet

Function calling with Maps API

Ask questions based on geography, or choose a pre-populated topic to watch the map travel to different locations using Google Maps.

Launch applet
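Function calling generally works by having the model return a structured request that names a function and its arguments, which the application then executes. The sketch below illustrates only that dispatch step. It makes no call to Gemini or Google Maps, and the `move_map_to` function, the `TOOLS` registry, and the shape of the `call` dictionary are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical tool: a real app would pan a map widget here.
def move_map_to(lat: float, lng: float, zoom: int = 10) -> Dict[str, Any]:
    """Return the new map view instead of touching a real map."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError("coordinates out of range")
    return {"center": (lat, lng), "zoom": zoom}

# Registry mapping tool names to callables (assumed structure).
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {"move_map_to": move_map_to}

def dispatch(call: Dict[str, Any]) -> Dict[str, Any]:
    """Run the tool named in a model-produced function call."""
    name = call.get("name")
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": TOOLS[name](**call.get("args", {}))}
    except (TypeError, ValueError) as exc:
        # Bad or missing arguments are reported back rather than raised.
        return {"error": str(exc)}

# A made-up example of what a model's function call might look like.
example_call = {"name": "move_map_to", "args": {"lat": 48.8566, "lng": 2.3522}}
print(dispatch(example_call))
```

Returning errors as data, rather than raising, lets the application pass the failure back to the model so it can retry with corrected arguments.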

Multimodal Live API with 2.0 Flash Experimental

Our Multimodal Live API helps developers build applications with better natural language interactions and video understanding.

Learn more

Performance

Gemini 2.0 is our most capable model yet, building on the strengths of our previous generations.

Benchmarks

Enhanced capabilities across a wide range of benchmarks.

| Capability | Benchmark | Description | Gemini 1.5 Flash 002 | Gemini 1.5 Pro 002 | Gemini 2.0 Flash Experimental |
|---|---|---|---|---|---|
| General | MMLU-Pro | Enhanced version of popular MMLU dataset with questions across multiple subjects with higher difficulty tasks | 67.3% | 75.8% | 76.4% |
| Code | Natural2Code | Code generation across Python, Java, C++, JS, Go. Held out dataset HumanEval-like, not leaked on the web | 79.8% | 85.4% | 92.9% |
| Code | Bird-SQL (Dev) | Benchmark evaluating converting natural language questions into executable SQL | 45.6% | 54.4% | 56.9% |
| Code | LiveCodeBench | Code generation in Python. Code Generation subset covering more recent examples: 06/01/2024 - 10/05/2024 | 30.0% | 34.3% | 35.1% |
| Factuality | FACTS Grounding | Ability to provide factually correct responses given documents and diverse user requests. Held out internal dataset | 82.9% | 80.0% | 83.6% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 77.9% | 86.5% | 89.7% |
| Math | HiddenMath | Competition-level math problems. Held out dataset AIME/AMC-like, crafted by experts and not leaked on the web | 47.2% | 52.0% | 63.0% |
| Reasoning | GPQA (diamond) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 51.0% | 59.1% | 62.1% |
| Long-context | MRCR (1M) | Novel, diagnostic long-context understanding evaluation | 71.9% | 82.6% | 69.2% |
| Image | MMMU | Multi-discipline college-level multimodal understanding and reasoning problems | 62.3% | 65.9% | 70.7% |
| Image | Vibe-Eval (Reka) | Visual understanding in chat models with challenging everyday examples. Evaluated with a Gemini Flash model as a rater | 48.9% | 53.9% | 56.3% |
| Audio | CoVoST2 (21 lang) | Automatic speech translation (BLEU score) | 37.4 | 40.1 | 39.2 |
| Video | EgoSchema (test) | Video analysis across multiple domains | 66.8% | 71.2% | 71.5% |
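
To make the comparison in the benchmark figures easier to read, the following sketch computes the difference between Gemini 2.0 Flash Experimental and Gemini 1.5 Pro 002 for each row. The scores are copied from the figures above; the `SCORES` dictionary and the `delta_vs_pro` helper are illustrative names, not part of any Gemini tooling.

```python
# Scores from the benchmark figures, in the order:
# (Gemini 1.5 Flash 002, Gemini 1.5 Pro 002, Gemini 2.0 Flash Experimental).
# All are percentages except CoVoST2, which is a BLEU score.
SCORES = {
    "MMLU-Pro": (67.3, 75.8, 76.4),
    "Natural2Code": (79.8, 85.4, 92.9),
    "Bird-SQL (Dev)": (45.6, 54.4, 56.9),
    "LiveCodeBench": (30.0, 34.3, 35.1),
    "FACTS Grounding": (82.9, 80.0, 83.6),
    "MATH": (77.9, 86.5, 89.7),
    "HiddenMath": (47.2, 52.0, 63.0),
    "GPQA (diamond)": (51.0, 59.1, 62.1),
    "MRCR (1M)": (71.9, 82.6, 69.2),
    "MMMU": (62.3, 65.9, 70.7),
    "Vibe-Eval (Reka)": (48.9, 53.9, 56.3),
    "CoVoST2 (21 lang)": (37.4, 40.1, 39.2),
    "EgoSchema (test)": (66.8, 71.2, 71.5),
}

def delta_vs_pro(scores):
    """Return 2.0 Flash Experimental minus 1.5 Pro 002, in points, per benchmark."""
    return {name: round(new - pro, 1) for name, (_, pro, new) in scores.items()}

deltas = delta_vs_pro(SCORES)
regressions = sorted(name for name, d in deltas.items() if d < 0)

# Print the largest gains first.
for name, d in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} {d:+.1f}")
print("Lower than 1.5 Pro 002:", regressions)
```

On these figures, 2.0 Flash Experimental scores below 1.5 Pro 002 on two rows, MRCR (1M) and CoVoST2 (21 lang), and above it on the other eleven.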

Building responsibly in the agentic era

As we develop these new technologies, we recognize the responsibility they entail, and aim to prioritize safety and security in all our efforts.

Learn more

For developers

Gemini’s improved capabilities mean it’s now possible to build new agents that can think, remember, plan and take action for you.

Start building

Developer showcase

Product explorations from developers experimenting with Gemini 2.0. Some sequences shortened.

Developer ecosystem

Build with cutting-edge generative AI models and tools to make AI helpful for everyone.

Gemini model family

Our versatile models run efficiently on everything from data centers to individual devices.

Accessing our latest AI models

We want developers to gain access to our models as quickly as possible. We’re making these available through Google AI Studio.

Sign in to Google AI Studio

Build with the latest models from Google DeepMind

Get your API key and integrate powerful AI capabilities into your applications in less than 5 minutes.
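
As a rough sketch of what integration might look like, the snippet below builds (but does not send) an HTTP request for a text prompt using only the Python standard library. The endpoint URL, the `gemini-2.0-flash-exp` model id, the `x-goog-api-key` header, and the request body shape are assumptions that do not appear on this page; check the current API reference before relying on them.

```python
import json
import urllib.request

# Assumed values: none of these are stated on the page above.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-2.0-flash-exp"

def build_request(prompt: str, api_key: str, model: str = MODEL) -> urllib.request.Request:
    """Build a POST request asking the model to respond to a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )

def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)

# Build the request with a placeholder key and inspect it without sending.
request = build_request("Say hello in one sentence.", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a real key would perform the network call; keeping construction and sending separate makes the request easy to inspect and test offline.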

Experimental

Gemini 2.0 Flash Experimental
Gemini Experimental 1121
Gemini Experimental 1206

Gemini Pro

Gemini 1.5 Pro

Gemini Flash

Gemini 1.5 Flash
Gemini 1.5 Flash-8B

Get the latest updates

Sign up for news on the latest innovations from Google DeepMind.
