Gemini 2.0
Built for the agentic era
Unlock a new era of agentic experiences with our most capable AI model yet.
Introducing Gemini 2.0 Flash Experimental
Our workhorse model with low latency and enhanced performance. 2.0 Flash Experimental introduces improved capabilities such as native tool use and, for the first time, native image creation and speech generation.
One step closer to a universal AI assistant
Gemini 2.0 unlocks new possibilities for AI agents: intelligent systems that can use memory, reasoning, and planning to complete tasks for you, all under your supervision.
- Taking action: Agents can follow instructions and take helpful actions under your supervision.
- Tool use: Agents can search for information, look up reviews, translate and more.
- Real-time streaming: Agents respond seamlessly to live audio and video input.
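Tool use in the Gemini API works by declaring, in the request, the functions the model is allowed to call. A minimal sketch of such a request body is below, using only the Python standard library; the `get_weather` tool (its name and parameters) is a hypothetical example for illustration, not part of the API.

```python
import json

def build_tool_request(prompt: str) -> dict:
    """generateContent-style request body declaring one callable tool.

    The "tools" / "function_declarations" shape follows the Gemini REST
    API's function-calling format; verify field names against current docs.
    """
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [{
            "function_declarations": [{
                "name": "get_weather",  # hypothetical tool, for illustration
                "description": "Look up the current weather for a city.",
                "parameters": {  # JSON-schema-style parameter spec
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }]
        }],
    }

request_body = build_tool_request("What's the weather in Zurich?")
print(json.dumps(request_body, indent=2))
```

When the model decides to use a declared tool, its response contains a function-call part rather than plain text; your code runs the function and sends the result back in a follow-up turn.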
Agents using multimodal understanding
A research prototype exploring future capabilities of a universal AI assistant.
Agents that can help you accomplish complex tasks
A research prototype exploring the future of human-agent interaction, starting with your browser.
Agents in other domains
Performance
Gemini 2.0 is our most capable model yet, building on the strengths of our previous generations.
Benchmarks
Enhanced capabilities across a wide range of benchmarks.
| Capability | Benchmark | Description | Gemini 1.5 Flash 002 | Gemini 1.5 Pro 002 | Gemini 2.0 Flash Experimental |
|---|---|---|---|---|---|
| General | MMLU-Pro | Enhanced version of the popular MMLU dataset, with higher-difficulty questions across multiple subjects | 67.3% | 75.8% | 76.4% |
| Code | Natural2Code | Code generation across Python, Java, C++, JS, Go. Held-out HumanEval-like dataset, not leaked on the web | 79.8% | 85.4% | 92.9% |
| Code | Bird-SQL (Dev) | Benchmark evaluating conversion of natural-language questions into executable SQL | 45.6% | 54.4% | 56.9% |
| Code | LiveCodeBench | Code generation in Python. Code Generation subset covering more recent examples: 06/01/2024 - 10/05/2024 | 30.0% | 34.3% | 35.1% |
| Factuality | FACTS Grounding | Ability to provide factually correct responses given documents and diverse user requests. Held-out internal dataset | 82.9% | 80.0% | 83.6% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 77.9% | 86.5% | 89.7% |
| Math | HiddenMath | Competition-level math problems. Held-out AIME/AMC-like dataset, crafted by experts and not leaked on the web | 47.2% | 52.0% | 63.0% |
| Reasoning | GPQA (diamond) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 51.0% | 59.1% | 62.1% |
| Long-context | MRCR (1M) | Novel, diagnostic long-context understanding evaluation | 71.9% | 82.6% | 69.2% |
| Image | MMMU | Multi-discipline college-level multimodal understanding and reasoning problems | 62.3% | 65.9% | 70.7% |
| Image | Vibe-Eval (Reka) | Visual understanding in chat models with challenging everyday examples. Evaluated with a Gemini Flash model as a rater | 48.9% | 53.9% | 56.3% |
| Audio | CoVoST2 (21 lang) | Automatic speech translation (BLEU score) | 37.4 | 40.1 | 39.2 |
| Video | EgoSchema (test) | Video analysis across multiple domains | 66.8% | 71.2% | 71.5% |
Developer showcase
Product explorations from developers experimenting with Gemini 2.0. Some sequences shortened.
Developer ecosystem
Build with cutting-edge generative AI models and tools to make AI helpful for everyone.
Gemini model family
Our versatile models run efficiently on everything from data centers to on-device.
- 1.0 Ultra: Our largest model for highly complex tasks.
- 1.5 Pro: Our best model for reasoning across large amounts of information.
- 2.0 Flash Experimental: Our workhorse model with low latency and enhanced performance, built to power agentic experiences.
- 1.0 Nano: Our most efficient model for on-device tasks.
Build with the latest models from Google DeepMind
Get your API key and integrate powerful AI capabilities into your applications in less than 5 minutes.
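A minimal sketch of that integration, using only the Python standard library against the Gemini REST API's `generateContent` endpoint. The endpoint path and model name match the public API at the time of writing; the network call only runs when a `GEMINI_API_KEY` environment variable is set (obtainable from Google AI Studio).

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("GEMINI_API_KEY")  # set via Google AI Studio
MODEL = "gemini-2.0-flash-exp"
URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent?key={API_KEY}"
)

def build_payload(prompt: str) -> dict:
    """Minimal generateContent request body: a single user turn."""
    return {"contents": [{"parts": [{"text": prompt}]}]}

payload = build_payload("Explain what an AI agent is in one sentence.")

if API_KEY:  # skip the request when no key is configured
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
        # First candidate's text part holds the generated answer
        print(body["candidates"][0]["content"]["parts"][0]["text"])
```

The official client libraries (Python, JavaScript, Go, and others) wrap this same endpoint; the raw REST form is shown here only to keep the sketch dependency-free.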
- Experimental: Gemini 2.0 Flash Experimental, Gemini Experimental 1121, Gemini Experimental 1206
- Gemini Pro: Gemini 1.5 Pro
- Gemini Flash: Gemini 1.5 Flash, Gemini 1.5 Flash-8B