Gemini Models
The most general and capable AI models we've ever built.
Our most flexible models yet
Each Gemini model is built for its own set of use cases, forming a versatile model family that runs efficiently on everything from data centers to on-device hardware.
Project Astra
Project Astra explores the future of AI assistants. Building on our Gemini models, we’ve developed AI agents that can quickly process multimodal information, reason about the context you’re in, and respond to questions at a conversational pace, making interactions feel much more natural.
The demo shows two continuous takes: one with the prototype running on a Google Pixel phone and another on a prototype glasses device.
Natively multimodal
Gemini models are built from the ground up for multimodality, seamlessly combining and understanding text, code, images, audio, and video.
Longer context
1.5 Pro and 1.5 Flash both have a default context window of up to one million tokens — the longest context window of any large-scale foundation model. They achieve near-perfect recall on long-context retrieval tasks across modalities, unlocking the ability to process long documents, thousands of lines of code, hours of audio, video, and more. For 1.5 Pro, developers and enterprise customers can also sign up to try a two-million-token context window.
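As a rough sense of scale, a one-million-token window can be translated into everyday quantities with back-of-envelope arithmetic. The conversion rates below are assumptions (common rules of thumb, not official figures for any particular tokenizer):

```python
# Back-of-envelope estimate of what a one-million-token context window can
# hold. Assumed rates (rules of thumb, not official tokenizer figures):
# ~0.75 English words per token, ~10 tokens per line of code,
# ~32 tokens per second of audio.
CONTEXT_TOKENS = 1_000_000

capacity = {
    "english_words": CONTEXT_TOKENS * 0.75,          # ~750,000 words
    "lines_of_code": CONTEXT_TOKENS / 10,            # ~100,000 lines
    "hours_of_audio": CONTEXT_TOKENS / (32 * 3600),  # ~8.7 hours
}

for item, amount in capacity.items():
    print(f"{item}: ~{amount:,.0f}")
```

Under these assumptions the window comfortably covers the "long documents, thousands of lines of code, hours of audio" described above.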
Research
Relentless innovation
Our research team is continually exploring new ideas at the frontier of AI, building innovative products that show consistent progress on a range of benchmarks. Our newest model is Gemini 1.5 Flash.
| Capability | Benchmark | Description | Gemini 1.0 Pro | Gemini 1.0 Ultra | Gemini 1.5 Pro (Feb 2024) | Gemini 1.5 Flash | Gemini 1.5 Pro (May 2024) |
|---|---|---|---|---|---|---|---|
| General | MMLU | Representation of questions in 57 subjects (incl. STEM, humanities, and others) | 71.8% | 83.7% | 81.9% | 78.9% | 85.9% |
| Code | Natural2Code | Python code generation. Held-out HumanEval-like dataset, not leaked on the web | 69.6% | 74.9% | 77.7% | 77.2% | 82.6% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 32.6% | 53.2% | 58.5% | 54.9% | 67.7% |
| Reasoning | GPQA (main) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 27.9% | 35.7% | 41.5% | 39.5% | 46.2% |
| Reasoning | Big-Bench Hard | Diverse set of challenging tasks requiring multi-step reasoning | 75.0% | 83.6% | 84.0% | 85.5% | 89.2% |
| Multilingual | WMT23 | Language translation | 71.7 | 74.4 | 75.2 | 74.1 | 75.3 |
| Image | MMMU | Multi-discipline college-level reasoning problems | 47.9% | 59.4% | 58.5% | 56.1% | 62.2% |
| Image | MathVista | Mathematical reasoning in visual contexts | 46.6% | 53.0% | 54.7% | 58.4% | 63.9% |
| Audio | FLEURS (55 languages) | Automatic speech recognition (word error rate; lower is better) | 6.4% | 6.0% | 6.6% | 9.8% | 6.5% |
| Video | EgoSchema | Video question answering | 55.7% | 61.5% | 65.1% | 65.7% | 72.2% |
Technical reports
For developers
Build with Gemini
Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.
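As a starting point, a call to the Gemini API boils down to posting a JSON body to a model's `generateContent` endpoint. The sketch below builds such a request; the endpoint path and field names follow the publicly documented REST shape, while the prompt, model choice, and generation parameters are illustrative placeholders:

```python
# Minimal sketch of a generateContent request for the Gemini API (Google AI
# Studio). Field names follow the documented v1beta REST shape; the prompt
# and generation parameters are illustrative placeholders.
MODEL = "gemini-1.5-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

request_body = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Summarize the attached report in three bullet points."}],
        }
    ],
    "generationConfig": {"temperature": 0.7, "maxOutputTokens": 256},
}
# Send request_body as JSON to ENDPOINT with any HTTP client, passing your
# API key (e.g. in the x-goog-api-key header).
```

Multimodal inputs follow the same pattern: additional entries in `parts` carry image, audio, or video data alongside the text prompt.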
Try the models
Get started
Example prompts for the Gemini API in Google AI Studio.
Responsibility at the core
Our models undergo extensive ethics and safety tests, including adversarial testing for bias and toxicity.
Serving billions of Google users
Gemini models are embedded in a range of Google experiences.
What's new
- Technologies: "Gemini breaks new ground: a faster model, longer context and AI agents." We're introducing a series of updates across the Gemini family of models, including the new 1.5 Flash, our lightweight model for speed and efficiency, and Project Astra, our vision for the future...
- Technologies: "Our next-generation model: Gemini 1.5." The model delivers dramatically enhanced performance, with a breakthrough in long-context understanding across modalities.
- Technologies: "The next chapter of our Gemini era." We're bringing Gemini to more Google products.