Gemini Flash
Lightweight models, two variants, both optimized for speed and efficiency
Lightweight, fast, and cost-efficient models featuring multimodal reasoning and a breakthrough context window of up to one million tokens.
Small, smaller
Flash now comes in two compact variants, giving you flexibility for whatever you choose to build.
Performance in a flash
Designed to be fast and efficient to serve at scale.
Built for speed
Sub-second average first-token latency for the vast majority of developer and enterprise use cases.
Quality at lower cost
On most common tasks, 1.5 Flash models achieve comparable quality to larger models, at a fraction of the cost.
Long-context understanding
Process hours of video and audio, and hundreds of thousands of words or lines of code.
Longer context
Flash models have a one-million-token context window by default, which means you can process one hour of video, 11 hours of audio, codebases with more than 30,000 lines of code, or over 700,000 words.
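As a rough sanity check, the word figure lines up with the one-million-token window if you assume roughly 1.4 tokens per English word (a common approximation; actual tokenization varies by content and tokenizer):

```python
# Back-of-envelope check of the one-million-token context window,
# assuming ~1.4 tokens per English word (an approximation, not an
# exact property of any particular tokenizer).
CONTEXT_WINDOW_TOKENS = 1_000_000
TOKENS_PER_WORD = 1.4  # assumed average

max_words = int(CONTEXT_WINDOW_TOKENS / TOKENS_PER_WORD)
print(max_words)  # -> 714285, consistent with "over 700,000 words"
```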
Relentless innovation
Our research team is continually exploring new ideas at the frontier of AI, building innovative products that show consistent progress on a range of benchmarks.
| Capability | Benchmark | Description | Gemini 1.5 Flash-8B (Oct 2024) | Gemini 1.5 Flash (May 2024) | Gemini 1.5 Flash (Sep 2024) | Gemini 1.5 Pro (May 2024) | Gemini 1.5 Pro (Sep 2024) |
|---|---|---|---|---|---|---|---|
| General | MMLU-Pro | Enhanced version of the popular MMLU dataset, with higher-difficulty questions across multiple subjects | 58.7% | 59.1% | 67.3% | 69.0% | 75.8% |
| Code | Natural2Code | Code generation across Python, Java, C++, JS, and Go; held-out, HumanEval-like dataset not leaked on the web | 75.5% | 77.2% | 79.8% | 82.6% | 85.4% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 58.7% | 54.9% | 77.9% | 67.7% | 86.5% |
| Math | HiddenMath | Competition-level math problems; held-out, AIME/AMC-like dataset crafted by experts and not leaked on the web | 32.8% | 20.3% | 47.2% | 28.0% | 52.0% |
| Reasoning | GPQA (diamond) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 38.4% | 41.4% | 51.0% | 46.0% | 59.1% |
| Multilingual | WMT23 | Language translation | 72.6 | 74.1 | 73.9 | 75.3 | 75.1 |
| Long Context | MRCR (1M) | Diagnostic long-context understanding evaluation | 54.7% | 70.1% | 71.9% | 70.5% | 82.6% |
| Image | MMMU | Multi-discipline, college-level multimodal understanding and reasoning problems | 53.7% | 56.1% | 62.3% | 62.2% | 65.9% |
| Image | Vibe-Eval (Reka) | Visual understanding in chat models with challenging everyday examples; evaluated with a Gemini Flash model as a rater | 40.9% | 44.8% | 48.9% | 48.9% | 53.9% |
| Image | MathVista | Mathematical reasoning in visual contexts | 54.7% | 58.4% | 65.8% | 63.9% | 68.1% |
| Audio | FLEURS (55 lang) | Automatic speech recognition (word error rate; lower is better) | 13.6% | 9.8% | 9.6% | 6.5% | 6.7% |
| Video | Video-MME | Video analysis across multiple domains | 66.2% | 74.7% | 76.1% | 77.9% | 78.6% |
| Safety | XSTest | Measures how often models refuse to respond to safe/benign prompts; the score represents how frequently models correctly fulfill requests | 92.6% | 86.9% | 97.0% | 88.4% | 98.8% |
Research
Technical reports
For developers
Build with Gemini
Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.
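A minimal sketch of calling a Flash model through the public Generative Language REST API (v1beta). The model name and endpoint reflect the documented `generateContent` method; the API key and prompt are placeholders you would replace with your own:

```python
import json
import urllib.request

# generateContent endpoint of the Generative Language API (v1beta).
MODEL = "gemini-1.5-flash"
API_KEY = "YOUR_API_KEY"  # placeholder: obtain a key from Google AI Studio
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)

# Request body: a single user turn containing one text part.
payload = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Explain context windows in one sentence."}],
        }
    ]
}

request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request (requires a valid API key):
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
#     print(body["candidates"][0]["content"]["parts"][0]["text"])
```

The same payload shape works through the official SDKs and through Vertex AI, which differ mainly in authentication and endpoint.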