
The most general and capable AI models we've ever built.

Our most flexible models yet

Each Gemini model is built for its own set of use cases, making for a versatile model family that runs efficiently everywhere from data centers to on-device.

Project Astra

Project Astra explores the future of AI assistants. Building on our Gemini models, we’ve developed AI agents that can quickly process multimodal information, reason about the context you’re in, and respond to questions at a conversational pace, making interactions feel much more natural.

The demo shows two continuous takes: one with the prototype running on a Google Pixel phone and another on a prototype glasses device.

Natively multimodal

Gemini models are built from the ground up for multimodality, seamlessly combining and understanding text, code, images, audio, and video.

The following examples are a descriptive representation of what Gemini can do:

Gemini models can generate code based on different kinds of inputs.

Gemini

I see a murmuration of starlings, so I coded a flocking simulation.

Gemini models can generate text and images, combined.

Could Gemini show me ideas for what to make?

Gemini

How about an octopus with blue and pink tentacles?

Gemini models can understand and perform tasks involving several different written languages.

Could Gemini explain what this means?

Gemini

I see the time signature is 6/8. This means there are 6 eighth notes in each measure.

The dynamic marking is piano, which means to play softly. Andante grazioso means to play at a graceful walking pace.
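As a concrete illustration of a multimodal request, here is a minimal sketch using the google-generativeai Python SDK. The API key, model name, and image file are placeholders, and the exact SDK surface may differ by version.

```python
import google.generativeai as genai
import PIL.Image

# Configure the SDK with your API key (placeholder).
genai.configure(api_key="YOUR_API_KEY")

# Gemini 1.5 models accept mixed text and image parts in a single prompt.
model = genai.GenerativeModel("gemini-1.5-flash")
image = PIL.Image.open("sheet_music.png")  # hypothetical input image

# Text and image are combined in one generate_content call.
response = model.generate_content(
    ["Explain the time signature and dynamic markings in this score.", image]
)
print(response.text)
```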

Longer context

Gemini 1.5 Pro and 1.5 Flash both have a default context window of up to one million tokens, the longest of any large-scale foundation model. They achieve near-perfect recall on long-context retrieval tasks across modalities, unlocking the ability to process long documents, thousands of lines of code, and hours of audio and video. For 1.5 Pro, developers and enterprise customers can also sign up to try a two-million-token context window.
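As an illustrative sketch (not an official recipe), long inputs can be uploaded through the SDK's File API and the token budget checked before prompting. The file name below is hypothetical.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

# Upload a long recording via the File API (hypothetical file).
recording = genai.upload_file("all_hands_meeting.mp3")

# Check how much of the ~1M-token context window the file consumes.
print(model.count_tokens([recording]))

# The whole recording fits in one prompt; no chunking or retrieval needed.
response = model.generate_content([recording, "List the key decisions made."])
print(response.text)
```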

Research

Relentless innovation

Our research team continually explores new ideas at the frontier of AI, building models that show consistent progress across a range of benchmarks.

| Capability | Benchmark | Description | Gemini 1.5 Flash (May 2024) | Gemini 1.5 Flash (Sep 2024) | Gemini 1.5 Pro (May 2024) | Gemini 1.5 Pro (Sep 2024) |
| --- | --- | --- | --- | --- | --- | --- |
| General | MMLU-Pro | Enhanced version of the popular MMLU dataset, with higher-difficulty questions across multiple subjects (incl. STEM, humanities, and others) | 59.1% | 67.3% | 69.0% | 75.8% |
| Code | Natural2Code | Code generation across Python, Java, C++, JS, and Go. Held-out HumanEval-like dataset, not leaked on the web | 77.2% | 79.8% | 82.6% | 85.4% |
| Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 54.9% | 77.9% | 67.7% | 86.5% |
| Math | HiddenMath | Competition-level math problems. Held-out AIME/AMC-like dataset, crafted by experts and not leaked on the web | 20.3% | 47.2% | 28.0% | 52.0% |
| Reasoning | GPQA (diamond) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 41.4% | 51.0% | 46.0% | 59.1% |
| Multilingual | WMT23 | Language translation | 74.1 | 73.9 | 75.3 | 75.1 |
| Long Context | RULER (at 1M) | Diagnostic suite checking the models' long-context ability over a range of tasks | 69.6% | 82.3% | 40.1% | 86.4% |
| Long Context | MRCR (1M) | Diagnostic long-context understanding evaluation | 70.1% | 71.9% | 70.5% | 82.6% |
| Image | MMMU | Multi-discipline college-level reasoning problems | 56.1% | 62.3% | 62.2% | 65.9% |
| Image | Vibe-Eval (Reka) | Visual understanding in chat models with challenging everyday examples. Evaluated with a Gemini Flash model as a rater | 44.8% | 48.9% | 48.9% | 53.9% |
| Image | MathVista | Mathematical reasoning in visual contexts | 58.4% | 65.8% | 63.9% | 68.1% |
| Audio | FLEURS (55 languages) | Automatic speech recognition (based on word error rate; lower is better) | 9.8% | 9.6% | 6.5% | 6.7% |
| Video | Video-MME | Video analysis across multiple domains | 74.7% | 76.1% | 77.9% | 78.6% |
| Safety | XSTest | Measures how often models refuse to respond to safe/benign prompts; the score represents how frequently models correctly fulfill requests | 86.9% | 97.0% | 88.4% | 98.8% |

Technical reports

For developers

Build with Gemini

Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.

Try the models

Get started

Example prompts for the Gemini API in Google AI Studio.
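For orientation, a minimal text-only quickstart with the Gemini API might look like the sketch below (install the SDK with `pip install google-generativeai`; the API key is a placeholder obtained from Google AI Studio).

```python
import google.generativeai as genai

# Create an API key in Google AI Studio and supply it here (placeholder).
genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Write a haiku about long context windows.")
print(response.text)
```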

Responsibility at the core

Our models undergo extensive ethics and safety tests, including adversarial testing for bias and toxicity.

Hands-on

Serving billions of Google users

Gemini models are embedded in a range of Google experiences.

What's new

Get the latest updates

Sign up for news on the latest innovations from Google DeepMind.