Gemini Audio

Live dialogue

Fluid, natural live dialogue and translation capabilities for powerful voice-first applications.

Models

Create agents capable of handling complex tasks and using tools, while engaging in natural-sounding conversations.



Live speech translation

Uses Gemini’s speech-to-speech translation capabilities to break down language barriers.

Broad language coverage

Delivers fluid speech-to-speech translation across 70+ languages and 2,000 language pairs.

Consistent intonation

Preserves the speaker’s original intonation, pacing and pitch to capture not just what they said, but how they said it.

Multilingual input

Translates multiple languages in a single session – no need to change the settings.

Automatic language detection

Identifies the spoken language and begins translating without being told what it is.

Noise robustness

Filters out ambient noise so audio stays crisp and clear, even in loud outdoor environments.

Real-time latency

Minimizes processing lag to translate speech instantly and eliminate awkward pauses – keeping conversations flowing naturally.



Model information

Name: 3.1 Flash Live
Status: Preview
Input:
  • Text
  • Image
  • Video
  • Audio
Output:
  • Text
  • Audio
Input tokens: 128k
Output tokens: 64k
Knowledge cutoff: January 2025
Availability:
  • Gemini App
  • Google AI Studio
  • Gemini API
  • Google Antigravity
  • NotebookLM
Documentation: View developer docs
Model card: View model card

Name: 3.5 Live Translate
Status: Preview
Input:
  • Audio
Output:
  • Text
  • Audio
Input tokens: 128k
Output tokens: 64k
Knowledge cutoff: January 2025
Availability:
  • Google Translate
  • Google AI Studio
  • Gemini API
Documentation: View developer docs
Model card: View model card
