Gemini Audio

Live dialogue

Fluid, natural live dialogue and translation capabilities for powerful voice-first applications.

Models

Create agents capable of handling complex tasks and using tools, while engaging in natural-sounding conversations.



Live speech translation

Uses Gemini’s speech-to-speech translation capabilities to break down language barriers.

Broad language coverage

Delivers fluid speech-to-speech translation across 70+ languages and 2,000 language pairs, dissolving communication barriers in real time.

Consistent intonation

Preserves the original speaker’s intonation, pacing and pitch, to convey not just what’s said, but how it’s said.

Multilingual input

Understands multiple languages in a single session, to help you follow multilingual conversations without changing any settings.

Automatic language detection

Identifies the language being spoken and starts translating, so you don’t need to specify it yourself.

Noise robustness

Filters out ambient noise so you can hold conversations comfortably, even in loud outdoor environments.



Model information

Name: 3.1 Flash Live
Status: Preview
Input:
  • Text
  • Image
  • Video
  • Audio
Output:
  • Text
  • Audio
Input tokens: 128k
Output tokens: 64k
Knowledge cutoff: January 2025
Availability:
  • Gemini App
  • Google AI Studio
  • Gemini API
  • Google Antigravity
  • NotebookLM
Documentation: View developer docs
Model card: View model card
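
The input and output token limits listed in the model information can be checked on the client side before a session is opened. The following is a minimal sketch, not part of any Gemini SDK: `TokenBudget`, `fits`, and `input_headroom` are hypothetical names, and reading "128k" and "64k" as 128,000 and 64,000 tokens is an assumption (the exact limits may differ, so consult the developer docs).

```python
from dataclasses import dataclass

# Limits taken from the model information above. Treating "k" as 1,000
# tokens is an assumption; the precise values may differ.
INPUT_TOKEN_LIMIT = 128_000
OUTPUT_TOKEN_LIMIT = 64_000


@dataclass
class TokenBudget:
    """Hypothetical helper: holds the token counts planned for one request."""

    input_tokens: int
    requested_output_tokens: int

    def fits(self) -> bool:
        """Return True if both counts are within the assumed limits."""
        return (
            0 <= self.input_tokens <= INPUT_TOKEN_LIMIT
            and 0 <= self.requested_output_tokens <= OUTPUT_TOKEN_LIMIT
        )

    def input_headroom(self) -> int:
        """Return how many input tokens remain (negative if over the limit)."""
        return INPUT_TOKEN_LIMIT - self.input_tokens


if __name__ == "__main__":
    budget = TokenBudget(input_tokens=100_000, requested_output_tokens=8_000)
    print(budget.fits(), budget.input_headroom())
```

A check like this only guards against obviously oversized requests. How audio, image, and video inputs are counted as tokens is not stated here, so the input count would have to come from whatever token-counting facility the API provides.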
