Gemini 3.1 Flash-Lite

Best for high-volume tasks that need efficiency and intelligence

Introducing 3.1 Flash-Lite, a scalable thinking model for high-volume tasks at low cost and latency.



Performance

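3.1 Flash-Lite outperforms 2.5 Flash on most of the benchmarks below, spanning reasoning (GPQA Diamond, 86.9% vs. 82.8%), multimodal understanding (MMMU-Pro, 76.8% vs. 66.7%), multilingual Q&A (MMMLU, 88.9% vs. 86.6%) and parametric factuality (SimpleQA Verified, 43.3% vs. 28.1%). The exceptions are the FACTS Benchmark Suite (40.6% vs. 50.4%) and MRCR v2 at 1M tokens (12.3% vs. 21.0%), where 2.5 Flash scores higher.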

| Benchmark | Notes | Gemini 3.1 Flash-Lite High | Gemini 2.5 Flash Dynamic | Gemini 2.5 Flash-Lite Dynamic | GPT-5 mini High | Claude 4.5 Haiku Extended Thinking | Grok 4.1 Fast Reasoning |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Input price: $/1M tokens, no caching | Lower is better | $0.25 | $0.30 | $0.10 | $0.25 | $1.00 | $0.20 |
| Output price: $/1M tokens | Lower is better | $1.50 | $2.50 | $0.40 | $2.00 | $5.00 | $0.50 |
| Output speed: tokens/s | | 363 | 249 | 366 | 71 | 108 | 145 |
| Humanity’s Last Exam: Academic reasoning (full set, text + MM) | No tools | 16.0% | 11.0% | 6.9% | 16.7% | 9.7% | 17.6% |
| GPQA Diamond: Scientific knowledge | No tools | 86.9% | 82.8% | 66.7% | 82.3% | 73.0% | 84.3% |
| MMMU-Pro: Multimodal understanding and reasoning | No tools | 76.8% | 66.7% | 51.0% | 74.1% | 58.0% | 63.0% |
| CharXiv Reasoning: Information synthesis from complex charts | | 73.2% | 63.7% | 55.5% | 75.5% (+ python) | 61.7% | 31.6% |
| Video-MMMU: Knowledge acquisition from videos | | 84.8% | 79.2% | 60.7% | 82.5% | 74.6% | |
| SimpleQA Verified: Parametric knowledge | | 43.3% | 28.1% | 11.5% | 9.5% | 5.5% | 19.5% |
| FACTS Benchmark Suite: Factuality benchmark across grounding, parametric, search, and MM | | 40.6% | 50.4% | 17.9% | 33.7% | 18.6% | 42.1% |
| MMMLU: Multilingual Q&A | | 88.9% | 86.6% | 84.5% | 84.9% | 83.0% | 86.8% |
| LiveCodeBench: Code generation (UI: 1/1/2025-5/1/2025) | | 72.0% | 62.6% | 34.3% | 80.4% | 53.2% | 76.5% |
| MRCR v2 (8-needle): Long context performance | 128k (average) | 60.1% | 54.3% | 30.6% | 52.5% | 35.3% | 54.6% |
| MRCR v2 (8-needle): Long context performance | 1M (pointwise) | 12.3% | 21.0% | 5.4% | Not supported | Not supported | 6.1% |
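
For high-volume workloads, the per-token prices listed above translate directly into a cost estimate. The sketch below is one way to do that arithmetic in Python. The prices are copied from the benchmark rows above (no-caching input rate); the `Pricing` class, the `estimate_cost` helper and the example workload are illustrative assumptions, not part of any Gemini SDK, and real bills may differ once caching or other discounts apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens (hypothetical helper type, not an SDK class)."""
    input_per_million: float
    output_per_million: float


# Prices copied from the benchmark table above (input price without caching).
PRICES = {
    "Gemini 3.1 Flash-Lite": Pricing(0.25, 1.50),
    "Gemini 2.5 Flash": Pricing(0.30, 2.50),
    "Gemini 2.5 Flash-Lite": Pricing(0.10, 0.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload, ignoring caching and discounts."""
    p = PRICES[model]
    total = input_tokens * p.input_per_million + output_tokens * p.output_per_million
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 10,000 requests of 2,000 input and 500 output tokens each.
    requests = 10_000
    for name in PRICES:
        cost = estimate_cost(name, requests * 2_000, requests * 500)
        print(f"{name}: ${cost:.2f}")
```

For this assumed workload (20M input tokens, 5M output tokens) the estimate comes to about $12.50 for 3.1 Flash-Lite, $18.50 for 2.5 Flash and $4.00 for 2.5 Flash-Lite.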

Model information

Name: 3.1 Flash-Lite
Status: Preview
Input:
  • Text
  • Image
  • Video
  • Audio
  • PDF
Output:
  • Text
Input tokens: 1M
Output tokens: 64k
Knowledge cutoff: January 2025
Tool use:
  • Function calling
  • Structured output
  • Search as a tool
  • Code execution
Best for:
  • High-volume, latency-sensitive reasoning tasks
Availability:
  • Google AI Studio
  • Gemini API
  • Vertex AI
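
The input and output token limits listed above can be checked before a request is sent. The following is a minimal sketch of such a pre-flight check. It assumes that "1M" and "64k" mean 1,000,000 and 64,000 tokens (the exact limits enforced by the API may differ) and that token counts have already been obtained elsewhere; `check_request` is a hypothetical helper, not an SDK function.

```python
# Limits from the model information above. Assumption: "1M" and "64k" are read
# as round decimal values; the exact limits enforced by the API may differ.
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000


def check_request(input_tokens: int, max_output_tokens: int) -> list[str]:
    """Return the reasons a request would not fit; an empty list means it fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        over = input_tokens - MAX_INPUT_TOKENS
        problems.append(f"input is {over} tokens over the {MAX_INPUT_TOKENS} limit")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        over = max_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output is {over} tokens over the {MAX_OUTPUT_TOKENS} limit")
    return problems


if __name__ == "__main__":
    # A request well inside both limits, then one that exceeds both.
    print(check_request(250_000, 8_000))
    print(check_request(1_200_000, 70_000))
```

Returning a list of problems rather than a single boolean lets a batch pipeline log why a request was rejected and decide whether to split the input or lower the output cap.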