Gemini 3.1 Flash-Lite

Best for high-volume tasks that need efficiency and intelligence

Introducing 3.1 Flash-Lite, a scalable thinking model for high-volume tasks at low cost and latency.



Performance

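3.1 Flash-Lite outperforms 2.5 Flash on most of the benchmarks below, spanning reasoning (GPQA Diamond: 86.9% vs. 82.8%), multimodal understanding (MMMU-Pro: 76.8% vs. 66.7%), multilingual Q&A (MMMLU: 88.9% vs. 86.6%), and parametric factuality (SimpleQA Verified: 43.3% vs. 28.1%). 2.5 Flash remains ahead on the FACTS Benchmark Suite (50.4% vs. 40.6%) and on MRCR v2 at 1M context (21.0% vs. 12.3%).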

| Benchmark | Notes | Gemini 3.1 Flash-Lite (High) | Gemini 2.5 Flash (Dynamic) | Gemini 2.5 Flash-Lite (Dynamic) | GPT-5 mini (High) | Claude 4.5 Haiku (Extended Thinking) | Grok 4.1 Fast (Reasoning) |
|---|---|---|---|---|---|---|---|
| Input price | $/1M tokens, no caching; lower is better | $0.25 | $0.30 | $0.10 | $0.25 | $1.00 | $0.20 |
| Output price | $/1M tokens; lower is better | $1.50 | $2.50 | $0.40 | $2.00 | $5.00 | $0.50 |
| Output speed | Tokens/s | 363 | 249 | 366 | 71 | 108 | 145 |
| Humanity’s Last Exam | Academic reasoning (full set, text + MM); no tools | 16.0% | 11.0% | 6.9% | 16.7% | 9.7% | 17.6% |
| GPQA Diamond | Scientific knowledge; no tools | 86.9% | 82.8% | 66.7% | 82.3% | 73.0% | 84.3% |
| MMMU-Pro | Multimodal understanding and reasoning; no tools | 76.8% | 66.7% | 51.0% | 74.1% | 58.0% | 63.0% |
| CharXiv Reasoning | Information synthesis from complex charts | 73.2% | 63.7% | 55.5% | 75.5% (+ python) | 61.7% | 31.6% |
| Video-MMMU | Knowledge acquisition from videos | 84.8% | 79.2% | 60.7% | 82.5% | 74.6% | (not given)* |
| SimpleQA Verified | Parametric knowledge | 43.3% | 28.1% | 11.5% | 9.5% | 5.5% | 19.5% |
| FACTS Benchmark Suite | Factuality across grounding, parametric, search, and MM | 40.6% | 50.4% | 17.9% | 33.7% | 18.6% | 42.1% |
| MMMLU | Multilingual Q&A | 88.9% | 86.6% | 84.5% | 84.9% | 83.0% | 86.8% |
| LiveCodeBench | Code generation (UI: 1/1/2025-5/1/2025) | 72.0% | 62.6% | 34.3% | 80.4% | 53.2% | 76.5% |
| MRCR v2 (8-needle) | Long context performance, 128k (average) | 60.1% | 54.3% | 30.6% | 52.5% | 35.3% | 54.6% |
| MRCR v2 (8-needle) | Long context performance, 1M (pointwise) | 12.3% | 21.0% | 5.4% | Not supported | Not supported | 6.1% |

* Only five values are listed for Video-MMMU. They are shown left to right; which model's column is actually empty cannot be recovered from the listing.
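
To make the pricing and speed figures concrete, here is a minimal Python sketch that scales the listed Gemini 3.1 Flash-Lite prices ($0.25 per 1M input tokens without caching, $1.50 per 1M output tokens) and output speed (363 tokens/s) to a workload. The `estimate` helper and `Estimate` type are hypothetical names used for illustration only. The time figure assumes generation proceeds at the listed output speed and ignores network latency, queueing, and time to first token, so treat it as a rough lower bound rather than a measured latency.

```python
from dataclasses import dataclass

# Figures copied from the benchmark listing for Gemini 3.1 Flash-Lite.
INPUT_USD_PER_M = 0.25     # $ per 1M input tokens, no caching
OUTPUT_USD_PER_M = 1.50    # $ per 1M output tokens
OUTPUT_TOKENS_PER_S = 363  # listed output speed


@dataclass
class Estimate:
    cost_usd: float            # total cost across all requests
    generation_seconds: float  # per-request output generation time


def estimate(input_tokens: int, output_tokens: int, requests: int = 1) -> Estimate:
    """Hypothetical helper: scale per-million-token prices to a workload."""
    per_request_micro_usd = (
        input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    )
    cost = requests * per_request_micro_usd / 1_000_000
    # Assumption: output streams at the listed speed; no other latency is modeled.
    seconds = output_tokens / OUTPUT_TOKENS_PER_S
    return Estimate(cost_usd=cost, generation_seconds=seconds)


if __name__ == "__main__":
    # One million requests, each with 2,000 input and 500 output tokens.
    e = estimate(input_tokens=2_000, output_tokens=500, requests=1_000_000)
    print(f"cost: ${e.cost_usd:,.2f}, per-request generation: {e.generation_seconds:.2f}s")
    # prints: cost: $1,250.00, per-request generation: 1.38s
```

Swapping in another column's prices and speed from the listing gives a like-for-like comparison for the same workload.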

Model information

Name: 3.1 Flash-Lite
Status: Preview
Input
  • Text
  • Image
  • Video
  • Audio
  • PDF
Output
  • Text
Input tokens: 1M
Output tokens: 64k
Knowledge cutoff: January 2025
Tool use
  • Function calling
  • Structured output
  • Search as a tool
  • Code execution
Best for
  • High-volume, latency-sensitive reasoning tasks
Availability
  • Google AI Studio
  • Gemini API
  • Vertex AI
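
The input and output token limits listed above (1M and 64k) can be enforced client-side before a request is sent. The sketch below is one way to do that in Python. It assumes the limits are round decimal values (1,000,000 and 64,000), which is a conservative reading of "1M" and "64k"; the exact figures should be confirmed in the developer docs. `check_request` is a hypothetical helper, and the token counts are assumed to come from whatever tokenizer or token-counting endpoint you already use.

```python
# Assumed, conservative readings of the listed limits ("1M" input, "64k" output).
MAX_INPUT_TOKENS = 1_000_000
MAX_OUTPUT_TOKENS = 64_000


def check_request(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Hypothetical pre-flight check; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        problems.append(
            f"input has {input_tokens:,} tokens; limit is {MAX_INPUT_TOKENS:,}"
        )
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"requested {requested_output_tokens:,} output tokens; "
            f"limit is {MAX_OUTPUT_TOKENS:,}"
        )
    return problems


if __name__ == "__main__":
    for issue in check_request(input_tokens=1_200_000, requested_output_tokens=8_000):
        print(issue)
```

Returning a list of problems rather than raising on the first one lets a batch pipeline log every violation for a request in a single pass.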