DataGemma
Fine-tuned Gemma 2 models that integrate retrieval techniques to ground responses in real-world data
The world’s first open models designed to help address the challenges of hallucination by grounding LLMs in the vast, real-world statistical data of Google's Data Commons.
Model series
- Retrieval-Interleaved Generation (RIG)
  This approach uses a variant of Gemma 2 that is fine-tuned to recognize when it needs to replace a generated number with more accurate information from Data Commons.
- Retrieval-Augmented Generation (RAG)
  This approach uses a variant of Gemma 2 that retrieves relevant information from Data Commons and then uses that information to create an extended prompt for the Gemini 1.5 Pro model.
Run or download RIG
Run or download RAG
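The difference between the two approaches can be illustrated with a minimal sketch. This is not the DataGemma implementation: the in-memory `FAKE_STORE` is a hypothetical stand-in for Data Commons, the `[DC("query") --> "guess"]` annotation syntax is an assumption made for this example, the figures are placeholders, and the function names `rig_postprocess` and `rag_build_prompt` are invented for illustration. In a real system, the fine-tuned model would produce the annotations (RIG) or the retrieval queries (RAG).

```python
import re

# Hypothetical stand-in for Data Commons: maps a natural-language query to a
# (value, source) pair. The figures are placeholders, not real statistics.
FAKE_STORE = {
    "population of Exampleland": ("1,234,567", "placeholder source"),
}

# Assumed annotation syntax for this sketch: [DC("query") --> "model's guess"]
ANNOTATION = re.compile(r'\[DC\("([^"]+)"\) --> "([^"]*)"\]')


def lookup(query):
    """Return (value, source) for a query, or None if the store has no answer."""
    return FAKE_STORE.get(query)


def rig_postprocess(draft):
    """RIG-style: replace each annotated number with the retrieved value."""
    def substitute(match):
        query, guess = match.group(1), match.group(2)
        hit = lookup(query)
        if hit is None:
            return guess  # nothing retrieved: keep the model's own number
        value, source = hit
        return f"{value} [source: {source}]"
    return ANNOTATION.sub(substitute, draft)


def rag_build_prompt(question, queries):
    """RAG-style: retrieve first, then prepend the results to the question."""
    facts = []
    for q in queries:
        hit = lookup(q)
        if hit is not None:
            value, source = hit
            facts.append(f"- {q}: {value} (source: {source})")
    context = "\n".join(facts) if facts else "- (no data retrieved)"
    return f"Use only the data below to answer.\n{context}\n\nQuestion: {question}"


draft = 'Exampleland has [DC("population of Exampleland") --> "1.1 million"] residents.'
print(rig_postprocess(draft))
print(rag_build_prompt("How many people live in Exampleland?",
                       ["population of Exampleland"]))
```

In the RIG-style path the retrieved value is spliced into text the model has already drafted, while in the RAG-style path retrieval happens before generation and the results travel in the prompt that a second, long-context model would answer from.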
Capabilities
- Generate answers with real data
  Explore and uncover verifiable insights by simply asking questions in plain language.
- Evaluate AI data grounding techniques
  Investigate ways to guide generative AI model output with retrieval-augmented (RAG) and retrieval-interleaved (RIG) techniques.