Gemma 3 is at the core of Lettria’s new text-to-graph model that sets a new standard for enterprise-grade reasoning and reliability.
Lettria is a Paris-based startup composed of 15 passionate data scientists and software engineers working to improve the performance and reliability of AI agents through training with knowledge graphs.
The team at Lettria fine-tuned Gemma 3 to create its Perseus model for superior knowledge graph creation, enabling it to build and populate graph databases from unstructured text; those databases can then be used by graph-based AI agents.
The team initially worked with Gemini 2.5 Pro for knowledge graph creation—and while powerful, it presented several key obstacles for Lettria’s particular use case. Julien Plu, Research Scientist at Lettria, explained, “A primary issue with Gemini was the model's inconsistent adherence to our specified ontologies and output schemas. This frequently led to unusable outputs and a subsequent loss of information in the final knowledge graph.”
The team also had a few requirements that cloud-based models couldn’t satisfy: keeping clients’ sensitive data under their own control, transparent and predictable costs, and the freedom to fine-tune and extend the model with new data and use cases.
The team decided to look into open, lightweight models as an alternative. When tested alongside models like Qwen and Mistral, Gemma proved to be the ideal fit for Lettria for several reasons. Training felt more straightforward and efficient, according to Lettria CEO Charles Borderie, and Gemma’s multimodal performance was “a significant factor” in the decision. Gemma also offered better multilingual capabilities, future-proofing Lettria should the company expand beyond English.
Our fine-tuned Gemma models outperform all the very large closed-source models for graph building, Gemini included.
To create Lettria Perseus, Gemma 3 4B, 12B, and 27B all underwent the same fine-tuning process using the Transformers Python library, ensuring compatibility with a wide array of training frameworks, inference engines, and modeling libraries. Because the models stay in the Transformers format, inference can run on any cloud provider offering GPUs, accelerated by the vLLM framework.
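As an illustration of how light this serving path is, the snippet below loads a Transformers-format checkpoint with vLLM; the model path is a hypothetical placeholder, not Lettria’s published weights.

# Minimal serving sketch: any Gemma 3 checkpoint saved in Transformers format
# can be loaded by vLLM. "./perseus-gemma-3-4b" is a hypothetical local path.
from vllm import LLM, SamplingParams

llm = LLM(model="./perseus-gemma-3-4b")

# Deterministic decoding is typical for structured extraction tasks.
params = SamplingParams(temperature=0.0, max_tokens=1024)
prompt = "Extract the entities and relations from the following text: ..."

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)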
The model was trained on a specialized text-to-graph dataset divided into a training set of 9,500 examples and a test set of 2,000 examples, drawn from 19 distinct ontologies. In total, the dataset contains 24,000 triples, with each example’s entities and relations annotated against their corresponding ontology. Training on this dataset helped eliminate information loss, ensure schema consistency, and improve factual accuracy.
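The post doesn’t show the dataset schema, but a text-to-graph example of this kind plausibly pairs a passage with its ontology and gold triples; the structure and field names below are illustrative assumptions, not Lettria’s actual format.

# Hypothetical shape of one text-to-graph training example. The field names
# and values are assumptions for illustration only.
example = {
    "ontology": "corporate_events",  # one of the 19 distinct ontologies
    "text": "Acme Corp acquired Globex in 2021.",
    "triples": [
        # (subject, relation, object), annotated against the ontology
        ("Acme Corp", "acquired", "Globex"),
        ("Globex", "acquisition_year", "2021"),
    ],
}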
The team provided the specific parameters used to train the model:
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter configuration: rank 128 with a scaling factor (alpha) of 512
peft_config = LoraConfig(
    r=128,
    lora_alpha=512,
)

training_args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # effective batch size of 8 per device
    num_train_epochs=3,
    warmup_steps=5,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    optim="adamw_8bit",             # 8-bit AdamW to reduce optimizer memory
    weight_decay=0.01,
)
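The post doesn’t name the training loop these objects feed into; the sketch below assumes TRL’s SFTTrainer, a common choice for LoRA fine-tuning with Transformers, and uses a toy dataset in place of Lettria’s proprietary one.

# Hedged sketch: wiring the configuration above into a LoRA fine-tune with
# TRL's SFTTrainer (an assumption; the post does not name the trainer).
from datasets import Dataset
from trl import SFTTrainer

# Toy stand-in for the 9,500-example text-to-graph training set.
train_dataset = Dataset.from_list([
    {"text": "Extract triples: Acme Corp acquired Globex. -> (Acme Corp, acquired, Globex)"},
])

trainer = SFTTrainer(
    model="google/gemma-3-4b-pt",  # base checkpoint; Lettria also tuned 12B and 27B
    args=training_args,            # TrainingArguments defined above
    peft_config=peft_config,       # LoRA rank/alpha defined above
    train_dataset=train_dataset,
)
trainer.train()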
To evaluate the efficacy of their training, the team developed a reliability scoring process to measure the percentage of model outputs that the system can successfully process. Gemma 3 models outperformed Gemini 2.5 Pro, with the 27B parameter version achieving the highest score of 99.8%. The smaller Gemma 3 4B also scored a high 99.01%, while Gemini 2.5 Pro scored 93.99%.
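The post doesn’t publish the scoring code; one plausible reading, sketched below, counts an output as reliable when every line parses into a well-formed triple whose relation exists in the target ontology. The output format here is an assumption, not Lettria’s documented schema.

# Hedged sketch of a reliability score: the share of model outputs the
# downstream system can ingest. The "(subject, relation, object)" line
# format is an assumed output convention.
import re

TRIPLE_RE = re.compile(r"^\((.+?),\s*(.+?),\s*(.+?)\)$")

def is_processable(output: str, ontology_relations: set[str]) -> bool:
    """An output is processable if every line is a well-formed triple
    whose relation is defined in the target ontology."""
    lines = [l for l in output.strip().splitlines() if l.strip()]
    if not lines:
        return False
    for line in lines:
        m = TRIPLE_RE.match(line.strip())
        if m is None or m.group(2) not in ontology_relations:
            return False
    return True

def reliability_score(outputs: list[str], ontology_relations: set[str]) -> float:
    """Percentage of outputs the system can successfully process."""
    ok = sum(is_processable(o, ontology_relations) for o in outputs)
    return 100.0 * ok / len(outputs)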
[Chart highlighting the superior F1 scores of the Gemma-based Lettria Perseus]
The open nature of Gemma 3 helped Lettria address many of the team’s initial concerns. Clients and stakeholders feel more comfortable having sensitive data handled by Lettria’s Gemma 3-based solution, and the fine-tuning approach scales: the team can add new data and use cases as needed instead of relying on out-of-the-box performance.
Choosing Gemma 3 also means more transparent and predictable expenses. In the benchmarking phase, processing the team’s dataset with Gemini 2.5 Pro cost 86% more than with Gemma 3 4B, a savings margin that will add up over time as usage increases.
[Chart comparing pricing between Gemma and Gemini 2.5 Pro]
Each Gemma 3 model size meets a different use case: the team found that the 27B version is the slowest but most accurate, the 4B version is fast but less accurate, and the 12B version strikes a balance between the two. Lettria presents all three to clients so each can choose the best fit. Borderie noted that the client presentations so far have been a resounding success, adding, "The feedback is amazing."
We are targeting AI builders that have already launched a first generation of agents and are looking for new options to improve their reliability and performance with Gemma.
The team is eager to keep improving the Gemma 3 models behind Lettria Perseus, now focusing on optimizing them for inference. They already beat Gemini 2.5 Pro on processing time by 2 to 9 seconds, but Borderie thinks they can take it further. “Our goal is to further optimize these models to bring these times below 5 seconds. This optimization will be key to making the model as cost-effective and energy-efficient as possible, improving upon its current, acceptable operational costs.” The team is also working to expand the models’ context window to 128K tokens, which will allow greater scaling to ingest more comprehensive ontologies and larger documents.
Lettria is also developing an “Image2Graph” model that will translate images into graph structures and answer questions based on the graph data. The team plans to launch the model in the second half of 2026. “We are excited to launch our Image2Graph model, and we will continue to work with Gemma because we know it will evolve in the right directions: wider contexts, better performance, and continued developer-friendliness,” concluded Borderie.