Benchmarks
Veo 3
Veo 3 achieved state-of-the-art results against other leading video generation models in head-to-head comparisons of outputs judged by human raters.
T2V Overall preference
Participants viewed 1,003 prompts and their corresponding videos from MovieGenBench, a benchmark dataset released by Meta. Veo 3 performed best on overall preference.
T2V Text alignment
Participants viewed 1,003 prompts and their corresponding videos from MovieGenBench, a benchmark dataset released by Meta. Veo 3 performed best at following prompts accurately.
T2V Visual quality
Participants viewed 1,003 prompts and their corresponding videos from MovieGenBench, a benchmark dataset released by Meta. They rated the visual quality of Veo 3’s outputs more highly than that of other models.
I2V Overall preference
When participants viewed 355 image and text pairs from the VBench I2V benchmark, Veo 3’s outputs were preferred overall over those of other models.
I2V Text alignment
When participants viewed 355 image and text pairs from the VBench I2V benchmark, Veo 3’s outputs were preferred over those of other models for capturing the intent of the prompt.
I2V Visual quality
When participants viewed 355 image and text pairs from the VBench I2V benchmark, Veo 3’s outputs were preferred over those of other models for visual quality.
T2VA Audio-visual overall preference
Participants viewed 527 prompts from MovieGenBench and, overall, preferred Veo 3’s outputs with audio over those of other models.
T2VA Audio-video alignment
Participants viewed 527 prompts from MovieGenBench and chose Veo 3’s outputs over those of other models for having audio that is better synchronized with the video content.
T2V Visually realistic physics
Participants chose Veo 3’s outputs over those of other models for having visually realistic physics on the physics subset of MovieGenBench prompts.
Veo 2
Veo 2's reference-powered video generation and guided motion capabilities achieved state-of-the-art results on internal benchmarks in head-to-head comparisons of outputs judged by human raters.
Reference-powered video
Human raters conducted direct side-by-side comparisons across 135 diverse examples, evaluating 3 generated videos per example. The findings indicate that Veo's reference-powered video generation capability performs best among leading video generation models for subject consistency and visual quality.
Motion control
Human raters conducted direct side-by-side comparisons of Veo's guided motion outputs against Kling’s motion brush capability. Across 80 diverse examples, the raters indicated that Veo's guided motion capability performs better for visual quality, motion adherence, and overall image-to-video preference.
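The head-to-head results above are all derived from rater choices between two outputs. As a rough illustration only, the sketch below shows one way such votes could be tallied into preference rates. The label names, the `preference_rates` helper, and the example votes are hypothetical and do not reflect the actual evaluation pipeline or its data.

```python
from collections import Counter

# Hypothetical vote labels for a side-by-side comparison of two models.
LABELS = ("model_a", "model_b", "no_preference")


def preference_rates(votes):
    """Return the fraction of votes received by each label.

    `votes` is a sequence of strings drawn from LABELS, one per rater
    judgment. This is an illustrative tally, not the published method.
    """
    votes = list(votes)
    if not votes:
        raise ValueError("at least one vote is required")
    unknown = set(votes) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown vote labels: {sorted(unknown)}")
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in LABELS}


# Made-up votes purely for demonstration: 6 for A, 3 for B, 1 undecided.
example_votes = ["model_a"] * 6 + ["model_b"] * 3 + ["no_preference"]
rates = preference_rates(example_votes)
for label in LABELS:
    print(f"{label}: {rates[label]:.0%}")
```

Reporting the "no preference" share alongside the two win rates keeps ties visible instead of silently folding them into either model's score.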