Genie 3 is a general-purpose world model. It uses simple text descriptions to generate photorealistic environments that can be explored in real time.

Towards world simulation

World models draw on a deep understanding of physical environments to simulate them. Genie 3 represents a major leap in capability, allowing agents to predict how a world evolves and how their actions affect it.

Genie 3 makes it possible to explore an unlimited range of realistic environments. This is a key stepping stone on the path to AGI, enabling AI agents capable of reasoning, problem-solving, and real-world action.

Create your own worlds

Project Genie is an experimental research prototype that lets you create and explore infinitely diverse worlds.

Capabilities

Genie 3 is the first real-time, interactive world model that generates photorealistic worlds from a simple text description.

Real-time

Allows for fluid, real-time interaction within the generated world, operating at 20–24 frames per second.
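At that rate, the model has roughly 42–50 milliseconds to generate each frame.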

Interactive and controllable

Generates interactive worlds from text, transforming envisioned landscapes into controllable realities ready to be explored.

Photorealistic quality

Renders rich, photorealistic worlds at 720p resolution. This high-fidelity output provides crucial visual detail for training agents on real-world complexities.

World consistency and stability

Previously seen details are recalled when a location is revisited, and environments remain stable under sustained interaction without degrading.


Advancing real-time interactivity

To achieve real-time controllability, Genie 3 has to recall the world it has already generated and the actions taken within it.

If a user revisits a location after a minute, the model needs to refer back to information generated a minute earlier. For real-time interactivity, this computation has to happen multiple times per second in response to user input.
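
To make this concrete, here is a minimal sketch of such a memory-conditioned generation loop in Python. The `model.initial_frame` and `model.next_frame` calls are hypothetical stand-ins for illustration; Genie 3's actual architecture and API have not been published.

```python
# Minimal sketch of a real-time, memory-conditioned generation loop.
# `model.initial_frame` and `model.next_frame` are hypothetical calls,
# not Genie 3's actual API.
def run_interactive_session(model, prompt, get_user_action, fps=24):
    """Yield frames in real time, conditioning each one on the full history."""
    frames = [model.initial_frame(prompt)]  # first frame from the text prompt
    actions = []                            # one user action per generated frame
    while True:
        action = get_user_action()          # e.g. move forward, turn, jump
        # Each new frame is conditioned on everything generated so far,
        # which is how details seen a minute ago can be recalled when
        # a location is revisited.
        frame = model.next_frame(frames, actions + [action])
        frames.append(frame)
        actions.append(action)
        yield frame                         # must arrive within 1/fps seconds
```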

Pioneering promptable world events

Genie 3 enables a more expressive form of text-based interaction, called "promptable world events".

Promptable world events make it possible to change the generated world – such as altering weather conditions or introducing new objects and characters.

This broadens the range of scenarios agents can be exposed to, helping them learn to handle unexpected situations.
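
Building on the loop sketched above, a promptable world event might look like the following. The `event` keyword argument is an assumption for illustration, not Genie 3's actual interface.

```python
# Sketch of a "promptable world event": a text-described change that
# conditions subsequent frames. The `event` keyword is hypothetical.
def apply_world_event(model, frames, actions, event_text):
    """Inject a change such as new weather, objects or characters."""
    frame = model.next_frame(frames, actions, event=event_text)
    frames.append(frame)
    return frame

# Example: surprise an agent mid-episode to test how it copes.
# apply_world_event(model, frames, actions, "a sudden thunderstorm rolls in")
```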

Effective prompting with Genie

Prompting Genie 3 involves two core elements: the world you want to build, and the character you're bringing to life.
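For example, an illustrative prompt might pair a world description, such as "a mossy forest trail at dawn, with light fog and puddles on the ground", with a character description, such as "a first-person hiker walking at a steady pace".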

Real-world applications

The potential uses for Genie 3 go well beyond gaming.

Genie 3’s realistic, controllable environments could offer new ways for people to learn, allowing students to explore historical eras such as Ancient Rome. These simulated environments can also be used to train autonomous vehicles in realistic scenarios, in a completely safe setting.

Fueling embodied agent research

Prototyping training environments with Genie 3 and SIMA.

Genie 3 can maintain consistent worlds, making it possible for agents to tackle more complex goals, longer sequences of actions, and more realistic conditions. It can also help researchers evaluate agents’ performance and expose their weaknesses.

SIMA is an agent capable of carrying out tasks in virtual environments. We set it goals to complete within worlds generated by Genie 3; Genie 3 is not aware of the goal, but simulates the future based on the agent's actions.
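
A rough sketch of this agent-in-the-loop setup is below, with hypothetical `agent` and `model` interfaces; neither the SIMA nor the Genie 3 API is public.

```python
# Sketch of evaluating an embodied agent inside a generated world.
# `agent.act`, `agent.is_done` and the model calls are hypothetical.
def rollout(model, agent, world_prompt, goal, max_steps=500):
    """The agent pursues a goal; the world model only sees its actions."""
    frames = [model.initial_frame(world_prompt)]
    actions = []
    for _ in range(max_steps):
        action = agent.act(frames[-1], goal)  # the agent knows the goal
        # The world model does not see the goal, only the action stream.
        frame = model.next_frame(frames, actions + [action])
        frames.append(frame)
        actions.append(action)
        if agent.is_done(frames[-1], goal):   # hypothetical success check
            return frames, actions, True
    return frames, actions, False
```

Because success is judged from the generated frames, this kind of rollout can double as an evaluation harness for probing an agent's weaknesses.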

Limitations

Limited action space

Although promptable world events allow for a wide range of environmental interventions, these interventions aren't necessarily performed by the agent itself. For now, the range of actions agents can carry out directly is limited.

Interaction and simulation of other agents

Accurately modeling interactions between multiple independent agents in shared environments is an ongoing research challenge.

Accurate representation of real-world locations

Genie 3 is currently unable to simulate real-world locations with perfect accuracy.

Text rendering

Clear and legible text is often only generated when it is provided in the input world description.

Limited interaction duration

The model currently supports a few minutes of continuous interaction, rather than hours.

Responsibility

We believe foundational technologies, like Genie 3, require a deep commitment to responsibility from the very beginning. Technical innovations, particularly open-ended and real-time capabilities, introduce new challenges for safety and responsibility. To address these unique risks while aiming to maximize the benefits, we have worked closely with our Responsible Development & Innovation Team.

At Google DeepMind, we're dedicated to developing our best-in-class models in a way that amplifies human creativity, while limiting unintended impacts. As we explore potential applications for Genie 3, we continue to build our understanding of the risks and their appropriate mitigations, so that we can develop this technology responsibly.

Acknowledgements

Genie 3 was made possible due to key research and engineering contributions from Phil Ball, Jakob Bauer, Frank Belletti, Bethanie Brownfield, Ariel Ephrat, Shlomi Fruchter, Agrim Gupta, Kristian Holsheimer, Aleks Holynski, Jiri Hron, Christos Kaplanis, Marjorie Limont, Matt McGill, Yanko Oliveira, Diego Rivas, Jack Parker-Holder, Frank Perbet, Guy Scully, Jeremy Shar, Stephen Spencer, Omer Tov, Ruben Villegas, Emma Wang and Jessica Yung.

We thank Andrew Audibert, Cip Baetu, Jordi Berbel, David Bridson, Jake Bruce, Gavin Buttimore, Sarah Chakera, Bilva Chandra, Kan Chen, Donghyun Cho, Yoni Choukroun, Paul Collins, Alex Cullum, Bogdan Damoc, Vibha Dasagi, Maxime Gazeau, Charles Gbadamosi, Liangke Gui, Shan Han, Woohyun Han, Ed Hirst, Tingbo Hou, Ashyana Kachra, Lucie Kerley, Siavash Khodadadeh, Kristian Kjems, Eva Knoepfel, Vika Koriakin, José Lezama, Jessica Lo, Cong Lu, Zeb Mehring, Alexandre Moufarek, Mark Murphy, Henna Nandwani, Valeria Oliveira, Joseph Ortiz, Fabio Pardo, Jane Park, Andrew Pierson, Ben Poole, Hang Qi, Helen Ran, Nilesh Ray, Tim Salimans, Manuel Sanchez, Igor Saprykin, Amy Shen, Sailesh Sidhwani, Duncan Smith, Joe Stanton, Hamish Tomlinson, Dimple Vijaykumar, Ruben Villegas, Luyu Wang, Will Whitney, Nat Wong, Rundi Wu, Keyang Xu, Minkai Xu, Nick Young, Yuan Zhong, Vadim Zubov.

Thanks to Tim Rocktäschel, Satinder Singh, Adrian Bolton, Inbar Mosseri, Aäron van den Oord, Douglas Eck, Dumitru Erhan, Raia Hadsell, Zoubin Ghahramani, Koray Kavukcuoglu and Demis Hassabis for their insightful guidance and support throughout the research process.

Feature video was produced by Matthew Carey, Anoop Chaganty, Suz Chambers, Alex Chen, Jordan Griffith, Filip Havlena, Scotch Johnson, Randeep Katari, Hyeseung Kim, Kaloyan Kolev, Samuel Lawton, Cliff Lungaretti, Heysu Oh, Andrew Rhee, Shashwath Santosh, Arden Schager, JR Schmidt, Hana Tanimura, Khyati Trehan, Dev Valladares, Zach Velasco, Christopher Walker, Ben Wiley, Isabelle Wintaro, Jocelyn Zhao.

We thank Frederic Besse, Tim Harley and the rest of the SIMA team for access to a recent version of their agent.

Finally, we extend our gratitude to Mohammad Babaeizadeh, Gabe Barth-Maron, Parker Beak, Jenny Brennan, Tim Brooks, Max Cant, Harris Chan, Jeff Clune, Kaspar Daugaard, Dumitru Erhan, Ashley Feden, Simon Green, Nik Hemmings, Michael Huber, Jony Hudson, Dirichi Ike-Njoku, Hernan Moraldo, Bonnie Li, Simon Osindero, Georg Ostrovski, Ryan Poplin, Alex Rizkowsky, Giles Ruscoe, Ana Salazar, Guy Simmons, Jeff Stanway, Metin Toksoz-Exley, Xinchen Yan, Petko Yotov, Mingda Zhang and Martin Zlocha for their insights and support.