DeepMind and Blizzard to release StarCraft II as an AI research environment

Today at BlizzCon 2016 in Anaheim, California, we announced our collaboration with Blizzard Entertainment to open up StarCraft II to AI and Machine Learning researchers around the world.

For almost 20 years, the StarCraft game series has been widely recognised as the pinnacle of 1v1 competitive video games, and among the best PC games of all time. The original StarCraft was an early pioneer in eSports, played at the highest level by elite professional players since the late 90s, and remains incredibly competitive to this day. The StarCraft series’ longevity in competitive gaming is a testament to Blizzard’s design, and their continual effort to balance and refine their games over the years. StarCraft II continues the series’ renowned eSports tradition, and has been the focus of our work with Blizzard.

DeepMind is on a scientific mission to push the boundaries of AI, developing programs that can learn to solve any complex problem without needing to be told how. Games are the perfect environment in which to do this, allowing us to develop and test smarter, more flexible AI algorithms quickly and efficiently, and also providing instant feedback on how we’re doing through scores.

Over the past five years we’ve helped to pioneer the use of games as AI research environments to drive our machine learning and reinforcement learning research forwards, from 2D games in Atari, to full 3D environments such as Torcs, mastering the game of Go, or our forthcoming DeepMind Labyrinth. Here's a representation of what these research environments have looked like with L-R, Atari and Labyrinth.

StarCraft is an interesting testing environment for current AI research because it provides a useful bridge to the messiness of the real-world. The skills required for an agent to progress through the environment and play StarCraft well could ultimately transfer to real-world tasks.

At the start of a game of StarCraft, players choose one of three races, each with distinct unit abilities and gameplay approaches. Players’ actions are governed by the in-game economy; minerals and gas must be gathered in order to produce new buildings and units. The opposing player builds up their base at the same time, but each player can only see parts of the map within range of their own units. Thus, players must send units to scout unseen areas in order to gain information about their opponent, and then remember that information over a long period of time. This makes for an even more complex challenge as the environment becomes partially observable - an interesting contrast to perfect information games such as Chess or Go. And this is a real-time strategy game - both players are playing simultaneously, so every decision needs to be computed quickly and efficiently.

An agent that can play StarCraft will need to demonstrate effective use of memory, an ability to plan over a long time, and the capacity to adapt plans based on new information. Computers are capable of extremely fast control, but that doesn’t necessarily demonstrate intelligence, so agents must interact with the game within limits of human dexterity in terms of “Actions Per Minute”. StarCraft’s high-dimensional action space is quite different from those previously investigated in reinforcement learning research; to execute something as simple as “expand your base to some location”, one must coordinate mouse clicks, camera, and available resources. This makes actions and planning hierarchical, which is a challenging aspect of Reinforcement Learning.

We’re particularly pleased that the environment we’ve worked with Blizzard to construct will be open and available to all researchers next year. We recognise the efforts of the developers and researchers from the Brood War community in recent years, and hope that this new, modern and flexible environment - supported directly by the team at Blizzard - will be widely used to advance the state-of-the-art.

We’ve worked closely with the StarCraft II team to develop an API that supports something similar to previous bots written with a “scripted” interface, allowing programmatic control of individual units and access to the full game state (with some new options as well). Ultimately agents will play directly from pixels, so to get us there, we’ve developed a new image-based interface that outputs a simplified low resolution RGB image data for map & minimap, and the option to break out features into separate “layers”, like terrain heightfield, unit type, unit health etc. Below is an example of what the feature layer API will look like.

We are also working with Blizzard to create “curriculum” scenarios, which present increasingly complex tasks to allow researchers of any level to get an agent up and running, and benchmark different algorithms and advances. Researchers will also have full flexibility and control to create their own tasks using the existing StarCraft II editing tools.

We’re really excited to see where our collaboration with Blizzard will take us. While we’re still a long way from being able to challenge a professional human player at the game of StarCraft II, we hope that the work we have done with Blizzard will serve as a useful testing platform for the wider AI research community.