AlphaFold

AlphaFold is accelerating research in nearly every field of biology.

By solving a decades-old scientific challenge, our AI system is helping to tackle crucial problems, such as finding treatments for disease and breaking down single-use plastics. One day, it might even help unlock the mysteries of how life itself works.

Building blocks of life

Inside every cell in your body, billions of tiny molecular machines called proteins are hard at work. They allow your eyes to detect light, your neurons to fire, and the unique ‘instructions’ in your DNA to be read. Think of them as the building blocks of life.

Currently, there are over 200 million known proteins, with many more found every year. Each one has a unique 3D shape determining how it works and what it does.

But figuring out the exact structure of a single protein could take years and cost hundreds of thousands of dollars, so scientists were able to study only a tiny fraction of them. This slowed down research to tackle disease and find new medicines.

Visualization of how many amino acids make up a protein, and how many proteins are found in the human body and on Earth

The protein-folding problem

If you could unravel a protein, you would see that it’s like a string of beads, made of a sequence of different chemicals known as amino acids. These sequences are assembled according to the genetic instructions in an organism’s DNA.

Attraction and repulsion between the 20 different types of amino acids cause the string to fold in a feat of “spontaneous origami”. This forms the intricate curls, loops, and pleats of a protein’s 3D structure.
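To make the string-of-beads picture concrete, here is a minimal Python sketch that treats a protein as a string over the 20 standard one-letter amino acid codes. The example peptide and the function names are invented for illustration; they are not part of AlphaFold.

```python
from collections import Counter

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> Counter:
    """Count how often each amino acid appears in a sequence."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"Not a standard amino acid code: {sorted(unknown)}")
    return Counter(sequence)


def possible_sequences(length: int) -> int:
    """Number of distinct chains of a given length: 20 choices per bead."""
    return len(AMINO_ACIDS) ** length


peptide = "MKTAYIAKQR"  # hypothetical 10-bead example, not a real protein
print(composition(peptide).most_common(3))
print(possible_sequences(len(peptide)))
```

Even for this short made-up chain, the number of possible sequences is already in the trillions, and real proteins are typically hundreds of beads long.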

Experimental methods to determine the structure of proteins include nuclear magnetic resonance and X-ray crystallography. These rely on extensive trial and error, years of painstaking work, and multi-million-dollar specialized equipment.

So for decades, scientists tried to find a method to reliably determine a protein’s structure from its sequence of amino acids alone.

This grand scientific challenge is known as the protein-folding problem.

Visualization of a key unlocking the protein-folding problem

The AlphaFold solution

It took us four years to solve the protein-folding problem. We began work in 2016, almost immediately after AlphaGo’s victory against Lee Sedol, a top international Go player.

AlphaFold was taught by showing it the sequences and structures of around 100,000 known proteins.

CASP (the Critical Assessment of protein Structure Prediction) organizes a biennial challenge for research groups to test the accuracy of their protein structure predictions against real experimental data.

Teams are given a selection of amino acid sequences for proteins whose exact 3D shape has been mapped but not yet publicly released. Teams must submit their best predictions to see how close they are to the subsequently revealed structures.
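How "close" a prediction is can be expressed as a number. A widely used score in CASP assessments is GDT_TS, which averages the percentage of residues whose predicted position lies within 1, 2, 4, and 8 Å of the experimental one. The Python sketch below is a simplified version under two stated assumptions: the two structures are already superimposed, and each residue is represented by a single 3D point. The full metric also searches over superpositions, which is omitted here. The coordinates and the function name are made up for illustration.

```python
import math

Point = tuple[float, float, float]


def gdt_ts_presuperimposed(
    predicted: list[Point],
    experimental: list[Point],
    cutoffs: tuple[float, ...] = (1.0, 2.0, 4.0, 8.0),  # distances in angstroms
) -> float:
    """Simplified GDT_TS (0-100) for structures that are ALREADY aligned.

    Assumption: no superposition search is performed, unlike the real metric.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("Need two non-empty structures of equal length")

    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]

    # Fraction of residues within each cutoff, then the average over cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Made-up four-residue example: per-residue errors of 0.5, 1.5, 3.0, and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
print(gdt_ts_presuperimposed(predicted, experimental))  # 56.25
```

A perfect prediction scores 100, and a prediction with every residue more than 8 Å away scores 0.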

At CASP13 in 2018, AlphaFold came first. At CASP14 in 2020, we presented AlphaFold 2, which demonstrated a level of accuracy so high that the community considered the protein-folding problem solved.

The AlphaFold 2 methods paper has already received more than 20,000 citations in the scientific literature. This puts it in the top 500 most-cited papers of all time, in any field.

In 2024, together with Isomorphic Labs, we introduced AlphaFold 3, which predicts the structure and interactions of all of life’s molecules.

AlphaFold 3 goes beyond proteins to a broad spectrum of biomolecules including DNA, RNA, and even small molecules, also known as ligands, which encompass many drugs. This leap could unlock more transformative science, from developing biorenewable materials and more resilient crops, to accelerating drug design and genomics research.

Sharing the power of AlphaFold

We are committed to sharing the widespread benefits of our AlphaFold technology with the research community.

We have made our AlphaFold 2 predictions freely available to anyone in the scientific community. We’ve done this through the AlphaFold Protein Structure Database, in partnership with EMBL’s European Bioinformatics Institute – the flagship laboratory for life sciences in Europe. The Database builds upon decades of painstaking work done by scientists, using traditional methods to determine the structure of proteins.

Our first release – on 22 July, 2021 – covered over 350,000 structures, including the human proteome. That’s all of the ~20,000 known proteins expressed in the human body, along with the proteomes of 20 additional organisms important for biological research, including yeast, the fruit fly, and the mouse.

These organisms are central to modern biological research, from Nobel Prize-winning discoveries such as insulin to the development of life-saving drugs.

“This will be one of the most important datasets since the mapping of the Human Genome.”

Professor Ewan Birney
EMBL Deputy Director General and EMBL-EBI Director

This release dramatically expanded our knowledge of protein structures. It more than doubled the number of high-accuracy human protein structures available to scientists.

On 28 July, 2022, we expanded this database from nearly one million structures to over 200 million structures – including nearly all cataloged proteins known to science.

It has already been accessed by more than one million users in over 190 countries.

Most recently, we launched AlphaFold Server, a free and easy-to-use research tool powered by AlphaFold 3. AlphaFold Server is the most accurate tool in the world for predicting how proteins interact with other molecules throughout the cell. With just a few clicks on a single platform, biologists can generate molecular complexes – regardless of their access to computational resources or their expertise in machine learning.

AlphaFold Server

AlphaFold Server is a free and easy-to-use platform powered by AlphaFold 3

Open Source

AlphaFold Protein Structure Database

AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research.

Accelerating scientific discovery

So far, millions of researchers globally have used AlphaFold to accelerate progress on important real-world problems, including breaking down single-use plastics, solving biological puzzles and finding new malaria vaccines. By reducing the need for slow and expensive experiments, AlphaFold has potentially saved the research world hundreds of millions of researcher-years of progress – and trillions of dollars.

A quarter of research that makes use of AlphaFold is related to understanding and tackling diseases that cause millions of deaths globally. For example, the Drugs for Neglected Diseases initiative is advancing drug discovery for neglected diseases such as Chagas disease and leishmaniasis. These diseases impact millions of people, particularly within poor and vulnerable communities.

A team at the University of Cambridge is using AlphaFold to search for a more effective malaria vaccine, while at the University of Colorado, Boulder, another team is studying antibiotic resistance – a problem that results in nearly 3 million infections in the US alone each year.

AlphaFold’s impact comes from how it empowers scientists to accelerate discovery, both on open questions in biology and in new lines of research. We’re just beginning to tap into AlphaFold’s potential and can’t wait to see what the future holds.