
2021 · Nature 596 · Jumper, Evans, Pritzel, Green, Figurnov et al. · DeepMind · Won 2024 Nobel Prize in Chemistry

A fifty-year-old problem, solved.

In one sentence: DeepMind built an AI that takes a protein's linear amino acid sequence as input and predicts its three-dimensional folded structure with near-experimental accuracy — solving the protein-folding problem that biologists had grappled with since 1972.

01 · Why this matters to your life

Proteins do the actual work in biology. Most drugs work by acting on a protein, and a large share of diseases involve protein dysfunction. Knowing what a protein looks like in three dimensions is a foundation of modern medicine, but until 2021, determining the 3D shape of a single new protein could take years of expensive experimental work.

AlphaFold 2 made structure prediction essentially free. In 2022, DeepMind and EMBL-EBI released predicted structures for over 200 million proteins, covering nearly every catalogued protein sequence. The AlphaFold Protein Structure Database is now used by approximately 2 million researchers worldwide. Drug discovery, vaccine design, enzyme engineering, basic biology: work in each of these fields now routinely starts with AlphaFold predictions and reserves experimental verification for where it matters most.

Demis Hassabis (CEO of DeepMind) and John Jumper (the paper's first author) shared one half of the 2024 Nobel Prize in Chemistry for this work; the other half went to David Baker of the University of Washington for computational protein design. It is arguably the most consequential AI-for-science breakthrough to date.

02 · What scientists actually did

The protein-folding problem is this: a protein is a chain of amino acids. The chain folds into a specific 3D shape determined by the physics of how those amino acids interact with each other and with their surroundings. The shape determines the protein's function. Predicting the shape from the sequence by simulating that physics directly is a problem of staggering computational complexity.

DeepMind's insight: don't simulate the physics. Train a neural network to predict the structure from evolutionary patterns. Every protein in nature has been refined by evolution, and proteins with similar sequences tend to fold into similar shapes. The model, a transformer-like architecture that takes multiple sequence alignments (MSAs) as input and learns geometric constraints on structure, learned those patterns.
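To make "evolutionary patterns" concrete, here is a minimal sketch of a classical coevolution signal: the mutual information between two columns of an alignment. This is not how AlphaFold 2 works internally (the model learns from the alignment with attention layers rather than a hand-written statistic), and the toy alignment and function names below are hypothetical. It only shows the kind of signal an alignment carries: positions that mutate together across related proteins are often in contact in the folded structure.

```python
import math
from collections import Counter


def column(msa, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean the two positions tend to change together.
    """
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment (illustration only, not real protein data).
# Columns 1 and 3 covary perfectly: K always pairs with E, R always with D.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "GKIE",
    "GRID",
]

print(mutual_information(toy_msa, 1, 3))  # covarying pair: high
print(mutual_information(toy_msa, 0, 1))  # unrelated pair: near zero
```

Real coevolution methods correct for sampling bias and indirect correlations, which this sketch does not attempt.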

At the 2020 CASP14 protein-folding competition (a biennial blind test in which predictions are scored against experimentally determined structures that have not yet been made public), AlphaFold 2's predictions matched experimental data with accuracy approaching the experimental error itself. The competition's organizers declared the protein-folding problem largely solved. That had never been said before.
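For a sense of how a prediction is scored against an experimental structure, CASP's headline metric is GDT_TS: the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose predicted position lies within that cutoff of the experimental one. The sketch below is a simplification under a stated assumption: the two structures are already superimposed, whereas the real metric searches for the best superposition at each cutoff. The function name and coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    Assumes both lists hold one (x, y, z) point per residue, in the same
    order, and that the structures are already superimposed.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)


# Hypothetical coordinates in angstroms (illustration only).
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (5.3, 0.0, 0.0), (10.6, 0.0, 0.0), (21.4, 0.0, 0.0)]

print(gdt_ts(predicted, experimental))
```

A score of 100 means every residue sits within 1 Å of its experimental position; one badly misplaced residue, like the last one here, lowers the score at every cutoff.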

03 · What scientists know but rarely say

AlphaFold 2 works because evolution did the hard part. The model learned from ~170,000 experimentally determined protein structures (the Protein Data Bank) plus the alignment patterns of related sequences from millions of evolved proteins. It generalizes to new sequences because evolution has constrained the space of possible structures enough for a neural network to learn the constraints. The model does not simulate the physics; it has learned the patterns that physics produces.

The honest limitation: AlphaFold 2 predicts a single, most likely fold. Many proteins do not have a single fold: they shift between conformations as part of their function, or fold only when bound to a partner, or remain partially disordered. The model says less about these cases. The 2024 AlphaFold 3 paper extended the technique to protein-DNA, protein-RNA, and protein-small-molecule complexes, but dynamic and multi-state proteins are still only partially addressed.

The other unstated reality: this is not a closed problem. The shift from “static structure prediction” to “protein dynamics + function prediction + drug design” is the active research frontier. Companies like Isomorphic Labs (DeepMind's drug-discovery spinout) are building on AlphaFold to actually design new therapeutics. The breakthrough opened a new field; it did not finish it.

04 · What the paper does NOT claim

The paper does not claim AlphaFold designs new proteins. It predicts the structures of existing ones. Protein design, meaning the creation of new sequences for new functions, is a related but distinct problem. David Baker's lab addresses that with methods such as RFdiffusion (built on its RoseTTAFold structure-prediction network) and ProteinMPNN. The 2024 Nobel went to both Hassabis/Jumper (prediction) and Baker (design) precisely because both are necessary.

The paper also does not claim its predictions are perfect. It reports a per-residue confidence score (pLDDT, on a scale of 0 to 100). High-confidence predictions match experiment closely. Low-confidence predictions (often disordered or flexible regions) should not be trusted. Researchers using AlphaFold know to check the confidence scores; press coverage often does not.
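Checking confidence can be as simple as the sketch below. It relies on two conventions: AlphaFold's PDB-format output stores the per-residue confidence score (pLDDT, 0 to 100) in the B-factor column, and the AlphaFold database colours residues in four bands with boundaries at 90, 70 and 50. The function names and the 70-point default threshold are illustrative choices, not part of any official tool.

```python
def confidence_band(plddt):
    """Map a pLDDT score (0-100) to the band names used by the AlphaFold database."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def ca_plddt_from_pdb(lines):
    """Collect one pLDDT value per residue from PDB-format ATOM records.

    Reads the B-factor field (columns 61-66) of each alpha-carbon (CA) atom.
    """
    scores = []
    for line in lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def low_confidence_fraction(scores, threshold=70.0):
    """Fraction of residues at or below the threshold (hypothetical default)."""
    if not scores:
        raise ValueError("no scores given")
    return sum(score <= threshold for score in scores) / len(scores)


# Hypothetical per-residue scores (illustration only).
example_scores = [96.2, 91.0, 84.5, 63.1, 38.7]
for score in example_scores:
    print(score, confidence_band(score))
print(low_confidence_fraction(example_scores))
```

A region scored "low" or "very low" is often a flexible or disordered stretch, so it is safer to treat its predicted coordinates as a placeholder than as a measured shape.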

05 · Read the original

· Nature 596, 583–589 (2021) — the original Nature paper. Published August 2021.
· alphafold.ebi.ac.uk — the public database with 200M+ predicted structures. Free to use. Two million researchers and counting.
· Abramson et al. 2024 (AlphaFold 3) — the extension to protein complexes with DNA, RNA, ligands. Nature 630.
· 2024 Nobel Prize in Chemistry announcement (Oct 9, 2024).
