arXiv decoded · plain English · for normal humans

The papers that built modern AI, without the jargon.

Every breakthrough in modern AI lives in a paper most people will never read. Forty pages of math. Three columns of dense LaTeX. Citations to thirty other papers most people haven't read either. This is where we translate them.

Each page gives you the one-sentence summary, why it matters to your life, what the scientists actually know but don't always say in the abstract, and an honest read on what the paper does and does not claim.

Every entry cites the original paper, on arXiv where one exists. Original authors retain credit. Our commentary is licensed CC BY 4.0. No paper is reproduced beyond fair-use quotation.

01 · 2012

NIPS 2012 (conference paper)

ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)

Krizhevsky, Sutskever, Hinton

In one sentence: Three Toronto researchers trained a deep neural network on graphics cards and won the 2012 ImageNet image-recognition contest by a huge margin (15.3% top-5 error versus 26.2% for the runner-up), and within a few years the field had largely abandoned its prior methods.

Why it matters: The paper that started modern AI. Hinton shared the 2024 Nobel Prize in Physics for foundational work on neural networks. Sutskever co-founded OpenAI. Without AlexNet, no ChatGPT, no Claude, no Gemini.

Decode the paper →

02 · 2017

arXiv:1706.03762

Attention Is All You Need

Vaswani et al.

In one sentence: The 2017 paper that introduced the transformer architecture behind nearly every modern AI: a way for computers to read a sentence by letting every word look at every other word at once, instead of processing it strictly word by word.

Why it matters: ChatGPT, Claude, Gemini, every AI you've heard of in 2026 traces its lineage back to this one paper. Eight authors, most of them at Google.
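
For the curious, here is a minimal sketch of the "every word looks at every other word" step, usually called scaled dot-product attention. It is an illustration, not the paper's full model: the three tiny word vectors are made up, and the real transformer adds learned projections and many attention heads on top of this.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score this word's query against every word's key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend all value vectors using those weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs

# Three made-up 2-number "word" vectors (illustrative, not real embeddings).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(words, words, words)
print(mixed)
```

Each output row is a weighted blend of all three inputs, computed in one pass rather than word by word.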

Decode the paper →

03 · 2020

arXiv:2001.08361

Scaling Laws for Neural Language Models

Kaplan et al. (OpenAI)

In one sentence: The empirical discovery that making AI models bigger keeps making them better in a predictable mathematical way (a power law), with no sign of the improvement leveling off within the range the authors tested.

Why it matters: This is a big part of why the AI industry has spent an estimated ~$500B in five years buying more GPUs. Without this paper, the scale-up bet would have been speculation. With it, the bet became math.
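
The "predictable mathematical way" is a power law. Here is a rough sketch of the model-size version, L(N) = (N_c / N)^alpha_N, using the approximate fitted constants reported in the paper. Treat the numbers as illustrative: the fit assumes enough data and training time, and the function name is ours.

```python
# Approximate fitted constants reported by Kaplan et al. for model size;
# treat them as illustrative rather than exact.
ALPHA_N = 0.076   # power-law exponent
N_C = 8.8e13      # scale constant, in non-embedding parameters

def predicted_loss(n_params):
    """Test loss predicted from model size alone: (N_c / N) ** alpha_N."""
    return (N_C / n_params) ** ALPHA_N

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} parameters -> predicted loss {predicted_loss(n):.3f}")

# How much the predicted loss shrinks each time the model gets 10x bigger.
shrink_per_10x = predicted_loss(1e10) / predicted_loss(1e9)
print(f"loss multiplier per 10x parameters: {shrink_per_10x:.3f}")
```

Under this fit, every 10x increase in model size cuts the predicted loss by about 16%, and the same ratio holds at every scale. That constant ratio is what made the scale-up bet feel calculable.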

Decode the paper →

04 · 2020

arXiv:2005.14165

Language Models are Few-Shot Learners (GPT-3)

Brown et al. (OpenAI)

In one sentence: OpenAI trained a 175-billion-parameter language model and discovered that at sufficient scale it could learn new tasks from a handful of examples in the prompt, without any retraining.

Why it matters: The moment AI started 'talking.' Two and a half years later, ChatGPT launched. Five years later, the AI industry is valued at well over $500B. This paper is the inflection point.
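
"A handful of examples in the prompt" is meant literally. Below is a minimal sketch of how such a few-shot prompt can be assembled. The helper name, the `=>` layout, and the word pairs are our own illustrative choices, not the paper's exact format.

```python
def build_few_shot_prompt(task, examples, query):
    """Lay out a task description, solved examples, then the unsolved query."""
    lines = [task]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The final line is left open for the model to complete.
    lines.append(f"{query} =>")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French:",
    [("cheese", "fromage"), ("bread", "pain"), ("apple", "pomme")],
    "water",
)
print(prompt)
```

Nothing about the model changes between tasks. Only the text it is asked to continue changes, which is why no retraining is needed.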

Decode the paper →

05 · 2021

Nature 596 (2021)

Highly Accurate Protein Structure Prediction with AlphaFold

Jumper et al. (DeepMind)

In one sentence: DeepMind built an AI that takes a protein's amino acid sequence and predicts its 3D folded structure with near-experimental accuracy — solving a 50-year grand challenge in biology.

Why it matters: Hassabis and Jumper shared the 2024 Nobel Prize in Chemistry for this. 200M+ protein structures predicted and freely available. Many modern drug-discovery pipelines now start here.

Decode the paper →

06 · 2022

arXiv:2203.02155

Training Language Models to Follow Instructions with Human Feedback (InstructGPT / RLHF)

Ouyang et al. (OpenAI)

In one sentence: OpenAI showed that contracted human raters ranking AI outputs could teach the model to follow instructions, write helpfully, and avoid harmful content: the recipe that turned raw GPT-3 into InstructGPT and, soon after, into shippable ChatGPT.

Why it matters: The bridge between research-grade AI and consumer-grade AI. The technique behind ChatGPT, Claude, Gemini, Grok — every aligned model since uses some variant of this.
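
The rankings from human raters are used to train a "reward model" that scores answers. Here is a small sketch of the pairwise ranking loss commonly used for that step. The reward numbers are invented for illustration, and the full pipeline then uses the reward model to fine-tune the language model, which this sketch leaves out.

```python
import math

def pairwise_ranking_loss(reward_preferred, reward_rejected):
    """-log(sigmoid(r_preferred - r_rejected)).

    Small when the reward model scores the human-preferred answer higher,
    large when it scores the rejected answer higher.
    """
    margin = reward_preferred - reward_rejected
    return math.log1p(math.exp(-margin))

# Made-up reward scores for one pair of answers to the same prompt.
agrees_with_rater = pairwise_ranking_loss(2.0, -1.0)
disagrees_with_rater = pairwise_ranking_loss(-1.0, 2.0)
print(f"agrees: {agrees_with_rater:.3f}  disagrees: {disagrees_with_rater:.3f}")
```

Training pushes this loss down across many ranked pairs, so the reward model gradually learns to score answers the way the raters would.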

Decode the paper →

07 · 2022

arXiv:2201.11903

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Wei et al. (Google Brain)

In one sentence: Showing an AI a few worked examples that spell out the reasoning step by step makes it dramatically better at math, logic, and complex reasoning. Large models already had the ability, but you had to ask the right way.

Why it matters: Today's leading AI models reason step-by-step under the hood. This paper helped turn a clever prompting trick into the foundation of o1, o3, Claude Extended Thinking, Gemini Thinking: the entire 'reasoning models' category.
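
In the paper's setup, the step-by-step reasoning lives inside the solved examples placed in the prompt. This sketch builds the same prompt two ways so the difference is visible. The word problems and the helper name are our own inventions in the style of the paper's arithmetic examples.

```python
EXEMPLAR_QUESTION = (
    "A shelf holds 4 books. Sam adds 2 boxes of 5 books each. "
    "How many books are on the shelf?"
)
# Standard prompting: the example shows only the final answer.
STANDARD_ANSWER = "The answer is 14."
# Chain-of-thought prompting: the example spells out the reasoning first.
CHAIN_OF_THOUGHT_ANSWER = (
    "The shelf starts with 4 books. 2 boxes of 5 books each is 10 books. "
    "4 + 10 = 14. The answer is 14."
)

def build_prompt(exemplar_answer, new_question):
    """One solved example followed by a new question left open for the model."""
    return f"Q: {EXEMPLAR_QUESTION}\nA: {exemplar_answer}\n\nQ: {new_question}\nA:"

new_question = "A jar has 9 marbles. Ana removes 3 and adds 7. How many are in the jar?"
standard_prompt = build_prompt(STANDARD_ANSWER, new_question)
cot_prompt = build_prompt(CHAIN_OF_THOUGHT_ANSWER, new_question)
print(cot_prompt)
```

The only difference between the two prompts is the worked reasoning in the example answer. A large enough model tends to imitate that style and write out its own steps before answering.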

Decode the paper →

08 · 2022

arXiv:2112.10752

High-Resolution Image Synthesis with Latent Diffusion Models

Rombach et al. (LMU Munich, Runway)

In one sentence: Researchers in Germany figured out how to run image generation in a compressed space with roughly 64x fewer positions than the original pixel grid, making high-quality AI image generation tractable on consumer hardware.

Why it matters: Stable Diffusion is built directly on this architecture, and Flux, DALL-E 3, and many other image generators released since 2022 work in a compressed latent space in the same way. The atomeons.com hero photography included.
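
Where does a figure like 64x come from? A quick back-of-the-envelope sketch, assuming the configuration popularized by Stable Diffusion: a 512x512 color image squeezed by a factor of 8 along each side into a 4-channel latent. These sizes are assumptions for illustration, since the paper itself tests several compression factors.

```python
# Assumed sizes (Stable Diffusion style); the paper explores several factors.
IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNELS = 512, 512, 3
DOWNSAMPLE_FACTOR = 8
LATENT_CHANNELS = 4

latent_height = IMAGE_HEIGHT // DOWNSAMPLE_FACTOR
latent_width = IMAGE_WIDTH // DOWNSAMPLE_FACTOR

pixel_positions = IMAGE_HEIGHT * IMAGE_WIDTH
latent_positions = latent_height * latent_width

pixel_values = pixel_positions * IMAGE_CHANNELS
latent_values = latent_positions * LATENT_CHANNELS

# Ratio of grid positions, and ratio of total numbers stored.
position_ratio = pixel_positions // latent_positions
value_ratio = pixel_values // latent_values
print(f"latent grid: {latent_height}x{latent_width}x{LATENT_CHANNELS}")
print(f"{position_ratio}x fewer positions, {value_ratio}x fewer numbers")
```

Under these assumptions the generator works on a 64x64 grid instead of a 512x512 one, which is a large part of why it fits on a consumer graphics card.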

Decode the paper →

09 · 2022

arXiv:2212.08073

Constitutional AI: Harmlessness from AI Feedback

Bai et al. (Anthropic)

In one sentence: An AI can be trained to behave well by having AI models critique, revise, and rank answers against a written constitution, instead of needing thousands of humans to label every harmful response.

Why it matters: This is the technique behind Claude's safety training. AI feedback of this kind is now a common way for labs to scale safety work without scaling human labelers to match. Without it, alignment work would be far more human-bottlenecked.

Decode the paper →

10 · 2023

arXiv:2309.08600

Sparse Autoencoders Find Highly Interpretable Features in Language Models

Cunningham et al. (EleutherAI and collaborators)

In one sentence: A mathematical technique for taking a fully-trained AI and revealing the specific concepts it has learned — like an X-ray for the inside of a model.

Why it matters: For years AIs were called 'black boxes' because nobody could see what they thought. This paper helped start the work that lets us peek inside. Anthropic's interpretability team has since identified millions of distinct concepts inside Claude.
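
Here is a toy sketch of what a sparse autoencoder computes: it rewrites a model's internal activation as a few active "features" from a larger dictionary, then checks how well those features rebuild the original. The dictionary below is hand-written and tiny, and tied encoder/decoder weights are assumed for simplicity. In practice the dictionary is learned from millions of activations.

```python
def encode(x, dictionary, bias):
    """Feature activations: ReLU(dictionary @ x + bias)."""
    features = []
    for row, b in zip(dictionary, bias):
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + b
        features.append(max(0.0, pre_activation))
    return features

def decode(features, dictionary):
    """Rebuild the activation as a weighted sum of dictionary rows (tied weights)."""
    dim = len(dictionary[0])
    return [sum(f * row[j] for f, row in zip(features, dictionary)) for j in range(dim)]

def sae_loss(x, dictionary, bias, l1_weight):
    """Reconstruction error plus an L1 penalty that rewards using few features."""
    features = encode(x, dictionary, bias)
    x_hat = decode(features, dictionary)
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    sparsity = sum(abs(f) for f in features)
    return reconstruction + l1_weight * sparsity

# Hand-written toy dictionary: 3 features for a 2-number activation.
DICTIONARY = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
BIAS = [0.0, 0.0, 0.0]

activation = [-2.0, 0.0]
features = encode(activation, DICTIONARY, BIAS)
print("active features:", [i for i, f in enumerate(features) if f > 0])
print("loss:", sae_loss(activation, DICTIONARY, BIAS, l1_weight=0.1))
```

Only one of the three features switches on for this input. That sparsity is what makes each feature a candidate for a single, human-readable concept.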

Decode the paper →

How we read them

Six questions per paper.

01
What is this paper's one sentence? The actual claim, in plain English.
02
Why does this matter to a normal person? The concrete consequence beyond the lab.
03
What did the scientists actually do? The method, translated. No equations.
04
What do they know but don't say? The implicit knowledge between the lines.
05
What does this paper NOT claim? Honest read on the hype-vs-reality boundary.
06
Where do I read the original? Direct link to the paper (arXiv where available), plus what to read next.