
The field, mapped honestly.
Twenty atlases. Each one a sit-down, not a tweet. Read them in any order; they don't depend on each other. Anti-hype, plain-language, anchored to specific papers and named labs.
History of AI
Symbolic AI → connectionism → transformers → frontier era. The chronology that explains why 2026 looks the way it does.
Transformer variants
Original transformer → decoder-only GPT → encoder-only BERT → encoder-decoder T5 → MoE → Mamba/SSM. The architectural family tree.
RLHF family
Supervised fine-tuning → PPO RLHF → DPO → KTO → ORPO. How models get aligned with human feedback.
Mechanistic interpretability
Circuits. Features. Activation steering. SAEs. Inside the black box, and the open research frontier.
Multimodal models
Vision-language. Audio. Video. Cross-modal grounding. How models handle more than text.
Embeddings
Vector representations. Why semantic search works. The substrate underneath every RAG application.
Hallucinations
Why models confidently lie. The taxonomy of failure modes. What mitigations actually work.
AI safety
The technical field. Alignment, oversight, red-teaming, dangerous-capability evals. Distinct from policy 'AI safety.'
Mixture of experts
Sparse models. Why MoE matters for inference cost. Mixtral, DeepSeek, and the rumored architecture of the GPT-4 family.
Context windows
How long context works. Attention scaling. The 'lost in the middle' problem. Long-context evals.
How training actually works
Pretraining corpora. Compute. Hyperparameters. The mechanics of building a frontier model.
Post-training
Instruction tuning. RLHF. RLAIF. Tool-use post-training. The work that turns a base model into a useful product.
Agentic AI
What 'agents' actually are. ReAct, Toolformer, SWE-bench. Claude Code, Cursor, Devin, Operator, Computer Use. The workflow-vs-agent distinction that finally clarifies the space. Anti-hype.
Scaling laws
Kaplan 2020 → Chinchilla 2022 → inference-aware overtraining → o1 test-time scaling. How frontier-model labs trade off parameters (N), training tokens (D), and compute (FLOPs), and why GPT-5 / Claude 5 / Gemini 3 aren't 10× bigger by parameter count.
Benchmarks
MMLU · MMLU-Pro · GPQA Diamond · HumanEval · SWE-bench Verified · MMMU · AIME · LMSYS Arena · HELM · ARC-AGI. What each measures, what it doesn't, how to read a 2026 leaderboard without getting suckered.
RAG
Retrieval-augmented generation in 2026. Naive RAG → hybrid search → contextual retrieval → reranking → query rewriting → GraphRAG → agentic RAG → long-context. Eight architectures + eight vector DBs + six failure modes.
Quantization
How big models run on small hardware. FP32 → BF16 → FP8 → INT8 → INT4 → BitNet 1.58-bit. Methods and formats: PTQ, QAT, GPTQ, AWQ, GGUF, AQLM, EXL2. What you lose, what you save, what to run where.
Inference
What actually happens when you call a model. Tokenization, prefill, decode, KV cache. FlashAttention, paged attention, continuous batching, speculative decoding, prompt caching, GQA/MLA. Six facts about cost.
Reasoning models
The o1/R1 paradigm. OpenAI o1 + o3, DeepSeek-R1, Gemini Thinking, Claude Extended Thinking. How inference-time-compute scaling works, what 'reasoning' actually means here, when to reach for these models.
Diffusion models
How image, video, and audio actually get generated. DDPM → latent diffusion → classifier-free guidance → flow matching. Stable Diffusion, Flux, DALL-E 3, Imagen 4, Nano Banana Pro (the engine that powers atomeons.com's hero imagery), Sora, Veo, MusicGen, Suno, Udio.