# Scaling Monosemanticity — Anthropic 2024 · Decoded · AtomEons

**Route:** `atomeons.com/research/decoded/scaling-monosemanticity`
**Category:** Decoded paper
**License:** CC-BY 4.0 unless otherwise noted on the page.

## Description

Anthropic showed that sparse autoencoders trained on Claude 3 Sonnet's internal activations recover millions of interpretable features, and that clamping those features causally changes the model's behavior. The Golden Gate Claude demo was the public demonstration.

## Headings

- Scaling Monosemanticity — interpretability that works on a real model
- What the paper actually shows
- Why it matters
- What the scientists did
- What this paper does NOT claim
- What the field knows but rarely says

## Body

§ Decoded papers · Interpretability · production-scale

---

*Markdown export from atomeons.com. Full rendered page: https://atomeons.com/research/decoded/scaling-monosemanticity*
*All lab content is CC-BY 4.0. Cite as: AtomEons Systems Laboratory, Scaling Monosemanticity — Anthropic 2024 · Decoded · AtomEons, atomeons.com/research/decoded/scaling-monosemanticity.*