AI buzzword decoder: what each term actually means in 2026

A senior-person field guide. Marketing meaning, technical reality, and when the term is being used dishonestly.

Most AI vocabulary in 2026 is contested. Vendors use the same word to mean three different things in three different paragraphs of the same landing page. Researchers use it to mean a fourth thing. Press picks the most exciting one. By the time the term reaches a procurement committee or a board deck, it has lost almost all definitional weight, and the only honest answer to "what does this mean?" is "depends who is selling it."

This page is a working glossary. For each term we give the plain-language marketing meaning (what you will hear in a pitch), the technical reality (what the paper or system actually does), and the dishonest pattern (when the word is being used to inflate, dodge, or distract). Where a real benchmark, paper, or official spec exists, we cite it. Where the field genuinely disagrees, we say so. Where we don't know something with certainty as of mid-2026, we mark it best-effort.

The voice we are aiming for is the voice of a tired senior engineer at the back of the room. Nothing here is anti-AI. It is anti-bullshit. The technology is real, the progress is real, the wins are real, and so are the rugs, the redefinitions, the goalpost shifts, and the press-release physics that make everything sound bigger than it is.

If you only read one section, read the one on AGI, ASI, and superintelligence. Those three words do more work to confuse otherwise rational discussions than the rest of the glossary combined. After that, the table is the workhorse: scan for the term you need, get the technical reality in one line, and move on. Everything else here is for the moments when somebody on a call says a word with a lot of confidence and you need to know whether to push back.

The three most-abused words: AGI, ASI, superintelligence

These three terms have no fixed technical definition and are used to mean very different things by different speakers. They are the most common source of pitch-deck inflation in the field.

AGI (artificial general intelligence)

No agreed definition

Marketing meaning: a system that can do any cognitive task a human can. Technical reality: there is no operational test for AGI that the field agrees on. OpenAI's stated mission references AGI, and their 2023 'Planning for AGI' essay defines it loosely as 'AI systems that are generally smarter than humans.' DeepMind's Morris et al. (2023) proposed a level-based framework (emerging, competent, expert, virtuoso, superhuman) precisely because there was no consensus. Dishonest use: claiming a system is 'approaching AGI' or 'AGI-level' to imply human-equivalent breadth when the system has only been benchmarked on narrow tasks.

ASI (artificial superintelligence)

Speculative term

Marketing meaning: AI smarter than humans across all domains, often used interchangeably with superintelligence. Technical reality: a Bostrom-era philosophy term (Superintelligence, 2014) for a hypothetical system surpassing the best human minds in every cognitive task. No system in 2026 meets this definition by any rigorous measure. Dishonest use: timelines for ASI within a stated number of years, presented as forecasts when they are actually opinions of company leadership without a defined yardstick.

Superintelligence

Often a synonym for ASI

Marketing meaning: same as ASI in most current usage. Technical reality: Bostrom distinguished speed superintelligence (think faster), collective superintelligence (more minds in parallel), and quality superintelligence (better cognition per unit). Most 2026 usage collapses all three into one word. Dishonest use: when a company launches a 'superintelligence team' and the deliverable is a chatbot benchmark improvement. The word is doing PR work, not technical work.

The glossary, in one scrollable table

Quick scan for the most common terms. Marketing meaning is what you will hear in a pitch. Technical reality is what the system is actually doing. Dishonest tell is the specific pattern where the term is being misused.

Term: Alignment
Marketing meaning: Making AI 'safe' and 'good'
Technical reality: Specific techniques: RLHF, constitutional AI, RLAIF, debate, scalable oversight. Each has known limits. See Bai et al. 2022, Christiano et al. 2017.
Dishonest tell: Used as a vibe word with no mention of which technique, which failure modes, or which benchmark.

Term: Agent
Marketing meaning: An AI that 'does things for you'
Technical reality: A loop that takes a goal, picks tools or actions, observes outcomes, and iterates. ReAct (Yao et al. 2022) is the canonical pattern.
Dishonest tell: Single prompt-to-completion calls relabeled as 'agents' to ride the category.

Term: Agentic
Marketing meaning: Adjective form of agent
Technical reality: Describes systems with multi-step planning and tool use. Anthropic's Claude SDK and OpenAI's Assistants API document specific agent loops.
Dishonest tell: Anything that 'has agency' even when it's a static workflow with no decision points.

Term: Autonomous
Marketing meaning: Runs without a human
Technical reality: Operates in a defined loop with bounded actions and rollback. Real autonomy in production usually has guardrails, kill switches, and human escalation.
Dishonest tell: 'Fully autonomous' when there is a human in the loop on every meaningful action.

Term: Foundation model
Marketing meaning: A 'big base model'
Technical reality: Term from Stanford CRFM 2021 (Bommasani et al.) for large models trained on broad data, adaptable to many tasks via fine-tuning or prompting.
Dishonest tell: Used to mean any LLM, blurring the distinction between a 7B local model and a frontier-scale system.

Term: Frontier model
Marketing meaning: The newest, biggest models
Technical reality: UK AI Safety Institute and Anthropic/OpenAI/Google policy docs use this for the most capable general-purpose models at the research frontier. No fixed parameter count.
Dishonest tell: Marketing slides calling a small fine-tuned model 'frontier' because it scores well on one benchmark.

Term: Multi-modal
Marketing meaning: Handles text, images, audio, video
Technical reality: A model with input or output across multiple data modalities. GPT-4V, Gemini, Claude 3 family all multi-modal in different ways.
Dishonest tell: Counting text + a single image upload as 'fully multi-modal' when the system can't reason across video or audio.

Term: Multi-agent
Marketing meaning: Many AIs working together
Technical reality: Architectures where multiple model instances or specialized agents coordinate. Microsoft AutoGen, LangGraph, CrewAI are public frameworks.
Dishonest tell: Calling a single model with multiple prompts 'a multi-agent system' to inflate complexity.

Term: Emergent
Marketing meaning: Capabilities that appear at scale
Technical reality: Original Wei et al. 2022 'Emergent Abilities of LLMs' paper. Later contested by Schaeffer et al. 2023 NeurIPS best paper showing many emergent claims were metric artifacts.
Dishonest tell: Citing emergence as a magic property without acknowledging the Schaeffer rebuttal.

Term: Scaling laws
Marketing meaning: Bigger = better, predictably
Technical reality: Specific power-law relationships from Kaplan et al. 2020 and Hoffmann et al. 2022 (Chinchilla). Loss decreases predictably with compute, data, parameters.
Dishonest tell: Claiming scaling 'guarantees' AGI. Scaling laws are about loss curves, not capability ceilings.

Term: RAG
Marketing meaning: Lets AI use your data
Technical reality: Retrieval-augmented generation. Lewis et al. 2020. Fetch relevant docs, stuff into context, generate. Real systems add reranking, chunking, hybrid search.
Dishonest tell: 'AI trained on your data' when the system is just doing vanilla RAG at query time, no training involved.

Term: Fine-tune
Marketing meaning: Custom model for your use case
Technical reality: Continued training on a smaller, task-specific dataset. LoRA (Hu et al. 2021) is the dominant parameter-efficient approach.
Dishonest tell: Calling a system prompt 'a fine-tune' to imply deeper customization than exists.

Term: Vector database
Marketing meaning: Database for AI
Technical reality: Storage and ANN search over embeddings. Pinecone, Weaviate, pgvector, Qdrant, Milvus, Chroma. HNSW and IVF are common index types.
Dishonest tell: Positioning a vector DB as 'the AI layer' when it is one component of a retrieval pipeline.

Term: Semantic search
Marketing meaning: Search by meaning, not keywords
Technical reality: Search over dense vector embeddings using cosine or dot-product similarity. Usually paired with BM25 in hybrid setups for real-world quality.
Dishonest tell: Pure-vector demos that fall apart on exact-match queries (SKUs, names, IDs) which BM25 would handle.

Term: Embedding
Marketing meaning: AI's representation of text
Technical reality: A fixed-length dense vector from a learned encoder (e.g., OpenAI text-embedding-3, Cohere, BGE, E5). Captures semantic similarity in vector space.
Dishonest tell: Treating embeddings as interpretable features. They aren't, beyond similarity.

Term: Transformer
Marketing meaning: The neural net behind modern AI
Technical reality: Architecture from Vaswani et al. 2017 'Attention Is All You Need.' Self-attention + feedforward layers. Decoder-only is now dominant for LLMs.
Dishonest tell: Saying 'transformer-based' as a quality signal. Almost everything is. It's table stakes.

Term: GPT
Marketing meaning: OpenAI's models
Technical reality: Generative Pretrained Transformer. A specific OpenAI lineage (GPT-2, 3, 3.5, 4, etc.) but also used generically for decoder-only LMs.
Dishonest tell: Calling a Llama or Mistral fine-tune 'a GPT.' Wrong family, wrong vendor.

Term: Hallucination
Marketing meaning: AI making things up
Technical reality: Confident but incorrect or unsupported output. Survey: Ji et al. 2023 'Survey of Hallucination in NLG.' Cannot be eliminated, only reduced.
Dishonest tell: Vendors claiming 'hallucination-free' systems. No such thing in 2026 for open-domain generation.

Term: Ground truth
Marketing meaning: The correct answer
Technical reality: Reference labels in a dataset used for evaluation. Quality depends entirely on the labeling process.
Dishonest tell: Treating LLM-generated labels as ground truth without human spot-checks.

Term: Reasoning
Marketing meaning: The AI thinks
Technical reality: Multi-step output that decomposes a problem. OpenAI o1/o3 and similar models trained with RL to produce long chains. No agreed mechanistic definition of 'reasoning.'
Dishonest tell: Any chain-of-thought output relabeled as 'reasoning' regardless of whether the steps are valid.

Term: Chain-of-thought
Marketing meaning: Step-by-step thinking
Technical reality: Prompting technique from Wei et al. 2022. Asking the model to show intermediate steps before final answer. Improves performance on math and logic.
Dishonest tell: Claiming CoT 'proves' the model reasons. The steps may be post-hoc rationalizations (Turpin et al. 2023).

Term: Multi-step
Marketing meaning: More than one action
Technical reality: A workflow with sequential stages, often with tool calls between them. Operationally distinct from single-shot completion.
Dishonest tell: Used interchangeably with 'agentic' and 'reasoning' to suggest sophistication.

Term: Deep research
Marketing meaning: AI does the research for you
Technical reality: Specific product category: long-horizon search + read + synthesize loops. OpenAI Deep Research, Google Gemini Deep Research, Perplexity Pro. Real benchmarks scarce.
Dishonest tell: Vendors using the phrase without specifying source coverage, citation accuracy, or how hallucinations are handled.

Term: Tool use
Marketing meaning: AI calls APIs
Technical reality: Model emits structured calls to external functions; runtime executes and returns results. Toolformer (Schick et al. 2023) was an early formulation.
Dishonest tell: Demo videos that hide failure rates. Real tool-use systems have non-trivial error and retry needs.

Term: Function calling
Marketing meaning: Specific form of tool use
Technical reality: Structured output (typically JSON) matching a defined schema, executed by the runtime. OpenAI introduced as 'function calling' in 2023; now industry standard.
Dishonest tell: Treating function calling as a security boundary. It is not: the runtime decides what to execute.

Term: MCP
Marketing meaning: Model Context Protocol
Technical reality: Anthropic's open spec (modelcontextprotocol.io, 2024) for connecting AI assistants to data sources and tools via standardized servers.
Dishonest tell: Calling any AI integration 'MCP-based' when it's a custom API. MCP is a specific protocol.

Term: Ambient AI
Marketing meaning: AI that's just there
Technical reality: Term used for always-on, context-aware assistants. No standard spec; usage varies by vendor.
Dishonest tell: Marketing word for 'we put the assistant in more places.' Usually means more telemetry, not more capability.

Term: Ambient agents
Marketing meaning: Agents that run in the background
Technical reality: LangChain and others have published patterns: agents triggered by events (email, calendar) rather than user prompts. Documented in LangChain blog and OSS examples.
Dishonest tell: Promising 'background work' that turns out to be scheduled cron jobs with an LLM call.
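
To make the Agent entry in the table concrete, here is a minimal sketch of the loop it describes: take a goal, pick a tool, observe the result, iterate until done. Everything named here is hypothetical. The two arithmetic tools and the `scripted_policy` function stand in for real tools and a real model call, and the step budget is an assumed guardrail. It is not the ReAct implementation, only the control-flow shape that separates an agent from a single prompt-to-completion call.

```python
from typing import Callable

# Hypothetical tools for illustration; a real system would call external APIs.
def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS: dict[str, Callable[[float, float], float]] = {"add": add, "multiply": multiply}

def scripted_policy(goal: str, observations: list[float]) -> dict:
    """Stand-in for the model: chooses the next action from what it has observed so far."""
    if not observations:
        return {"action": "add", "args": (2.0, 3.0)}
    if len(observations) == 1:
        return {"action": "multiply", "args": (observations[-1], 4.0)}
    return {"action": "finish", "answer": observations[-1]}

def run_agent(goal: str, policy: Callable[[str, list[float]], dict], max_steps: int = 5):
    """Decide, act, observe, repeat. Stops on 'finish' or when the step budget runs out."""
    observations: list[float] = []
    for _ in range(max_steps):
        decision = policy(goal, observations)
        if decision["action"] == "finish":
            return decision["answer"], len(observations)
        tool = TOOLS[decision["action"]]                    # pick a tool
        observations.append(tool(*decision["args"]))       # act and record the outcome
    raise RuntimeError("step budget exhausted without finishing")

answer, tool_calls = run_agent("compute (2 + 3) * 4", scripted_policy)
print(answer, tool_calls)
```

The useful question for a vendor is where the equivalent of `policy` gets to branch on an observation. If the sequence of calls is fixed in advance, it is a workflow, whatever the slide says.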

Why the AGI debate is mostly definitional

When two people argue about whether GPT-4 or Claude or Gemini is 'AGI,' they are almost always using different definitions. The Turing test, the coffee test (Wozniak), the employment test (Nilsson), the robot college student test (Goertzel), and the OpenAI charter language all measure different things. Morris et al. (2023) at DeepMind tried to fix this with a matrix: performance levels (emerging through superhuman) crossed with generality (narrow vs. general). It is one of the cleaner frameworks but is not universally adopted.

The practical advice for non-researchers is to refuse the term and ask three replacement questions. First: on which specific benchmarks does the system match or exceed human baselines, and what are the failure modes outside the benchmark distribution? Second: what is the system's reliability on novel tasks it was not trained on, and how was that measured? Third: what is the cost of a wrong answer in the deployment context, and how is that bounded? These three questions sidestep the entire definitional fight and get to what actually matters for any real decision.

A related move is to ask about generalization vs. memorization. A system that scores 95 percent on a benchmark whose data is in the pretraining corpus is doing something very different from a system that scores 70 percent on a held-out benchmark released after the training cutoff. The benchmark contamination literature (Sainz et al. 2023, Magar and Schwartz 2022) is the technical backstop for this question. If a vendor cannot or will not show held-out evaluation, that is informative on its own.
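
One way to picture the memorization question is a crude overlap check between a benchmark item and a training corpus. The sketch below is a heuristic of our own for illustration, not the method used in the contamination papers cited above. The function names, the 5-gram window, and the toy corpus are all assumptions.

```python
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Lowercased whitespace tokens, grouped into overlapping n-grams."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_fraction(benchmark_item: str, corpus_docs: list[str], n: int = 5) -> float:
    """Share of the item's n-grams that also appear somewhere in the corpus."""
    item_grams = ngrams(benchmark_item, n)
    if not item_grams:
        return 0.0  # item shorter than n tokens: nothing to compare
    corpus_grams: set[tuple[str, ...]] = set()
    for doc in corpus_docs:
        corpus_grams |= ngrams(doc, n)
    return len(item_grams & corpus_grams) / len(item_grams)

# Toy data, invented for this example.
corpus = ["the quick brown fox jumps over the lazy dog near the river"]
seen_item = "the quick brown fox jumps over the lazy dog"
fresh_item = "a slow green turtle walks under the busy bridge"

seen_score = overlap_fraction(seen_item, corpus)
fresh_score = overlap_fraction(fresh_item, corpus)
print(seen_score, fresh_score)
```

A high overlap does not prove the model memorized the item, and a low one does not rule out paraphrased leakage. It is the kind of evidence a vendor should be able to produce on request, nothing more.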

The alignment vocabulary, decoded

Alignment is the most-used and least-specified word in the safety conversation. Here is the actual technique stack underneath the word, with the original references where they exist.

  • RLHF (Reinforcement Learning from Human Feedback) — Christiano et al. 2017, Ouyang et al. 2022 (InstructGPT). Train a reward model on human preference pairs, then PPO the base model against that reward. The dominant alignment technique through 2023.
  • RLAIF (Reinforcement Learning from AI Feedback) — Lee et al. 2023 (Google). Use an off-the-shelf LLM to generate preference labels instead of humans. Cheaper, debated quality.
  • Constitutional AI — Bai et al. 2022 (Anthropic). Replace some or all human feedback with a written set of principles; the model critiques and revises its own outputs against the constitution.
  • DPO (Direct Preference Optimization) — Rafailov et al. 2023. Optimizes the same KL-constrained preference objective as RLHF with a simple classification-style loss, skipping the explicit reward model and the RL step. Now widely adopted.
  • Scalable oversight — research direction (debate, recursive reward modeling, weak-to-strong generalization). The bet is that humans can supervise models smarter than themselves with the right protocol. See Burns et al. 2023 (OpenAI), Irving et al. 2018.
  • Interpretability — separate research direction. Sparse autoencoders (Anthropic, 2024), mechanistic interpretability (Olah et al.), circuits research. Goal is to understand what a model is doing internally, not just shape its outputs.
  • Red teaming — adversarial testing. Now standard practice at major labs and written into policy (US Executive Order 14110 of 2023, since rescinded in January 2025; EU AI Act high-risk obligations).
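
To show what "skips the explicit reward model" means in the DPO item above, here is a minimal sketch of the per-example DPO loss as given in Rafailov et al. 2023. The function and argument names are ours, and the log-probabilities in the example are invented numbers. A real implementation would compute them by summing token log-probabilities from the policy and from a frozen reference model.

```python
import math

def dpo_loss(
    logp_chosen: float,        # log pi_theta(y_w | x) under the policy being trained
    logp_rejected: float,      # log pi_theta(y_l | x)
    ref_logp_chosen: float,    # log pi_ref(y_w | x) under the frozen reference model
    ref_logp_rejected: float,  # log pi_ref(y_l | x)
    beta: float = 0.1,         # strength of the pull toward the reference model
) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(m)) equals log(1 + exp(-m)); fine for the small margins used here.
    return math.log1p(math.exp(-margin))

# Policy identical to the reference: margin is 0, so the loss is log(2).
baseline = dpo_loss(-5.0, -7.0, -5.0, -7.0)

# Policy has shifted probability toward the preferred answer: the loss drops.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
print(baseline, improved)
```

The preference pair enters the loss directly through the two log-ratios, which is why no separately trained reward model appears anywhere in the function.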

Hallucination: what it is, what it isn't, why 'eliminated' is a lie

Hallucination is not a bug to be patched. It is a property of how decoder-only language models work: they generate each next token by sampling from a learned probability distribution, with no built-in ground-truth check. Even a model that scores 100 percent on a benchmark can confabulate when asked something outside the benchmark distribution.

The useful framing from Ji et al. 2023 distinguishes intrinsic hallucinations (output that contradicts the input or context) from extrinsic hallucinations (output that cannot be verified from the input, whether or not it happens to be true). RAG addresses some intrinsic cases. Tool use and citation requirements address some extrinsic cases. Neither eliminates them. The literature is consistent that the floor is non-zero for open-domain generation.

When a vendor pitches 'hallucination-free' AI, the honest interpretation is: bounded domain, narrow query distribution, post-hoc verification, or all three. Ask which. If the answer is fuzzy, the claim is marketing. Also worth knowing: studies published between 2023 and 2025 report that RAG can introduce new hallucinations when the retrieved passages are misleading, so 'we use RAG' is not by itself a safety story.
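
The mechanism described above, a next token drawn from a learned distribution with no ground-truth check, can be sketched in a few lines. The logits below are invented for a hypothetical prompt ("The capital of Australia is"). A real model produces scores over a vocabulary of tens of thousands of tokens, and the sampling strategy varies by deployment.

```python
import math
import random

def softmax(logits: dict[str, float]) -> dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

# Hypothetical scores: the right answer is the most likely token, but not the only one.
logits = {"Canberra": 2.0, "Sydney": 1.5, "Melbourne": 0.5}
probs = softmax(logits)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = rng.choices(list(probs), weights=list(probs.values()), k=1000)

# Nothing in the sampling step consults a fact base; wrong tokens are drawn
# in proportion to the probability mass the model assigns them.
wrong_rate = sum(tok != "Canberra" for tok in samples) / len(samples)
print(probs, wrong_rate)
```

Retrieval, tool calls, and verification all work by reshaping the distribution or checking the output afterward. None of them changes the fact that the generation step itself is a draw.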

Tool use, function calling, MCP: a short timeline of how they differ

These three terms get conflated regularly. They are sequential layers of standardization, not synonyms.

  1. 2023 — Toolformer

    Tool use as a research concept

    Schick et al. (Meta AI) showed a language model could be trained to call APIs (calculator, search, translation) and use the results. Foundational paper for the modern category.

  2. 2023 — Function calling

    OpenAI standardizes the schema

    OpenAI launched 'function calling' in GPT-3.5/4: the developer defines a JSON schema, the model emits structured calls matching it, the developer's runtime executes. Became the industry pattern; Anthropic, Google, Mistral followed with their own implementations.

  3. 2024 — MCP launched

    Anthropic ships an open protocol

    Model Context Protocol (modelcontextprotocol.io) is an open spec for how AI assistants connect to data sources and tools via standardized servers. Reduces the M×N integration problem (M assistants, N data sources) to M+N.

  4. 2025-2026 — Ambient agents emerge

    Event-driven agent loops

    Tool use plus background triggers (email, webhook, calendar event, schedule). LangChain, LangGraph, and others published patterns; multiple vendors now ship 'ambient' or 'background' agent products. The category is real but the marketing is ahead of the reliability data.
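
The function-calling step in the timeline above has a practical consequence: the model only emits text, and the developer's runtime decides what runs. Here is a minimal sketch of that runtime side. The JSON shape (`name` plus `arguments`), the `REGISTRY` allowlist, and the `get_weather` stub are all hypothetical and do not reproduce any vendor's wire format.

```python
import json
from typing import Any, Callable

def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather API."""
    return f"weather report for {city} (stub)"

# Allowlist: function name -> (callable, expected argument names and types).
REGISTRY: dict[str, tuple[Callable[..., Any], dict[str, type]]] = {
    "get_weather": (get_weather, {"city": str}),
}

def dispatch(model_output: str) -> dict[str, Any]:
    """Validate a model-emitted call before executing anything."""
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return {"ok": False, "error": "output is not valid JSON"}
    if not isinstance(call, dict):
        return {"ok": False, "error": "output is not a JSON object"}
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or name not in REGISTRY:
        return {"ok": False, "error": "function is not on the allowlist"}
    func, schema = REGISTRY[name]
    if not isinstance(args, dict) or set(args) != set(schema):
        return {"ok": False, "error": "arguments do not match the schema"}
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            return {"ok": False, "error": f"wrong type for argument {key!r}"}
    return {"ok": True, "result": func(**args)}

good = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
blocked = dispatch('{"name": "delete_all_records", "arguments": {}}')
malformed = dispatch("sure, I will check the weather for you")
print(good, blocked, malformed)
```

Every rejection branch here is a failure a demo video will not show. Asking a vendor how often each one fires in production is a reasonable proxy for the reliability data the marketing tends to skip.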

Three rules of thumb when reading any AI claim

Internal rubric for parsing marketing material from press releases, vendor blogs, and pitch decks. Sentence-case, no jargon, works for anyone non-technical.

Rule 1: Name the benchmark or shut up

If a claim does not name a specific public benchmark with a specific score, treat it as marketing until proven otherwise. Real benchmarks: MMLU, HumanEval, GPQA, SWE-bench, ARC-AGI, BIG-Bench. Vague 'state-of-the-art' or 'best-in-class' phrases are not benchmark results. Held-out evaluation matters more than headline numbers (see benchmark contamination literature).

Rule 2: Ask 'compared to what'

Any percentage improvement is meaningless without a named baseline and confidence interval. '90 percent accurate' tells you nothing if you don't know what the previous accuracy was, what humans score, what random would score, and how the test set was constructed. The Schaeffer et al. 2023 critique of emergent abilities is a clean example of how baseline choice changes the story entirely.
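
To see why a headline accuracy needs a baseline and a confidence interval, here is a minimal sketch using the Wilson score interval for a proportion. The test-set size and both scores are invented for illustration; 1.96 is the usual z-value for a 95 percent interval.

```python
import math

def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for an observed accuracy of correct/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width

# Hypothetical pitch: "the new model is 90 percent accurate, up from 85 percent",
# with both systems scored on the same 100-question test set.
new_low, new_high = wilson_interval(90, 100)
base_low, base_high = wilson_interval(85, 100)

# If the intervals overlap, the test set is too small to separate the two systems.
intervals_overlap = new_low <= base_high and base_low <= new_high
print((new_low, new_high), (base_low, base_high), intervals_overlap)
```

On 100 questions the two intervals overlap, so the claimed improvement is not distinguishable from noise at this sample size. The same five-point gap on a much larger test set would be a different story, which is why the size of the test set belongs in the claim.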

Rule 3: Demand failure modes

Any honest AI vendor will name the failure modes of their system, the rate at which they occur, and how the runtime handles them. If the answer is 'we don't fail' or 'edge cases are rare,' that is the dishonest tell. Anthropic, OpenAI, and Google all publish model cards documenting known failures. The presence or absence of a model card is itself a signal.

Terms we left out, and why

This page focused on terms that show up in pitches, board decks, and procurement. We deliberately skipped several adjacent categories. Inference-time compute (mostly used inside research, not marketing) is covered well by the OpenAI o1 system card and follow-up papers. Mixture-of-experts (MoE) is an architecture detail rather than a marketing term, though Mistral's Mixtral and Google's Switch Transformer work are public references. World models (in the sense Yann LeCun argues for, distinct from generative video models) and JEPA (Joint Embedding Predictive Architecture, also LeCun) are research positions, not products you can buy today.

We also left out the safety vocabulary (catastrophic risk, x-risk, AI safety, dangerous capabilities) because that is its own glossary and would dilute the practical focus here. The UK AI Safety Institute, the US AI Safety Institute (NIST), and the EU AI Act use overlapping but non-identical definitions; if you are working in a regulated context, read those primary sources rather than any vendor summary.

When in doubt, the single best move is to read the original paper. Almost every term on this page traces to a specific arXiv paper or a specific company blog post. The originals are usually clearer than the press coverage. If a term has no original source, that itself is informative.

Sources

  1. [01]

    Vaswani et al. 2017 'Attention Is All You Need' introduced the transformer architecture that underlies almost all modern LLMs.

    arxiv.org/abs/1706.03762

  2. [02]

    Brown et al. 2020 'Language Models are Few-Shot Learners' (GPT-3 paper) established the scaling-via-prompting paradigm.

    arxiv.org/abs/2005.14165

  3. [03]

    Kaplan et al. 2020 established empirical scaling laws for neural language models.

    arxiv.org/abs/2001.08361

  4. [04]

    Hoffmann et al. 2022 (Chinchilla) showed that prior LLMs were undertrained relative to compute-optimal scaling, revising Kaplan's coefficients.

    arxiv.org/abs/2203.15556

  5. [05]

    Wei et al. 2022 'Emergent Abilities of Large Language Models' is the foundational paper on emergence claims.

    arxiv.org/abs/2206.07682

  6. [06]

    Schaeffer et al. 2023 'Are Emergent Abilities of Large Language Models a Mirage?' (NeurIPS 2023 best paper) showed many emergence claims were metric artifacts.

    arxiv.org/abs/2304.15004

  7. [07]

    Wei et al. 2022 'Chain-of-Thought Prompting Elicits Reasoning in Large Language Models' introduced the canonical CoT prompting technique.

    arxiv.org/abs/2201.11903

  8. [08]

    Yao et al. 2022 'ReAct: Synergizing Reasoning and Acting in Language Models' established the standard reasoning-and-action agent pattern.

    arxiv.org/abs/2210.03629

  9. [09]

    Lewis et al. 2020 introduced retrieval-augmented generation (RAG) for knowledge-intensive NLP tasks.

    arxiv.org/abs/2005.11401

  10. [10]

    Hu et al. 2021 'LoRA: Low-Rank Adaptation of Large Language Models' is the dominant parameter-efficient fine-tuning method.

    arxiv.org/abs/2106.09685

  11. [11]

    Ouyang et al. 2022 (InstructGPT) is the canonical reference for RLHF as deployed in production language models.

    arxiv.org/abs/2203.02155

  12. [12]

    Christiano et al. 2017 introduced the foundational RLHF method using human preference comparisons.

    arxiv.org/abs/1706.03741

  13. [13]

    Bai et al. 2022 'Constitutional AI: Harmlessness from AI Feedback' (Anthropic) introduced the CAI training approach.

    arxiv.org/abs/2212.08073

  14. [14]

    Rafailov et al. 2023 'Direct Preference Optimization' provides a reward-model-free reformulation of RLHF that is now widely adopted.

    arxiv.org/abs/2305.18290

  15. [15]

    Schick et al. 2023 'Toolformer' was the early formulation of language models learning to call external APIs.

    arxiv.org/abs/2302.04761

  16. [16]

    Ji et al. 2023 'Survey of Hallucination in Natural Language Generation' is the canonical hallucination taxonomy reference.

    arxiv.org/abs/2202.03629

  17. [17]

    Turpin et al. 2023 showed that chain-of-thought explanations can be post-hoc rationalizations not reflecting the model's actual reasoning.

    arxiv.org/abs/2305.04388

  18. [18]

    Sainz et al. 2023 documented benchmark contamination in LLM evaluations, motivating held-out testing.

    arxiv.org/abs/2310.16787

  19. [19]

    Morris et al. 2023 (DeepMind) 'Levels of AGI' proposed a generality-by-performance framework to operationalize AGI discussions.

    arxiv.org/abs/2311.02462

  20. [20]

    Bommasani et al. 2021 (Stanford CRFM) introduced the term 'foundation model' for large pretrained models adaptable to many tasks.

    arxiv.org/abs/2108.07258

  21. [21]

    Model Context Protocol is Anthropic's open specification for connecting AI assistants to tools and data sources via standardized servers, launched November 2024.

    modelcontextprotocol.io

  22. [22]

    OpenAI's 2023 'Planning for AGI and beyond' essay defines AGI loosely as AI systems generally smarter than humans, without operational test criteria.

    openai.com/index/planning-for-agi-and-beyond

  23. [23]

    Anthropic's mechanistic interpretability publications including sparse autoencoder work on Claude 3 Sonnet are the primary public reference for circuit-level interpretability research.

    transformer-circuits.pub

  24. [24]

    US Executive Order 14110 made red-teaming and safety reporting requirements for frontier AI developers a regulatory expectation in the US until it was rescinded in January 2025.

    whitehouse.gov · Executive Order 14110 (October 2023)

  25. [25]

    Bostrom's 2014 book formalized the distinction between speed, collective, and quality superintelligence used in subsequent ASI discussions.

    Nick Bostrom · Superintelligence (Oxford University Press, 2014)

LAB · ATOMEONS · MARCO ISLAND FLÆONS RESEARCH · 12 PAPERS · CC-BY 4.0