The verify rule — three categories of trust
Not everything AI says needs verification. Most things don't. Knowing which third does is the skill.
::TL;DR · the whole lesson in three lines
- MOVE: Not everything AI says needs verification. Most things don't. Knowing which third does is the skill.
- DRILL: Categorize your last 10 AI uses into the three buckets. This is the meta-skill: knowing which mode you're in.
- WIN: You have a calibrated mental model of when to verify.
::concept · what's actually happening
Trust AI for: structure, format, drafting, brainstorming, summarizing what you wrote, rewriting your tone, translating, generating options to choose between. These are pattern tasks where being plausible IS being right.
Verify AI on: numbers, names, dates, citations, statistics, legal claims, medical claims, financial claims, anything you'll quote, anything that will go in front of a customer / boss / regulator. These are fact tasks where being plausible can be wrong.
Never trust AI for: passwords, account numbers, anything you can't undo if AI is wrong, decisions where the stakes are higher than the time you saved.
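The three lists above can be read as a "strictest bucket wins" rule. The following is a minimal sketch of that reading, not part of the lesson itself: the tag names, the `Bucket` enum, and the `bucket_for` function are all hypothetical labels chosen for illustration, and the choice to reject unrecognized tags rather than guess is an assumption of this sketch.

```python
from enum import Enum


class Bucket(Enum):
    TRUST = "trust"
    VERIFY = "verify"
    DONT_USE = "don't use"


# Hypothetical tags, paraphrased from the three lists in the concept section.
PATTERN_TASKS = {"structure", "format", "draft", "brainstorm",
                 "summarize-own-text", "rewrite-tone", "translate", "options"}
FACT_TASKS = {"number", "name", "date", "citation", "statistic",
              "legal", "medical", "financial", "quote", "external-audience"}
OFF_LIMITS = {"password", "account-number", "irreversible",
              "high-stakes-decision"}


def bucket_for(tags: set[str]) -> Bucket:
    """Return the strictest bucket that any tag of the task falls into."""
    known = PATTERN_TASKS | FACT_TASKS | OFF_LIMITS
    if not tags or not tags <= known:
        # Assumption of this sketch: an untagged or unknown task is a
        # prompt to think, so it is rejected instead of silently trusted.
        raise ValueError(f"unrecognized or missing tags: {tags - known}")
    if tags & OFF_LIMITS:
        return Bucket.DONT_USE
    if tags & FACT_TASKS:
        return Bucket.VERIFY
    return Bucket.TRUST
```

For example, a draft that contains a statistic carries both a pattern tag and a fact tag, so `bucket_for({"draft", "statistic"})` lands in VERIFY: the draft itself is fine, the number still needs a check.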
::drill · do the thing
Categorize your last 10 AI uses into the three buckets. This is the meta-skill: knowing which mode you're in.
::L5 drill · copy-paste into any AI chat
(This drill is on paper or a Note, not in the AI chat.)
List your last 10 AI uses (a best estimate is fine; it doesn't have to be perfect). Categorize each into one of three buckets:
1. TRUST: pattern task (drafting, structure, rewriting). I should not verify; verifying costs more time than the risk of being wrong justifies.
2. VERIFY: fact task (numbers, names, citations, claims). I should check before using.
3. DON'T USE AI FOR THIS: the decision is too high-stakes for AI alone.
For each VERIFY item, write the one specific check you should have done. For each DON'T USE item, name what you should do instead.
::steps
- 01 List 10 recent AI uses (or fewer if you don't have 10 yet; minimum 5).
- 02 Tag each TRUST / VERIFY / DON'T USE.
- 03 For VERIFY items: did you actually check?
- 04 For DON'T USE items: did AI subtly enter the decision anyway?
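If you keep the list in a file rather than on paper, the four steps can be tallied with a few lines of code. This is only a sketch under stated assumptions: the `AIUse` record, the `review` function, and the sample entries are hypothetical, made up for illustration rather than taken from anyone's real log.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AIUse:
    description: str
    bucket: str            # step 02: "TRUST", "VERIFY", or "DONT_USE"
    checked: bool = False  # step 03: did you actually check?


def review(uses):
    """Count uses per bucket and collect the items that need follow-up."""
    counts = Counter(u.bucket for u in uses)
    # Step 03: VERIFY items that were never checked.
    unchecked = [u.description for u in uses
                 if u.bucket == "VERIFY" and not u.checked]
    # Step 04: DON'T USE items where AI was used anyway.
    dont_use = [u.description for u in uses if u.bucket == "DONT_USE"]
    return counts, unchecked, dont_use


# Step 01: hypothetical sample entries.
log = [
    AIUse("rewrite a bio", "TRUST"),
    AIUse("look up a tax rate", "VERIFY", checked=False),
    AIUse("write a revenue query", "VERIFY", checked=True),
    AIUse("pick a supplement", "DONT_USE"),
]

counts, unchecked, dont_use = review(log)
print(dict(counts))
print("unchecked VERIFY items:", unchecked)
print("DON'T USE items:", dont_use)
```

The two lists at the end are the point of the drill: each unchecked VERIFY item needs one specific check written next to it, and each DON'T USE item needs an alternative.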
::worked example · what one real run looks like
See one real run before you try yours.
::what a real human filled in
My last 10 AI uses, roughly in order:
1. Asked ChatGPT to summarize a 40-page PDF for a client meeting in 20 min
2. Used Claude to draft a cold email to a VP at Anthropic
3. Asked Gemini what year Stripe was founded (told my coworker 2009)
4. Used Cursor to write a Postgres query for monthly recurring revenue
5. Asked ChatGPT for "the best pre-workout supplement for someone over 35"
6. Used Claude to rewrite my LinkedIn bio
7. Asked Perplexity what California's current capital gains tax rate is
8. Used ChatGPT to translate a Spanish text message from a contractor
9. Asked Claude to explain a clause in my apartment lease
10. Used Midjourney to generate a hero image for my landing page
::what the AI returned
Good list. Real uses, not hypotheticals. Here's the split:
**TRUST (used it raw, fine to keep doing that):**
- #2 cold email draft: voice/style work, you're the final reader, low blast radius
- #6 LinkedIn bio rewrite: same category, you read it before posting
- #8 Spanish translation: short, conversational text; you can sanity-check the vibe even without fluent Spanish. (If it were a legal notice from the contractor, this moves to VERIFY.)
- #10 Midjourney hero image: aesthetic output, you decide if it looks right with your own eyes
**VERIFY (you should have checked before acting, may not have):**
- #1 PDF summary for client meeting: this is the dangerous one. 40 pages compressed to a summary in 20 min means you walked into a client meeting trusting a model's pick of what mattered. At minimum: skim the exec summary + any section the AI flagged + anything with numbers. Going forward, paste the summary back and ask "what did you leave out that a sharp client might ask about?"
- #4 Postgres MRR query: SQL runs. Run it on a known slice and check the number against something you already know (last month's total, a single customer). MRR queries famously double-count upgrades or miss pauses.
- #7 California cap gains rate: Perplexity is closer to OK here because it cites, but you still click the cite. Tax rates move and models confidently quote old ones.
- #9 apartment lease clause: Claude is decent at this but "explain" can drift into "advise," and lease language is jurisdiction-specific. Cross-check with the actual statute or a 10-min call to a tenant hotline if the clause matters to money.
**DON'T USE (wrong tool for the job):**
- #3 Stripe founding year told to coworker as fact: you stated a year you didn't verify. Stripe was founded in 2010, not 2009. This is the textbook bad pattern: low-stakes-feeling fact, zero friction to check, you became the source of a wrong claim. One Wikipedia tab would have caught it.
- #5 "best pre-workout for someone over 35": health + supplements + personalized recommendation = three failure modes stacked. Models will confidently name brands and dosages without knowing your meds, BP, caffeine tolerance, or what "best" even means to you. Ask a pharmacist or your doctor; use AI to prep questions, not to pick the product.
**The pattern in your 10:** You trust AI correctly when you're the final filter (writing, images, casual translation). You over-trust it when speed pressure is on (#1, #3) and when the topic feels small but the consequence is real (#3, #5). The verify rule isn't "check everything"; it's "check when a wrong answer costs more than the 60 seconds it takes to check."
One concrete change for next week: before quoting any AI-produced fact to another human, open one tab and confirm. That single habit kills most of the bad uses on this list.
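As an editorial aside on #4: the "run it on a known slice" check can be sketched in a few lines. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for Postgres, and the `subscriptions` table, its columns, the sample rows, and the `NAIVE` query are invented, not taken from the run above.

```python
import sqlite3

# In-memory SQLite stands in for the real Postgres database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions "
    "(customer TEXT, month TEXT, amount_cents INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?, ?, ?)",
    [
        ("acme", "2024-05", 5000, "active"),  # old plan
        ("acme", "2024-05", 9000, "active"),  # upgrade in the same month
        ("bolt", "2024-05", 3000, "paused"),
        ("cora", "2024-05", 2000, "active"),
    ],
)

# Hypothetical AI-written query: sums every row for the customer,
# so an upgrade within the month is counted twice.
NAIVE = (
    "SELECT COALESCE(SUM(amount_cents), 0) "
    "FROM subscriptions WHERE customer = ?"
)


def mrr_cents(conn, query, customer):
    """Run the query for a single customer and return the total in cents."""
    return conn.execute(query, (customer,)).fetchone()[0]


known = 9000  # what you already know acme pays per month, in cents
result = mrr_cents(conn, NAIVE, "acme")
matches = result == known
print(f"query says {result}, known value is {known}, match: {matches}")
```

The mismatch on one customer you already know is the signal to stop and fix the query before trusting its total across all customers.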
::what to notice
- The AI opens with 'Good list. Real uses, not hypotheticals.' — small validation that doesn't oversell, then gets straight to the categorization without preamble.
- Notice how #8 (Spanish translation) gets a conditional caveat in parentheses: 'If it were a legal notice from the contractor, this moves to VERIFY.' Real categorization isn't rigid — context shifts the bucket.
- The Stripe year correction (#3) is specific and falsifiable: 'Stripe was founded in 2010, not 2009.' The AI committed to a number rather than hedging. That's the kind of grounded correction the drill is supposed to produce.
- The closing pattern analysis ('You trust AI correctly when you're the final filter... You over-trust it when speed pressure is on') is the load-bearing insight — it turns 10 data points into a behavioral rule the user can actually carry forward.
- Small imperfection: the line 'MRR queries famously double-count upgrades or miss pauses' is a slightly out-of-place flex — true and useful, but jammed into a bullet that was already making its point. Reads like a real AI that couldn't resist adding the extra detail.
::outcome · what should be true
- You have a calibrated mental model of when to verify.
- You have at least one VERIFY item you wish you had checked.
- You stop verifying drafts (waste of time) and start verifying numbers (saves your job).
::trap · the most common failure
Verifying everything. That's the same mistake as verifying nothing — both make you slow without making you right. The skill is the category.
::part of the AtomEons /learn curriculum · 45 lessons · 5 levels · cc-by 4.0