Local AI · Ollama — privacy, offline, and the limit of free
At Operator level you need an honest opinion about local-only AI. Even if you don't use it daily, you should have run it once.
::TL;DR · the whole lesson in three lines
- MOVE · At Operator level you need an honest opinion about local-only AI. Even if you don't use it daily, you should have run it once.
- DRILL · Install Ollama, download one small model, run one local chat. Even if you never use it again, you'll have an opinion.
- WIN · You have a working local AI on your machine.
::concept · what's actually happening
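Local AI = you download a model and run it on your own laptop or desktop. Once the model is downloaded, your prompts and the model's replies stay on your machine. No subscription. No third-party API in the loop. For the content of your chats, the privacy posture is about as strong as it gets.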
Tradeoffs: quality is below the top cloud models (Claude / GPT / Gemini), but the gap is closing quickly. Speed depends on your hardware. Setup is an installer plus one terminal command (Ollama makes this trivial). Hardware: a modern laptop with 16+ GB RAM can run 7B–14B parameter models usefully.
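To see why 16 GB is a comfortable floor for 7B–14B models, a back-of-envelope estimate helps. This is a rough sketch, not a measurement: it assumes the weights are stored at 4 bits each (a common quantization for local models) and adds a flat 20% overhead for context and runtime. Both numbers, and the function name `estimate_model_ram_gb`, are illustrative assumptions.

```python
def estimate_model_ram_gb(params_billion: float,
                          bits_per_weight: int = 4,
                          overhead: float = 1.2) -> float:
    """Rough RAM needed to hold a model's weights, in GB.

    bits_per_weight=4 and overhead=1.2 are illustrative assumptions,
    not figures published by Ollama or any model vendor.
    """
    bytes_per_weight = bits_per_weight / 8
    # 1e9 parameters * bytes per weight / 1e9 bytes per GB
    weights_gb = params_billion * bytes_per_weight
    return round(weights_gb * overhead, 1)


for size in (3, 7, 14):
    print(f"{size}B model: ~{estimate_model_ram_gb(size)} GB")
# 3B model: ~1.8 GB
# 7B model: ~4.2 GB
# 14B model: ~8.4 GB
```

Under these assumptions a 14B model needs roughly half of a 16 GB machine, which leaves room for the operating system and your other apps. Real usage varies with the quantization format and how much context you feed the model.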
When to use local: confidential drafting, offline travel, journaling, anything you genuinely don't want a third party to see. When NOT to use local: complex reasoning, the latest models, large context windows.
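The rule of thumb above can be written down as a tiny decision helper. This is only a sketch of the reasoning: the function name `choose_backend` and its flags are hypothetical, and the tie-break (a privacy or offline need beats a quality need) is an assumption of this example rather than something the lesson states.

```python
def choose_backend(sensitive: bool = False,
                   offline: bool = False,
                   complex_reasoning: bool = False,
                   large_context: bool = False) -> tuple[str, str]:
    """Return (backend, reason) for one task."""
    must_stay_local = sensitive or offline
    strains_local = complex_reasoning or large_context

    if must_stay_local and strains_local:
        # Assumed tie-break: the privacy or offline constraint wins.
        return ("local", "constraint wins; expect a quality gap")
    if must_stay_local:
        return ("local", "data stays on your machine")
    # No constraint: cloud models are stronger for most tasks.
    return ("cloud", "no privacy or offline constraint")


print(choose_backend(sensitive=True))          # journaling, confidential drafts
print(choose_backend(complex_reasoning=True))  # hard analysis, nothing sensitive
```

The default branch returns cloud on purpose: local is the exception you reach for when a specific constraint applies.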
::drill · do the thing
Install Ollama, download one small model, run one local chat. Even if you never use it again, you'll have an opinion.
::L10 drill · run in your terminal
(This drill is in your terminal, not in a browser. If "terminal" is new to you, read this lesson and skip the drill; come back to it at Operator level.)
1. Go to ollama.com. Download the installer for your OS.
2. Open Terminal (Mac) / PowerShell (Windows) / Terminal (Linux).
3. Run: ollama pull llama3.2:3b
4. Wait 2–5 minutes for the download (~2 GB).
5. Run: ollama run llama3.2:3b
6. You're now chatting with a local model. Type something. Press enter.
7. Try a real task you'd normally run on Claude / ChatGPT. Notice the speed, the quality gap, the privacy difference.
8. To exit: type /bye and press enter.
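Once the terminal chat works, the same model can also be reached from a script, because Ollama runs a small HTTP server on your machine. The sketch below uses only the Python standard library. It assumes the server is listening on its default address, http://localhost:11434, and that llama3.2:3b has already been pulled. The helper names `build_request` and `ask_local` are made up for this example.

```python
import json
import urllib.request

# Assumed default address of the local Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2:3b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3.2:3b", timeout: float = 120.0) -> str:
    """Send one prompt to the local model and return the reply text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the full reply arrives in the "response" field.
    return body["response"]
```

With Ollama running, `print(ask_local("Summarize this note in one line: ..."))` sends the prompt to the model on your machine. The request goes to localhost, so the text is not sent to a third party.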
::steps
- 01 Install Ollama from ollama.com.
- 02 Pull llama3.2:3b (small, fast, ~2 GB download).
- 03 Run one real task locally.
- 04 Open Claude or ChatGPT in another tab. Run the same task there.
- 05 Note the differences: quality, speed, privacy.
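For the speed part of the comparison, a number beats a feeling. When you call Ollama's local HTTP API without streaming, the response body includes `eval_count` (tokens generated) and `eval_duration` (time spent generating, in nanoseconds). The sketch below turns those two fields into tokens per second. The function name and the sample values are made up for illustration; they are not a benchmark.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed computed from an Ollama generate response body."""
    eval_count = response["eval_count"]           # tokens generated
    eval_duration_ns = response["eval_duration"]  # nanoseconds
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


# Illustrative values only: 120 tokens generated in 4 seconds.
sample = {"eval_count": 120, "eval_duration": 4_000_000_000}
print(round(tokens_per_second(sample), 1))  # 30.0
```

Run the same prompt a few times and note the figure next to your impression of quality. That gives you a concrete baseline for your own hardware.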
::outcome · what should be true
- You have a working local AI on your machine.
- You have a calibrated opinion on the cloud-vs-local tradeoff for YOUR work.
- You have run one task that you'd never have run on cloud AI (something sensitive).
::trap · the most common failure
Trying to make local AI your primary tool when you don't need that level of privacy. Cloud AI is better for most tasks. Local is a tool for specific situations, not a religion.
::other lessons at Operator level
Model routing — switching between Claude, GPT, Gemini mid-task
Operators don't pick one AI. They route each task to the model that does it best. Knowing the strengths is the skill.
MCP servers — the plug socket that turned AI into a real tool
Model Context Protocol is the standard plug. Knowing what plugs in changes what your AI can actually touch — your files, your inbox, your calendar, your repos.
Agent mode — when AI takes action, not just answers
The frontier of useful AI is agents that DO things — browse, click, file, send. The actual skill is the safety pattern, not the magic.
Computer use — when AI takes the mouse and keyboard
Claude in Chrome, ChatGPT Atlas, computer-use beta — the frontier is AI that drives your browser like a human. Knowing the safety pattern is the actual skill.
What AI cannot replace — taste, judgment, relationships
The operators winning in 2026 are the ones who learned what AI is for and what is theirs. Knowing the line is more valuable than any prompt.
Agents 101: model plus tools plus loop
An agent is a model with tools running in a loop until done · know when you need one and when you don't.
MCP: structured tools for AI
Model Context Protocol is the USB-C of AI tooling · learn the shape before you wire anything.
Skill primers: teach a session your context in 30 seconds
A skill is a reusable file that primes a fresh AI session with your project, voice, and rules · stop re-explaining yourself.
Local models with Ollama
Run Llama, Qwen, or Mistral on your own laptop · no API, no logs, no monthly bill for the work that should stay home.
Vision models: when to use them
Vision lets the model see images · powerful for screenshots and diagrams · weak for precise spatial work · know the line.
Audio and Whisper transcription
Whisper turns audio into text · meetings, voice memos, interviews · the AI-era replacement for note-taking.
RAG vs long context: when to retrieve, when to dump
RAG fetches the right slice of your data at query time · long context stuffs everything in · know which problem you actually have.
Embeddings: meaning as numbers
An embedding is a list of numbers that captures the meaning of text · learn the shape and you unlock semantic search, deduplication, and clustering.
Fine-tuning vs prompt engineering
For individuals, fine-tuning is almost never worth it · know exactly when it actually is.
AI safety in personal use
PII, NDAs, financial data, and other people's secrets · know the rules of what you do not paste.
Multimodal prompting: combining text, image, audio
The strongest prompts use the medium that fits the question · sometimes you describe, sometimes you show, sometimes you do both.
Chain-of-thought: making the model show its work
Asking the model to reason step-by-step before answering raises accuracy on hard problems · know when it earns its cost.
Tool use and structured output
Function calling makes the model return JSON your code can use · know the contract before you build on it.
Cost optimization: tokens, caching, model selection
AI is metered · the operators who stay profitable measure what they spend and choose the model that fits the task.
::part of the AtomEons /learn curriculum · 45 lessons · 5 levels · cc-by 4.0