L10 · Operator · ~30 min · free · CC BY 4.0

Local AI · Ollama — privacy, offline, and the limit of free

At Operator level you need an honest opinion about local-only AI. Even if you don't use it daily, you should have run it once.

::TL;DR · the whole lesson in three lines

MOVE · At Operator level you need an honest opinion about local-only AI. Even if you don't use it daily, you should have run it once.
DRILL · Install Ollama, download one small model, run one local chat. Even if you never use it again, you'll have an opinion.
WIN · You have a working local AI on your machine.

::concept · what's actually happening

Local AI = you download a model and run it on your own laptop or desktop. After the one-time download, your prompts and the model's replies never leave your machine. No subscription. No phone-home. The privacy posture is total.

Tradeoffs: quality is below the top cloud models (Claude / GPT / Gemini), but the gap is closing fast. Speed depends on your hardware. Setup is an installer plus one terminal command (Ollama makes this trivial). Hardware: a modern laptop with 16+ GB RAM can run 7B–14B parameter models usefully.
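
A rough way to sanity-check the hardware claim: the memory a model needs is roughly its parameter count times the bytes stored per weight. The sketch below assumes 4-bit quantized weights and a 20% allowance for context and runtime overhead. Both figures are illustrative assumptions, not measurements, and `estimate_model_gb` is a hypothetical helper name.

```python
def estimate_model_gb(params_billion: float,
                      bits_per_weight: float = 4.0,
                      overhead: float = 0.2) -> float:
    """Rule-of-thumb memory estimate in GB (1 GB = 1e9 bytes).

    bits_per_weight and overhead are assumed values for illustration.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead) / 1e9


# Compare a few model sizes against the RAM you actually have.
for size in (3, 7, 14):
    print(f"{size}B parameters: ~{estimate_model_gb(size):.1f} GB")
```

Under these assumptions a 14B model needs well under 16 GB, which is why that RAM figure is a sensible floor. The operating system and your other apps need room too, so treat the estimate as a lower bound.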

When to use local: confidential drafting, offline travel, journaling, anything you genuinely don't want a third party to see. When NOT to use local: complex reasoning, the latest models, large context windows.

::drill · do the thing

Install Ollama, download one small model, run one local chat. Even if you never use it again, you'll have an opinion.

::L10 drill · run in your terminal

(This drill is in your terminal, not in a browser. If "terminal" is new to you, read this lesson and skip the drill. Come back to it once you are comfortable running commands.)

1. Go to ollama.com. Download the installer for your OS.
2. Open Terminal (Mac) / PowerShell (Windows) / Terminal (Linux).
3. Run: ollama pull llama3.2:3b
4. Wait 2–5 minutes for the download (~2 GB).
5. Run: ollama run llama3.2:3b
6. You're now chatting with a local model. Type something. Press enter.
7. Try a real task you'd normally run on Claude / ChatGPT. Notice the speed, the quality gap, the privacy difference.
8. To exit: type /bye and press enter.
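
Once the chat works, the same local model can also be reached from a script. The following is a minimal sketch, assuming the Ollama server is running in the background on its default port 11434 and that `llama3.2:3b` has already been pulled. `build_request` and `ask_local` are hypothetical helper names, and the request still goes only to your own machine.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.2:3b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3.2:3b", timeout: float = 120.0) -> str:
    """Send the prompt to the local model and return its reply text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

With Ollama running, `print(ask_local("Explain what a local model is in one sentence."))` should print a reply. If the server is not running, the call raises a connection error instead.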

::steps

01 · Install Ollama from ollama.com.
02 · Pull llama3.2:3b (small, fast, ~2 GB download).
03 · Run one real task locally.
04 · Open Claude or ChatGPT in another tab. Run the same task there.
05 · Note the differences: quality, speed, privacy.
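
To put a number on the speed part of the comparison, you can use the timing fields that Ollama's HTTP API returns with a non-streaming generate reply: `eval_count` (tokens generated) and `eval_duration` (nanoseconds). The sketch below assumes you already have such a reply as a Python dict. `tokens_per_second` is a hypothetical helper, and the sample values are invented for illustration.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from an Ollama /api/generate reply.

    eval_count is the number of generated tokens;
    eval_duration is the generation time in nanoseconds.
    """
    seconds = response["eval_duration"] / 1e9
    if seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / seconds


# Hypothetical numbers for illustration, not a real measurement.
sample = {"eval_count": 120, "eval_duration": 4_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Cloud chat interfaces do not expose the same fields, so for the cloud side a stopwatch and a rough word count is usually the practical comparison.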

::outcome · what should be true

You have a working local AI on your machine.
You have a calibrated opinion on the cloud-vs-local tradeoff for YOUR work.
You have run one task that you'd never have run on cloud AI (something sensitive).

::trap · the most common failure

Trying to make local AI your primary tool when you don't need that level of privacy. Cloud AI is better for most tasks. Local is a tool for specific situations, not a religion.

::next lesson →

L11 · Model routing — switching between Claude, GPT, Gemini mid-task

Operators don't pick one AI. They route each task to the model that does it best. Knowing the strengths is the skill.

~25 min
