L45 · Pilot · ~30 min · free · CC BY 4.0

Open weights vs closed weights

When the model file is on your machine, the rules change · know what you gain, what you give up, and what stays the same.

::TL;DR · the whole lesson in three lines

MOVE · When the model file is on your machine, the rules change · know what you gain, what you give up, and what stays the same.
DRILL · You will pick one of your current workflows, project where it would land on the open-vs-closed split, and run it both ways for honest comparison.
WIN · You ran the same task on both an open-weight and closed-weight model.

::concept · what's actually happening

Closed-weight models (Claude, GPT, Gemini) live on the provider's servers · you send a request, they send a response, the weights never leave their machines. You pay per token, you trust their privacy policy, you depend on their uptime, and you get the model they decide to run today.

Open-weight models (Llama, Qwen, Mistral, DeepSeek, others) publish the actual weight files · you download them, you run them on your hardware, and the model sits on your machine the same way a PDF does. Capability typically lags the frontier by 6 to 18 months, but the gap is narrowing.

The structural wins of open weights are sovereignty (the model cannot be silently changed or deprecated under you), privacy (the data never leaves your machine), and zero marginal cost after hardware is paid for (you are not paying per token, you are paying per electron). For some workloads these win decisively.
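
The "zero marginal cost" claim is easier to judge with a break-even number. A minimal sketch, assuming you can estimate three figures yourself: what the hardware cost, what the API charges per million tokens, and what electricity costs you per million tokens generated locally. The function names and the example numbers are illustrative placeholders, not real prices.

```python
def break_even_tokens(hardware_cost: float,
                      api_price_per_mtok: float,
                      power_cost_per_mtok: float) -> float:
    """Tokens you must process locally before the hardware has paid for itself.

    All inputs are estimates you supply; prices are per million tokens.
    """
    saving_per_mtok = api_price_per_mtok - power_cost_per_mtok
    if saving_per_mtok <= 0:
        # Local power per token costs at least as much as the API: no break-even.
        return float("inf")
    return hardware_cost / saving_per_mtok * 1_000_000


def months_to_break_even(tokens_needed: float, tokens_per_month: float) -> float:
    """Convert the break-even token count into months at your monthly volume."""
    if tokens_per_month <= 0:
        return float("inf")
    return tokens_needed / tokens_per_month


# Illustrative numbers only.
tokens = break_even_tokens(hardware_cost=2000.0,
                           api_price_per_mtok=5.0,
                           power_cost_per_mtok=1.0)
print(f"break-even after {tokens:,.0f} tokens")
print(f"months at 50M tokens/month: {months_to_break_even(tokens, 50_000_000):.1f}")
```

This leaves out the engineering time you spend maintaining the stack, so treat the result as a lower bound on the real break-even point.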

The structural costs are real · you maintain the inference stack, you debug your own GPU memory issues, you do not get model upgrades for free, and you carry the responsibility for what the model does on your hardware. The hidden engineering tax is the part that surprises operators.

The practical 2026 split looks like this · closed weights for frontier reasoning and agentic work, where capability matters and privacy is acceptable; open weights for high-volume routine tasks, privacy-critical work, and offline operation. Most serious operators run both.
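
That split can be written down as a small decision rule. A sketch only: the `Workload` fields and the `suggest_side` function are hypothetical names, and the ordering (hard constraints before capability) is one reasonable reading of the split, not a fixed policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    needs_frontier_reasoning: bool
    privacy_critical: bool
    must_run_offline: bool
    high_volume_routine: bool


def suggest_side(w: Workload) -> str:
    """Return "open", "closed", or "either" for one workload."""
    # Hard constraints first: if data cannot leave the machine or there is
    # no network, a hosted model is not an option, whatever the capability gap.
    if w.privacy_critical or w.must_run_offline:
        return "open"
    # Capability matters and privacy is acceptable: use the frontier model.
    if w.needs_frontier_reasoning:
        return "closed"
    # Routine work at volume is where per-token pricing hurts most.
    if w.high_volume_routine:
        return "open"
    return "either"


for w in [
    Workload("agentic code refactor", True, False, False, False),
    Workload("tagging support tickets", False, False, False, True),
    Workload("summarising patient notes", True, True, False, False),
]:
    print(w.name, "->", suggest_side(w))
```

A workload that is both privacy-critical and reasoning-heavy lands on "open" here, which means accepting the capability gap · that is the case worth testing hardest on your own data.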

::drill · do the thing

You will pick one of your current workflows, project where it would land on the open-vs-closed split, and run it both ways for honest comparison.

::L45 drill · copy-paste into any AI chat

I want to honestly evaluate one of my current AI workflows against open-weight alternatives. The workflow: [DESCRIBE · e.g. 'I use Claude Sonnet to draft replies to customer support emails, ~40 per day']. Walk me through: 1) what specific open-weight model would I try as the closest substitute (name a version that runs reasonably on [YOUR HARDWARE])? 2) where will the capability gap likely show up · which kinds of inputs will the open model handle worse? 3) what does the privacy / cost / latency comparison look like in real numbers for my volume? 4) is there a hybrid · open-weight handles the easy 80%, closed-weight escalates the hard 20%? 5) realistic verdict: should I make this switch, run a pilot, or stay where I am? Do not flatter open weights if they do not win for my use case.
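
The hybrid in question 4 of the prompt could look roughly like this. A minimal sketch under stated assumptions: both models are plain functions from prompt to text, and `looks_ok` is a hypothetical check you write yourself (length, required fields, a keyword filter, whatever fits your workflow). None of these names come from a real library.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]        # hypothetical interface: prompt in, text out
Check = Callable[[str, str], bool]  # (prompt, draft) -> True if the draft is good enough


def answer_with_escalation(prompt: str,
                           open_model: Model,
                           closed_model: Model,
                           looks_ok: Check) -> Tuple[str, str]:
    """Try the open model first; escalate to the closed model if the draft fails.

    Returns (route, text) where route is "open" or "closed".
    """
    try:
        draft = open_model(prompt)
    except Exception:
        draft = ""  # treat a local failure like a bad draft and escalate
    if draft and looks_ok(prompt, draft):
        return "open", draft
    return "closed", closed_model(prompt)


def escalation_rate(routes: List[str]) -> float:
    """Fraction of requests that ended up on the closed model."""
    if not routes:
        return 0.0
    return sum(1 for r in routes if r == "closed") / len(routes)
```

Tracking `escalation_rate` over a week of real traffic tells you whether the easy share is really near 80%. Escalation sends the prompt to the provider, so this pattern does not fit workloads where the data must stay on your machine.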

::steps

01 · Pick one workflow with enough volume that the choice matters.
02 · Run the prompt and note the open-weight candidate model it names.
03 · Pull that model via Ollama (or your local stack of choice).
04 · Run the same 5 real inputs through both your closed model and the open one.
05 · Note where they agree, where they disagree, and where one fails outright.
06 · Decide: full switch, hybrid escalation, or stay closed.
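
Steps 04 and 05 can be sketched as a small harness. Assumptions, stated loudly: each model is wrapped as a plain function from prompt to text; `ollama_model` assumes a local Ollama server on its default port and the request shape of its generate endpoint, which you should verify against the Ollama docs for your version; the closed-model wrapper is left for you to write with your provider's client. Exact-match "agree" is deliberately crude, so most free-text pairs will land in "needs_review" for you to read yourself.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Optional

Model = Callable[[str], str]  # hypothetical interface: prompt in, text out


@dataclass
class Row:
    prompt: str
    closed_output: Optional[str]
    open_output: Optional[str]
    verdict: str  # agree / needs_review / open_failed / closed_failed / both_failed


def ollama_model(name: str, host: str = "http://localhost:11434") -> Model:
    """Wrap a locally pulled model as a Model (assumed endpoint, verify locally)."""
    def call(prompt: str) -> str:
        body = json.dumps({"model": name, "prompt": prompt, "stream": False})
        req = urllib.request.Request(
            f"{host}/api/generate",
            data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=300) as resp:
            return json.loads(resp.read())["response"]
    return call


def safe_call(model: Model, prompt: str) -> Optional[str]:
    """Return the model's text, or None if it raised or returned nothing."""
    try:
        out = model(prompt)
    except Exception:
        return None
    return out if out and out.strip() else None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def compare(prompts: List[str], closed_model: Model, open_model: Model) -> List[Row]:
    """Run every prompt through both models and label each pair of outputs."""
    rows = []
    for p in prompts:
        c = safe_call(closed_model, p)
        o = safe_call(open_model, p)
        if c is None and o is None:
            verdict = "both_failed"
        elif c is None:
            verdict = "closed_failed"
        elif o is None:
            verdict = "open_failed"
        elif normalize(c) == normalize(o):
            verdict = "agree"
        else:
            verdict = "needs_review"
        rows.append(Row(p, c, o, verdict))
    return rows
```

Feed `compare` your 5 real inputs, then read every "needs_review" row by hand · the point of the drill is your judgment on your data, not an automated score.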

::outcome · what should be true

You ran the same task on both an open-weight and closed-weight model.
You can articulate the capability gap on YOUR data, not in the abstract.
You decided open / closed / hybrid for that workflow with evidence.
You understand the engineering tax you would carry if you went local-only.

::trap · the most common failure

Operators read about open-weight progress and assume the gap is closed · then run a 7B model on a reasoning task and watch it confidently produce wrong answers. The gap exists, it just lives in specific places. Test on your data, not on benchmarks.

::end of the curriculum

You're at Pilot level. There's no Level 6.

The next move is doing the work, not another lesson. If you want operator-grade infrastructure, that's /orangebox. If you want the lab's working journal, /founders-view. If you want to collaborate on the curriculum itself, the source is public on GitHub.

::other lessons at Pilot level

Outgrowing the chat box — when chat isn't the right surface anymore

Receipts and paper trail — audit your own AI use

AI for kids and teachers — the next-generation curriculum

The senior-engineer pattern — talk to AI like a senior

Long-context strategy: when 200K is right, when chunking wins

AI receipts: building your own audit trail

Voice cloning: ethics and practical workflows