::synthesis · Tim-Ferriss method
Agents · the trapdoor
::minimum effective dose
An agent is an LLM in a loop with tools. The loop: model proposes an action (call a tool, write a file, hit an API), system executes it, result feeds back into the next prompt, repeat until the model says done or a max-steps cap fires. That's it. The trapdoor: agents are dramatically harder to operate than single-shot LLM calls, and the failure modes are non-obvious. (1) Context bloat — each tool call appends output to the next call, so by step 10 your context is 80% tool noise. (2) Recursive errors — a small misstep on step 2 compounds, and by step 8 the agent is confidently solving a problem you didn't ask. (3) Cost explosions — what was a $0.05 single call is a $4 agent run, and you don't notice until the bill. (4) Loops — agents stuck retrying the same failed approach without recognizing the pattern. The honest practitioner stance: agents work GREAT for narrow, well-defined, tool-rich workflows with clear stopping criteria (file refactoring, web research with citations, data extraction across many sources). They work POORLY as 'general-purpose autonomous workers' — the marketing version. The MED: build the smallest possible agent. One model, three tools, hard max-steps cap, human-in-the-loop checkpoint at each major decision, full conversation transcript logging. Add autonomy only after you've proven the bounded version works. Most 'agent failures' are failures to bound the agent.
::DiSSS · deconstruction questions
- 01What's the SINGLE clear stopping criterion — and can the agent tell when it's met?
- 02What tools does the agent actually need — and have I removed every tool it doesn't need?
- 03What's the max-step cap, max-token cap, max-cost cap, and what happens when each fires?
- 04Where are the human-in-the-loop checkpoints, and are they enforced or skippable?
- 05What does the full transcript look like — can I debug a failure post-hoc, or is it a black box?
::fear-setting
Cost of not learning this: you'll deploy something marketed as 'agent' and discover it racked up a $400 bill solving the wrong problem, or worse, took destructive actions (sent emails, modified files, made purchases) you didn't intend. Cost of getting it wrong: this is the single most dangerous category in practical AI right now. Agents with write access, network access, or financial access can cause real harm — accidentally delete production data, send wrong emails to your customer list, file wrong tickets, post wrong updates. The marketing pitches autonomy; the engineering reality requires paranoia. Every operator who has shipped an agent has at least one story of 'it did something I didn't expect.' The ones who haven't ruined something yet have safety rails. The ones who don't have safety rails are one prompt away from a story they don't want.
::80 / 20 cut
SKIP: agent frameworks with deep abstractions (LangGraph, CrewAI, AutoGen) until you've built a bare loop yourself and felt the failure modes. OBSESS OVER: (1) bounded agents with hard caps on steps, tokens, cost, (2) tool minimization — every tool is an attack surface and a decision space, (3) transcript logging — you can't debug what you can't see. Build the smallest agent that handles your bounded task; resist the framework gravity until you've earned the complexity.
::tribe of mentors · paraphrased stances
Anthropic engineering team
Authors of 'Building Effective Agents' essay (2024), the most operator-grounded agent guidance published
Anthropic's stance: most problems that look like 'agent problems' are better solved by workflows — structured chains of LLM calls with deterministic glue. Reserve full agents for problems where the steps genuinely cannot be planned in advance.
Harrison Chase
Co-founder of LangChain, has shipped more agent infrastructure than almost anyone
Harrison's stance: the gap between an agent demo and a production agent is enormous. Demos work because the happy path is constructed; production breaks because the unhappy paths multiply. Invest accordingly.
Andrew Ng
Founded DeepLearning.AI, taught the most-watched agent course in 2024
Andrew's stance: the four agentic design patterns — reflection, tool use, planning, multi-agent — give large quality gains, but they also multiply cost and latency. Use them where the gain pays for the cost; don't apply them by default.
Simon Willison
Built and dissected dozens of agent demos publicly, honest about failure modes
Willison's stance: agents are the most over-promised, under-delivered category in AI right now. Bounded tool-use loops are real and useful; 'autonomous AI workers' is mostly marketing. Operators should be skeptical and bound everything.
::real-world test · this week
This week: build the simplest possible agent — one model, one tool (web search), one task (research and cite three sources for a question you actually have). Add a hard cap: max 8 steps, max $1 in spend. Log the full transcript. Run it. Read the transcript end to end. Notice where the model wandered, where it duplicated effort, where it almost made a wrong decision. This twenty-minute exercise teaches more about agents than ten hours of framework tutorials.
::action items · ranked
- 01Build one bare-metal agent from scratch (no framework) before touching LangGraph, CrewAI, or AutoGen
- 02Add hard caps to every agent: max-steps, max-cost, max-tokens, max-wall-time — enforced, not advisory
- 03Log full transcripts of every agent run and review the first 10 runs end to end before any unattended execution
- 04Strip tools to the minimum the agent needs; remove every tool that isn't load-bearing for the task
- 05Add human-in-the-loop checkpoints for any irreversible action (send, delete, post, purchase) — no exceptions