
::calculator · When does buying local hardware beat paying per-token to the cloud?
Local vs Cloud Break-Even
::inputs
Your last month's bill from Anthropic, OpenAI, Google, etc. combined.
Used 4090 ~$1,800. New 5090 ~$2,800. Dual 3090 build ~$2,400.
U.S. residential avg ~16¢. Check your utility bill — varies 9¢ (WA) to 35¢ (HI/CA).
4090 ~350W, 5090 ~450W, 3090 ~350W. Idle is much lower — this assumes sustained inference.
Be honest. Most solo builders inference 2-8 hrs/day; agent-loops can run 24/7.
::result
Monthly local electricity cost
$7.56
Months to break even
4.1
Net savings over 3 years
$15,728
::how this calculates
Monthly electricity is GPU watts times hours per day times 30 days, converted from watt-hours to kilowatt-hours, then multiplied by your rate. Months to break even is hardware cost divided by the monthly savings (cloud bill minus local electricity). Three-year savings is 36 months of that savings minus the hardware cost. If your local electricity already exceeds your cloud bill, the calculator returns a negative break-even, meaning local never pays off at this usage level.
::worked examples
Solo dev on Claude 3.5 Sonnet, mid-tier 4090 rig
$500/mo cloud, $2,000 used 4090, 12¢ electricity, 6hrs/day inference. Monthly electricity ~$7.56, savings ~$492/mo, breaks even in ~4 months, saves ~$15,720 over 3 years.
Heavy agent-loop user on dual GPT-4o pipelines
$2,500/mo across multiple agents, dual-3090 server build, U.S. avg rate, near-24/7 use. Electricity climbs to ~$60/mo but cloud savings of $2,440/mo break even in ~2 months. Three-year net: ~$83,000.
Light user, expensive California electricity
Only $75/mo cloud spend, 30¢/kWh PG&E rates, 4hrs/day. Electricity is ~$12.60/mo, savings ~$62/mo, break-even ~40 months. Local hardware does not pay off before the GPU is obsolete. Stay cloud.
RAG-pipeline shop running Llama 3.3 70B locally
$1,200/mo previous cloud bill, new 5090, cheap industrial rate, 12hrs/day. Electricity ~$17.82/mo, monthly savings ~$1,182, break-even ~2.5 months, 3-year net savings ~$39,500.
::what this does NOT capture
- ○GPU runs at stated wattage for the full hours-per-day window. Real workloads cycle between idle (~30W) and full load, so actual electricity is usually 30-50% lower than this estimate.
- ○Local model quality is assumed adequate for the workload. Frontier reasoning tasks (Claude 3.5 Sonnet, GPT-4o, o1) still beat open-weight 70B models on hard benchmarks — if you need that ceiling, the cloud cost is not optional.
- ○Hardware cost is treated as sunk at month zero. No depreciation, no resale value, no warranty replacement, no PSU/case/cooling/RAM accounted for separately — assume the GPU price includes the marginal upgrade to an existing rig.
- ○Electricity rate is flat. Time-of-use plans, demand charges, and tiered residential pricing can shift the real number ±30%.
- ○No cost is assigned to operator time: setup, model downloads, quantization tuning, driver issues, OS reinstalls, occasional debugging. Budget 10-40 hours of one-time setup at your hourly rate.
- ○Cooling and AC load from heat dumped into the room is not modeled. A 350W GPU in a hot climate can add 10-25% to total electricity through extra AC runtime in summer.
- ○Cloud price assumed stable. Provider rate cuts (which happen 1-2x/year) shorten break-even on cloud and lengthen it locally. Hardware prices also drop; today's $2,000 4090 may be $1,200 in 18 months.
- ○Networking, internet bandwidth, and the cost of running the rest of your workstation are excluded — only the incremental GPU electricity is counted.