::calculator · Per-call CO2 estimate by model size and datacenter region
AI Carbon Calculator
::inputs
Active parameters per forward pass. For MoE models, use the active count (~39B for Mixtral 8x22B), not total.
Total monthly API requests across your application.
Average grid carbon intensity for the region serving your inference.
Watt-hours per inference call per billion active parameters. 0.5 is a rough 2026 industry estimate; use a lower value for quantized models and a higher one for long-context workloads.
::result
Total energy (monthly)
Total CO2 emissions (monthly)
CO2 per single inference call
Equivalent car miles driven
::how this calculates
Total energy is the product of model size (in billions of active parameters), the energy intensity factor (default 0.5 Wh per call per billion params), and monthly call volume. That gives total watt-hours per month. We then convert to kilowatt-hours and multiply by the region's grid carbon intensity (g CO2 per kWh) to get monthly emissions in grams. The car-mile equivalent uses the EPA 2024 average of 400 g CO2 per mile driven (passenger vehicle, US fleet average).
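The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the names `estimate_emissions` and `Estimate` are hypothetical, and the grid intensity of 380 g CO2/kWh in the usage lines is an assumed value chosen for illustration. The 0.5 Wh default and the 400 g/mile factor come from the text.

```python
from dataclasses import dataclass

GRAMS_CO2_PER_MILE = 400.0  # EPA passenger-vehicle average quoted in the text


@dataclass
class Estimate:
    energy_kwh: float   # total monthly energy, kWh
    emissions_g: float  # total monthly emissions, grams CO2
    per_call_g: float   # grams CO2 per single inference call
    car_miles: float    # equivalent passenger-car miles


def estimate_emissions(active_params_b, calls_per_month, grid_g_per_kwh,
                       wh_per_call_per_b=0.5):
    """Hypothetical helper mirroring the steps in the text."""
    # model size x energy intensity x call volume -> watt-hours per month
    wh = active_params_b * wh_per_call_per_b * calls_per_month
    kwh = wh / 1000.0
    # kWh x grid carbon intensity -> grams CO2 per month
    grams = kwh * grid_g_per_kwh
    per_call = grams / calls_per_month if calls_per_month else 0.0
    return Estimate(kwh, grams, per_call, grams / GRAMS_CO2_PER_MILE)


# Illustrative inputs: 200B active params, 50K calls/month, assumed 380 g/kWh grid
r = estimate_emissions(200, 50_000, 380)
print(f"{r.energy_kwh:.0f} kWh, {r.emissions_g / 1000:.0f} kg CO2, "
      f"{r.per_call_g:.1f} g/call, {r.car_miles:.0f} miles")
```

Returning all four outputs from one function keeps the unit conversions (Wh to kWh, grams to miles) in a single place, which makes them easier to audit.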
::worked examples
Small startup on GPT-4o, US East
A 200B-param model (rough GPT-4o estimate) at 50K calls/month on us-east uses ~5,000 kWh/month and emits ~1,900 kg CO2, equivalent to ~4,750 miles of driving. That is roughly five to six average US households' monthly electricity use, at about 900 kWh per household per month.
Claude 3.5 Sonnet workload, EU West
A model with 175B active params (estimated for Claude 3.5 Sonnet) at 100K calls/month routed to eu-west (France-Nordic grid) uses ~8,750 kWh/month and emits ~1,925 kg CO2. That is roughly 40% less than the same workload would emit on us-east, and about the same as example one despite double the call volume.
Mixtral 8x22B MoE, US West
Mixtral 8x22B activates only ~39B of its ~141B total params per call. At 200K calls/month on us-west's clean grid, it uses ~3,900 kWh/month and emits ~1,092 kg CO2, showing how MoE architectures and regional choice compound.
Heavy enterprise on Gemini 1.5 Pro, Asia Pacific
A worst-case scenario: a 540B-param model at 500K calls/month on asia-pacific's coal-heavy grid. It uses ~135,000 kWh/month and emits ~72,900 kg CO2/month, equivalent to ~182,250 miles of driving, or about what 16 average US passenger vehicles emit in a full year.
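The four examples can be checked with a short script. This is a sketch under one stated assumption: the grid intensities below (380, 220, 280, and 540 g CO2/kWh) are back-calculated from the totals quoted in the examples, not taken from a published table, and the name `monthly_kg` is hypothetical.

```python
# (active params in billions, calls per month, assumed grid g CO2/kWh)
# Intensities are back-calculated from the example totals, not published data.
EXAMPLES = {
    "us-east": (200, 50_000, 380),
    "eu-west": (175, 100_000, 220),
    "us-west": (39, 200_000, 280),
    "asia-pacific": (540, 500_000, 540),
}


def monthly_kg(active_b, calls, g_per_kwh, wh_per_call_per_b=0.5):
    """Monthly emissions in kg CO2 for one workload."""
    kwh = active_b * wh_per_call_per_b * calls / 1000  # Wh -> kWh
    return kwh * g_per_kwh / 1000                      # g -> kg


for region, (active_b, calls, intensity) in EXAMPLES.items():
    kg = monthly_kg(active_b, calls, intensity)
    miles = kg * 1000 / 400  # 400 g CO2 per mile, as in the text
    print(f"{region:>12}: {kg:>9,.0f} kg CO2/month, ~{miles:,.0f} car miles")
```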
::what this does NOT capture
- Energy intensity of 0.5 Wh per call per billion active parameters is a rough industry estimate based on Hugging Face energy benchmarks and Patterson et al. (Google, 2021–2024). Your actual deployment may vary 2–10x depending on batch size, context length, quantization, and GPU generation.
- We use ACTIVE parameters, not total. For Mixture-of-Experts models (Mixtral, DeepSeek V3, GPT-4-class), the active count is what burns energy per call, not the total parameter count.
- Grid carbon intensities are 2026 annual averages and do NOT reflect time-of-day variation. A call at 2am on us-west may be 50% cleaner than at 6pm peak; some providers (Google, Microsoft) route to clean grids in real time.
- Training emissions are excluded. A frontier model's training run may emit thousands of tonnes of CO2 once, then amortize across billions of inferences; your per-call number does not include this.
- PUE (power usage effectiveness) is implicitly baked into the 0.5 Wh figure at ~1.2x overhead. Hyperscaler datacenters in 2026 run PUE 1.08–1.15; older colos run 1.4+.
- Network transit, client-side compute, and embedded GPU manufacturing carbon are excluded. Full lifecycle analysis (LCA) would add 5–20% on top of operational emissions.
- Car-mile equivalents use EPA 2024 US passenger vehicle average of 400 g CO2 per mile. EU fleet average is lower (~280 g/mi equivalent); EVs and freight differ substantially.
- Price ($/M tokens) does NOT correlate linearly with energy: a $3/M Claude call may burn similar joules to a $0.25/M Haiku call if the model size is identical. Use this calculator for energy/carbon, not cost.
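Two of the caveats above can be turned into simple adjustments. The sketch below is illustrative only: the function names are hypothetical, the PUE rescaling assumes facility energy equals IT energy times PUE (the standard definition), and the training figures in the usage lines are placeholder inputs, not measurements of any real model.

```python
BASELINE_WH = 0.5   # Wh per call per billion active params, from the text
BASELINE_PUE = 1.2  # overhead the text says is baked into the 0.5 Wh figure


def rescale_for_pue(target_pue, baseline_wh=BASELINE_WH,
                    baseline_pue=BASELINE_PUE):
    """Strip the baked-in PUE, then apply the facility's own PUE."""
    it_only_wh = baseline_wh / baseline_pue
    return it_only_wh * target_pue


def training_g_per_call(training_tonnes_co2, lifetime_calls):
    """Amortize a one-off training footprint over all inference calls."""
    return training_tonnes_co2 * 1_000_000 / lifetime_calls  # tonnes -> grams


# A hyperscaler at PUE 1.10 versus an older colo at PUE 1.40
print(round(rescale_for_pue(1.10), 3))  # energy intensity to enter, Wh
print(round(rescale_for_pue(1.40), 3))

# Placeholder inputs: 1,000 tonnes of training CO2 over 1 billion calls
print(training_g_per_call(1_000, 1_000_000_000))  # grams CO2 per call
```

The rescaled figure can be entered as the energy intensity input; the amortized training share would be added on top of the per-call result rather than folded into it.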