
::calculator · Two prompts walk into a bar. One of them is 400 characters longer. Find out what that costs.

Prompt Diff Estimator

Every prompt you ship has a length, and every length has a price. Most teams ignore this until the monthly invoice arrives, then scramble to figure out which system prompt grew, which retrieval window expanded, which "small" preamble tweak added forty cents per call. This calculator does the boring arithmetic so you can stop guessing.

Paste in character counts for two prompt variants (the current production prompt and the rewrite you're considering, version 1 and version 2 of a system message, or whatever A/B pair is in front of you), set your per-million input token price, and tell it how many calls per day this prompt actually runs. It returns the token delta, the daily cost delta, and the annual delta projected at today's call volume.

The conversion is the standard rule of thumb: roughly 4 characters per token for English text. That ratio is not law. Code, JSON, non-Latin scripts, and tightly-tokenized strings drift outside it. Treat the output as a first-order estimate, not a finance-grade forecast. If you need exactness, run the actual tokenizer for your model and replace the 4-characters-per-token assumption with the measured count.

Prices reflect the June 2026 frontier-model market. Claude 3.5 Sonnet sits at $3 per million input tokens. GPT-4o is $2.50. Gemini 1.5 Pro ranges $1.25 to $5 depending on context length. The default of $3 reflects a typical rate for a working production model in mid-2026; override it with your real number.

This is an input-cost calculator only. Output tokens, cache hits, batch discounts, fine-tune surcharges, and committed-use credits are all out of scope. The annual figure assumes today's call volume holds for 365 days, which it almost certainly won't, since usage tends to grow as the product grows. Use the delta as a relative signal, not an absolute budget line.

Lab-grade rule: if the daily delta is under the price of a coffee, ship the longer prompt if it's measurably better. If the annual delta would buy a car, the rewrite needs to earn it.

::inputs

Prompt A length (chars)

Character count of the first prompt variant (e.g. current production).

Prompt B length (chars)

Character count of the second prompt variant (e.g. proposed rewrite).

Price per 1M input tokens ($)

June 2026: Claude 3.5 Sonnet $3, GPT-4o $2.50, Gemini 1.5 Pro $1.25-$5.

Calls per day (calls/day)

How many times per day this prompt actually runs in production.

::result

Token delta per call (B minus A)

100 tokens

Daily cost delta

$0.03

Annual cost delta

$10.95

::how this calculates

Each prompt is converted to tokens using the 4-characters-per-token rule of thumb (tokens = chars / 4). The token delta is simply tokensB minus tokensA. Daily cost for each variant is (tokens × calls per day ÷ 1,000,000) × price per million. The daily delta is the difference between the two daily costs. The annual delta multiplies the daily delta by 365 to project full-year exposure at today's call volume.
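The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `prompt_diff` and the constant `CHARS_PER_TOKEN` are hypothetical names chosen for this example, and the 4-characters-per-token ratio is the same rule of thumb, not a real tokenizer.

```python
CHARS_PER_TOKEN = 4  # rule-of-thumb ratio for English text, not a tokenizer


def prompt_diff(chars_a, chars_b, price_per_m, calls_per_day):
    """Return (token_delta, daily_delta, annual_delta) for prompt B minus prompt A."""
    tokens_a = chars_a / CHARS_PER_TOKEN
    tokens_b = chars_b / CHARS_PER_TOKEN
    token_delta = tokens_b - tokens_a

    # Daily cost per variant: (tokens x calls per day / 1,000,000) x price per million.
    daily_a = tokens_a * calls_per_day / 1_000_000 * price_per_m
    daily_b = tokens_b * calls_per_day / 1_000_000 * price_per_m
    daily_delta = daily_b - daily_a

    # Annual projection holds today's call volume constant for 365 days.
    annual_delta = daily_delta * 365
    return token_delta, daily_delta, annual_delta


# Inputs taken from the page's default values and its production-scale example.
small = prompt_diff(2000, 2400, 3, 100)
large = prompt_diff(4000, 12000, 3, 10000)
print(f"{small[0]:.0f} tokens, ${small[1]:.2f}/day, ${small[2]:.2f}/year")
print(f"{large[0]:.0f} tokens, ${large[1]:.2f}/day, ${large[2]:.2f}/year")
```

With the default inputs this reproduces the result panel: 100 tokens, $0.03 a day, $10.95 a year. A negative delta simply means prompt B is the cheaper variant.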

::worked examples

Small system-prompt rewrite, low volume

charsA: 2000 · charsB: 2400 · pricePerM: 3 · callsPerDay: 100

Adding 400 characters (~100 tokens) to a system prompt running 100 times a day at Claude 3.5 Sonnet pricing costs about three cents a day, or roughly eleven dollars a year. Ship the rewrite if it's any good.

Retrieval window expansion at production scale

charsA: 4000 · charsB: 12000 · pricePerM: 3 · callsPerDay: 10000

Bolting an 8,000-character retrieval block onto a prompt firing 10,000 times a day adds about 2,000 tokens per call. That's $60/day, or roughly $21,900/year. Cache it, compress it, or earn it.

Switching to a cheaper model

charsA: 3000 · charsB: 3000 · pricePerM: 1.25 · callsPerDay: 5000

Same prompt length, priced at Gemini 1.5 Pro's lower input tier. Because A and B are identical, the delta is zero at any price. To see the model-swap savings, compare the absolute daily input cost at $1.25/M against the same prompt and volume at $3/M.
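A model swap on an unchanged prompt is a difference of absolute daily costs rather than a length delta. The sketch below works that out for this example's inputs. The helper `daily_input_cost` is a hypothetical name for illustration, and the code assumes both models tokenize the prompt at the same 4-characters-per-token ratio, which real tokenizers will not match exactly.

```python
def daily_input_cost(chars, price_per_m, calls_per_day, chars_per_token=4):
    """Estimated daily input cost in dollars, using the chars/4 rule of thumb."""
    tokens = chars / chars_per_token
    return tokens * calls_per_day / 1_000_000 * price_per_m


# Same 3,000-character prompt and 5,000 calls/day, priced at two input rates.
cost_at_3 = daily_input_cost(3000, 3.00, 5000)
cost_at_1_25 = daily_input_cost(3000, 1.25, 5000)

daily_savings = cost_at_3 - cost_at_1_25
annual_savings = daily_savings * 365  # assumes constant call volume

print(f"${cost_at_3:.2f}/day vs ${cost_at_1_25:.2f}/day")
print(f"saves ${daily_savings:.2f}/day, ${annual_savings:.2f}/year")
```

Under these assumptions the swap saves about $6.56 a day, or roughly $2,395 a year, on input tokens alone.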

::what this does NOT capture

○ The 4-characters-per-token ratio is a rough English-language average. Code, JSON, non-Latin scripts, and dense identifiers often produce more tokens per character, so the estimate can run low for them; run your actual tokenizer if precision matters.
○ Input tokens only. Output tokens, cache reads, batch-API discounts, and fine-tune surcharges are not included; output tokens at $10-$15/M can dwarf prompt cost.
○ Annual projection holds calls-per-day constant for 365 days. Real usage tends to grow with the product, so treat the annual delta as a floor.
○ Prices reflect the June 2026 market and do not account for negotiated enterprise rates, committed-use credits, or volume discounts.
○ No prompt caching benefit modeled. Anthropic bills cache reads at roughly 10% of the normal input price, and OpenAI's cached-input discount varies by model (50% for GPT-4o); if your prompt is cacheable, apply the discounted rate to the cached share of the delta.
○ Assumes both prompts always fire in full on every call. Conditional logic, early termination, and tool-use branching can reduce real-world exposure.
○ No retry / failure overhead. Production systems retry 0.5-5% of calls; multiply calls per day by (1 + retry rate) if it matters for your accounting.
○ System prompts and user prompts are billed identically here. If you're comparing across different message roles in a structured chat, the rate is the same but the cache behavior can differ.
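Two of the caveats above, caching and retries, can be folded back into the estimate as simple multipliers. The sketch below is a hedged illustration and not part of the calculator. The function name and the parameters `cached_fraction`, `cache_price_ratio`, and `retry_rate` are hypothetical, and the model is deliberately simplified: it ignores cache-write surcharges and assumes every retry resends the full prompt.

```python
def adjusted_daily_delta(chars_a, chars_b, price_per_m, calls_per_day,
                         cached_fraction=0.0, cache_price_ratio=0.1,
                         retry_rate=0.0, chars_per_token=4):
    """Daily cost delta (B minus A) with optional cache and retry adjustments.

    cached_fraction   -- assumed share of the delta tokens served from cache (0..1)
    cache_price_ratio -- assumed cached price as a fraction of the normal input price
    retry_rate        -- assumed share of calls that are retried once (e.g. 0.02)
    """
    token_delta = (chars_b - chars_a) / chars_per_token

    # Blend full-price and cached-price tokens into one effective price.
    effective_price = price_per_m * (
        (1 - cached_fraction) + cached_fraction * cache_price_ratio
    )

    # Each retried call is billed as one extra full call.
    effective_calls = calls_per_day * (1 + retry_rate)

    return token_delta * effective_calls / 1_000_000 * effective_price


# Retrieval-window example: unadjusted, then with 80% cache hits and 2% retries.
baseline = adjusted_daily_delta(4000, 12000, 3, 10000)
adjusted = adjusted_daily_delta(4000, 12000, 3, 10000,
                                cached_fraction=0.8, retry_rate=0.02)
print(f"baseline ${baseline:.2f}/day, adjusted ${adjusted:.2f}/day")
```

With the defaults left at zero the function reduces to the plain calculator formula, so the baseline matches the $60/day figure from the retrieval example. The 80% hit rate and 2% retry rate are made-up inputs; substitute measured values from your own logs.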
