LLM Cost Calculator

Estimate the cost of any LLM API call across 26 models from 7 providers. Pricing verified against official docs on 2026-05-27.

Input tokens per request

Or paste text below to estimate from length.

Paste prompt text (optional)

Estimate: 0 tokens (chars / 4 heuristic)

Output tokens per request

Requests per month

10k = small SaaS; 1M = scale; 100M = enterprise

Input cached %

0% of input tokens served from cache (when supported)

Cost per month, all models

Model	Provider	Input $ / 1M tokens	Output $ / 1M tokens	Context (tokens)	Monthly cost	Notes
GPT-5.5 Pro	OpenAI	$30.00	$180.00	—	—	Context window and max output unverified on pricing page
GPT-5.5	OpenAI	$5.00	$30.00	1,050k	—	—
GPT-5.4	OpenAI	$2.50	$15.00	1,050k	—	Inputs >272k tokens incur 2x input + 1.5x output for the full session
GPT-5.4 Mini	OpenAI	$0.75	$4.50	—	—	—
GPT-5.4 Nano	OpenAI	$0.20	$1.25	—	—	—
Claude Opus 4.1	Anthropic	$15.00	$75.00	200k	—	—
Claude Opus 4.7	Anthropic	$5.00	$25.00	1,000k	—	1M context at standard pricing (no >200k surcharge). New tokenizer uses up to ~35% more tokens for the same English text.
Claude Opus 4.6	Anthropic	$5.00	$25.00	1,000k	—	—
Claude Opus 4.5	Anthropic	$5.00	$25.00	200k	—	—
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	1,000k	—	1M context at standard pricing (no >200k surcharge)
Claude Sonnet 4.5	Anthropic	$3.00	$15.00	200k	—	—
Claude Haiku 4.5	Anthropic	$1.00	$5.00	200k	—	—
Gemini 3.1 Pro (Preview)	Google	$2.00	$12.00	—	—	Tiered: prompts >200k tokens use 2x rates
Gemini 3.5 Flash	Google	$1.50	$9.00	—	—	—
Gemini 2.5 Pro	Google	$1.25	$10.00	—	—	Tiered: prompts >200k tokens use 2x rates
Gemini 2.5 Flash	Google	$0.30	$2.50	—	—	—
Gemini 3.1 Flash-Lite	Google	$0.25	$1.50	—	—	—
DeepSeek V4 Pro	DeepSeek	$0.43	$0.87	1,000k	—	75% promotional discount until 2026-05-31 15:59 UTC; post-promo: $1.74 in / $3.48 out per 1M
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	1,000k	—	—
Grok 4.3	xAI	$1.25	$2.50	1,000k	—	—
Grok 4.20 Reasoning	xAI	$1.25	$2.50	1,000k	—	—
Grok Build 0.1	xAI	$1.00	$2.00	256k	—	—
Llama 3.3 70B (Groq)	Groq (Llama)	$0.59	$0.79	128k	—	—
Llama 4 Scout (Groq)	Groq (Llama)	$0.11	$0.34	128k	—	—
Llama 3.1 8B Instant (Groq)	Groq (Llama)	$0.05	$0.08	128k	—	—
Llama 3.3 70B (Together AI)	Together AI (Llama)	$0.88	$0.88	—	—	—

How this calculator works

For each model, monthly cost = (input tokens × requests × input price per 1M + output tokens × requests × output price per 1M) ÷ 1,000,000. When you set a cache hit rate, that share of the input tokens is billed at the model's cached-input price instead of the standard input price. Cached pricing is much lower for OpenAI, Anthropic, Google, and DeepSeek; it is not available for Groq, Together AI, or xAI.
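The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `monthly_cost` and its parameters are hypothetical, and the cached-input price is passed in by the caller because the table above does not list cached rates.

```python
from typing import Optional


def monthly_cost(
    input_tokens: int,
    output_tokens: int,
    requests: int,
    input_price_per_m: float,
    output_price_per_m: float,
    cached_fraction: float = 0.0,
    cached_input_price_per_m: Optional[float] = None,
) -> float:
    """Estimate monthly spend in dollars for one model (illustrative sketch)."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")

    # Assumption: a provider without cached pricing bills every input token
    # at the standard input rate, so the cache setting has no effect.
    if cached_input_price_per_m is None:
        cached_fraction = 0.0
        cached_input_price_per_m = input_price_per_m

    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached

    # Dollars per request, still scaled by 1M because prices are per 1M tokens.
    per_request_scaled = (
        uncached * input_price_per_m
        + cached * cached_input_price_per_m
        + output_tokens * output_price_per_m
    )
    return per_request_scaled * requests / 1_000_000


# Claude Haiku 4.5 at $1.00 in / $5.00 out, 1,000 input and 500 output tokens,
# 10,000 requests per month, no caching.
haiku_cost = monthly_cost(1_000, 500, 10_000, 1.00, 5.00)
```

With those inputs the estimate comes to $35 per month: $10 for input and $25 for output. Dividing by 1,000,000 only once, at the end, keeps the intermediate values easy to check by hand.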

Text pasted into the textarea is converted to a token count with a chars/4 estimate, which is fast and good enough to compare models. A v2 of this tool will run each provider's real tokenizer (OpenAI tiktoken, Anthropic tokenizer, Google Gemini tokenizer) for exact counts.
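As a rough sketch of the chars/4 heuristic: the function name `estimate_tokens` is hypothetical, and rounding up is an assumption, since the page does not say how the calculator rounds.

```python
import math


def estimate_tokens(text: str) -> int:
    """Approximate a token count as characters divided by 4.

    Rounding up is an assumption made for this sketch; the heuristic
    itself is only a ballpark for English text.
    """
    return math.ceil(len(text) / 4)


prompt = "Summarize the following support ticket in two sentences."
estimated = estimate_tokens(prompt)
```

An empty textarea yields 0 tokens, matching the default state of the estimate field. Real tokenizers can differ noticeably from this figure, especially for code or non-English text.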

What's NOT included in this calculation

Batch API discounts (50% off for OpenAI, Anthropic, Google) — use them for non-realtime workloads.
Anthropic cache write costs (1.25× for 5min cache, 2× for 1h cache) — only the cache read savings are modeled here.
Gemini >200k tier surcharge (2× rates) on `gemini-2.5-pro` and `gemini-3.1-pro-preview`. This MVP applies the base rate; the v2 calculator will branch on prompt size.
OpenAI GPT-5.4 session surcharge >272k tokens (2× input + 1.5× output for the entire session).
Fast mode (Anthropic Opus 4.6/4.7): 6× multiplier when enabled.
Data residency uplift (1.1× for region-locked endpoints).
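If you want to account for the two long-context surcharges by hand, a sketch like the following may help. It is not part of the calculator: the `LONG_CONTEXT_RULES` table and the `long_context_rates` function are hypothetical names, the thresholds and multipliers are copied from the list above, and treating the GPT-5.4 session surcharge as a per-prompt check is a simplifying assumption.

```python
from typing import Dict, Tuple

# (threshold in tokens, input multiplier, output multiplier), taken from the
# list above. The dictionary keys are labels chosen for this sketch.
LONG_CONTEXT_RULES: Dict[str, Tuple[int, float, float]] = {
    "gemini-2.5-pro": (200_000, 2.0, 2.0),
    "gemini-3.1-pro-preview": (200_000, 2.0, 2.0),
    "gpt-5.4": (272_000, 2.0, 1.5),
}


def long_context_rates(
    model: str,
    prompt_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> Tuple[float, float]:
    """Return (input, output) prices per 1M tokens after any surcharge."""
    rule = LONG_CONTEXT_RULES.get(model)
    if rule is None:
        return input_price_per_m, output_price_per_m

    threshold, input_multiplier, output_multiplier = rule
    # The surcharge applies only when the prompt exceeds the threshold.
    if prompt_tokens > threshold:
        return (
            input_price_per_m * input_multiplier,
            output_price_per_m * output_multiplier,
        )
    return input_price_per_m, output_price_per_m


# Gemini 2.5 Pro with a 250k-token prompt: base $1.25 / $10.00 doubles.
gemini_rates = long_context_rates("gemini-2.5-pro", 250_000, 1.25, 10.00)
```

The adjusted rates can then be used in place of the base rates in the monthly-cost formula. Batch discounts, cache write costs, fast mode, and data residency uplift are left out of this sketch because the page does not say how they combine.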

Pricing freshness

Last verified: 2026-05-27. Sources: the official provider docs page for each row. The DeepSeek V4 Pro row currently shows a 75% promotional price that ends 2026-05-31 15:59 UTC; after that, use the post-promo rates of $1.74 in / $3.48 out per 1M tokens (roughly 4× the displayed rates, which are rounded).
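One way to handle the promo cutoff in your own scripts is to select the rate by timestamp. This is a sketch under stated assumptions: the constant and function names are hypothetical, the prices and end time come from the table above, and treating the end time as inclusive is a guess.

```python
from datetime import datetime, timezone
from typing import Tuple

# Promo end time and prices as listed on this page (per 1M tokens).
PROMO_END = datetime(2026, 5, 31, 15, 59, tzinfo=timezone.utc)
PROMO_RATES: Tuple[float, float] = (0.43, 0.87)     # (input, output)
STANDARD_RATES: Tuple[float, float] = (1.74, 3.48)  # (input, output)


def deepseek_v4_pro_rates(now: datetime) -> Tuple[float, float]:
    """Return (input, output) prices for DeepSeek V4 Pro at the given time.

    `now` must be timezone-aware; comparing a naive datetime against
    PROMO_END raises TypeError.
    """
    # Assumption: the promo price still applies at the exact end time.
    return PROMO_RATES if now <= PROMO_END else STANDARD_RATES


rates_today = deepseek_v4_pro_rates(datetime(2026, 5, 27, tzinfo=timezone.utc))
rates_june = deepseek_v4_pro_rates(datetime(2026, 6, 1, tzinfo=timezone.utc))
```

Using the listed post-promo prices directly avoids the small rounding error that comes from multiplying the displayed promo rates by 4.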
