LLM Cost Calculator

Estimate the cost of any LLM API call across 26 models from 7 providers. Pricing verified against official docs on 2026-05-27.

10k = small SaaS; 1M = scale; 100M = enterprise

Cost per month, all models

| Model | Provider | $ / 1M in | $ / 1M out | Context | Notes |
|---|---|---|---|---|---|
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | | context_window/max_output unverified on pricing page |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1,050k | |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | 1,050k | Inputs >272k tokens incur 2x input + 1.5x output for the full session |
| GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | | |
| GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | | |
| Claude Opus 4.1 | Anthropic | $15.00 | $75.00 | 200k | |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | 1,000k | 1M context at standard pricing (no >200k surcharge). New tokenizer uses up to ~35% more tokens for the same English text. |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 1,000k | |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 200k | |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | 1,000k | 1M context at standard pricing (no >200k surcharge) |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200k | |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200k | |
| Gemini 3.1 Pro (Preview) | Google | $2.00 | $12.00 | | Tiered: prompts >200k tokens use 2x rates |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | | |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | | Tiered: prompts >200k tokens use 2x rates |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | | |
| Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 | | |
| DeepSeek V4 Pro | DeepSeek | $0.43 | $0.87 | 1,000k | 75% promotional discount until 2026-05-31 15:59 UTC; post-promo: $1.74 in / $3.48 out per 1M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1,000k | |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1,000k | |
| Grok 4.20 Reasoning | xAI | $1.25 | $2.50 | 1,000k | |
| Grok Build 0.1 | xAI | $1.00 | $2.00 | 256k | |
| Llama 3.3 70B (Groq) | Groq (Llama) | $0.59 | $0.79 | 128k | |
| Llama 4 Scout (Groq) | Groq (Llama) | $0.11 | $0.34 | 128k | |
| Llama 3.1 8B Instant (Groq) | Groq (Llama) | $0.05 | $0.08 | 128k | |
| Llama 3.3 70B (Together AI) | Together AI (Llama) | $0.88 | $0.88 | | |

How this calculator works

For each model, monthly cost = (input tokens × requests × input price per 1M + output tokens × requests × output price per 1M) ÷ 1,000,000, where token counts are per request. When you set a cache hit rate, that share of input tokens is billed at the model's cached-input price instead of the full input price. Cached pricing is much lower for OpenAI, Anthropic, Google, and DeepSeek; it is not available for Groq, Together, or xAI.
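The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `monthly_cost` and its parameters are hypothetical, and the cached-input price in the second call is a made-up placeholder, since cached rates are not listed in the table.

```python
def monthly_cost(input_tokens, output_tokens, requests,
                 price_in, price_out,
                 cache_hit_rate=0.0, cached_price_in=None):
    """Estimated monthly cost in USD. Prices are USD per 1M tokens;
    token counts are per request."""
    if cached_price_in is None:
        # No cached rate available (e.g. Groq, Together, xAI):
        # bill every input token at the full input price.
        cached_price_in = price_in

    total_in = input_tokens * requests
    total_out = output_tokens * requests

    cached_in = total_in * cache_hit_rate   # share served from cache
    uncached_in = total_in - cached_in      # share billed at full price

    return (uncached_in * price_in
            + cached_in * cached_price_in
            + total_out * price_out) / 1_000_000


# Claude Haiku 4.5 row: $1.00 in, $5.00 out per 1M tokens.
# 1,000 input + 500 output tokens per request, 10,000 requests per month.
no_cache = monthly_cost(1_000, 500, 10_000, price_in=1.00, price_out=5.00)

# Same workload with a 50% cache hit rate and a HYPOTHETICAL
# cached-input price of $0.10 per 1M (placeholder, not from the table).
with_cache = monthly_cost(1_000, 500, 10_000, price_in=1.00, price_out=5.00,
                          cache_hit_rate=0.5, cached_price_in=0.10)

print(f"no cache: ${no_cache:.2f}, with cache: ${with_cache:.2f}")
```

In the first call, 10M input tokens at $1.00 plus 5M output tokens at $5.00 come to $35.00 per month; caching only changes the input side of the bill.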

Text pasted into the textarea is converted to a token count with a chars/4 estimate, which is fast and good enough to compare models. A v2 of this tool will run each provider's real tokenizer (OpenAI tiktoken, Anthropic tokenizer, Google Gemini tokenizer) for exact counts.
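A rough sketch of the chars/4 heuristic follows. The function name `estimate_tokens` is hypothetical, and rounding up is an assumption; the page does not say how the calculator rounds.

```python
import math


def estimate_tokens(text: str) -> int:
    """Approximate token count as characters / 4.
    Rounding up is an assumption, not documented behavior."""
    return math.ceil(len(text) / 4)


sample = "Estimate the cost of any LLM API call."
print(estimate_tokens(sample))
```

Real tokenizers diverge from this estimate, especially for code and non-English text, so treat the result as a comparison aid rather than a billing figure.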

What's NOT included in this calculation

  • Batch API discounts (50% off for OpenAI, Anthropic, Google) — use them for non-realtime workloads.
  • Anthropic cache write costs (1.25× for 5min cache, 2× for 1h cache) — only the cache read savings are modeled here.
  • Gemini >200k tier surcharge (2× rates) on `gemini-2.5-pro` and `gemini-3.1-pro-preview`. This MVP applies the base rate; the v2 calculator will branch on prompt size.
  • OpenAI GPT-5.4 session surcharge >272k tokens (2× input + 1.5× output for the entire session).
  • Fast mode (Anthropic Opus 4.6/4.7): 6× multiplier when enabled.
  • Data residency uplift (1.1× for region-locked endpoints).
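To get a feel for how two of these omitted adjustments would change an estimate, here is a minimal per-request sketch for the Gemini tier surcharge and the batch discount. The function name and parameters are hypothetical. Applying the 2× tier multiplier to both input and output rates, and combining it multiplicatively with the 50% batch discount, are assumptions; check the provider's pricing page before relying on either.

```python
def gemini_request_cost(prompt_tokens, output_tokens, price_in, price_out,
                        tier_threshold=200_000, tier_multiplier=2.0,
                        batch=False):
    """Per-request cost in USD; prices are USD per 1M tokens."""
    # Assumption: prompts above the threshold pay the multiplier
    # on both the input and the output rate.
    multiplier = tier_multiplier if prompt_tokens > tier_threshold else 1.0
    cost = (prompt_tokens * price_in
            + output_tokens * price_out) * multiplier / 1_000_000
    if batch:
        # Assumption: the 50% batch discount stacks with the tier surcharge.
        cost *= 0.5
    return cost


# Gemini 2.5 Pro row: $1.25 in, $10.00 out per 1M tokens.
small = gemini_request_cost(100_000, 1_000, 1.25, 10.00)   # base tier
large = gemini_request_cost(300_000, 1_000, 1.25, 10.00)   # >200k tier
print(f"100k prompt: ${small:.3f}, 300k prompt: ${large:.3f}")
```

With these inputs the 100k-token prompt costs about $0.135 and the 300k-token prompt about $0.77, so tripling the prompt size raises the cost nearly sixfold once the threshold is crossed.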

Pricing freshness

Last verified: 2026-05-27. Sources: the official provider docs page for each row. The DeepSeek V4 Pro row currently shows a 75% promotional price that ends 2026-05-31 15:59 UTC; after that, the rates revert to $1.74 in / $3.48 out per 1M, roughly 4× the displayed rates.
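If you reuse these numbers in your own scripts, one way to handle the promo cutoff is to pick the rate by timestamp. This is a sketch with hypothetical names; treating the end timestamp itself as still within the promo is an assumption.

```python
from datetime import datetime, timezone

# Values taken from the DeepSeek V4 Pro row (USD per 1M tokens: in, out).
PROMO_END = datetime(2026, 5, 31, 15, 59, tzinfo=timezone.utc)
PROMO_RATES = (0.43, 0.87)
STANDARD_RATES = (1.74, 3.48)


def deepseek_v4_pro_rates(now: datetime) -> tuple[float, float]:
    """Return (input, output) price per 1M tokens at time `now`.
    `now` must be timezone-aware; comparing a naive datetime
    against PROMO_END raises TypeError."""
    # Assumption: the promo price applies up to and including PROMO_END.
    return PROMO_RATES if now <= PROMO_END else STANDARD_RATES


print(deepseek_v4_pro_rates(datetime(2026, 5, 27, tzinfo=timezone.utc)))
print(deepseek_v4_pro_rates(datetime(2026, 6, 1, tzinfo=timezone.utc)))
```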
