LLM Cost Calculator
Estimate the cost of any LLM API call across 26 models from 7 providers. Pricing verified against official docs on 2026-05-27.
Cost per month, all models
| Model | Provider | $ / 1M in | $ / 1M out | Context | Monthly cost |
|---|---|---|---|---|---|
| GPT-5.5 Pro (context_window/max_output unverified on pricing page) | OpenAI | $30.00 | $180.00 | — | — |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1,050k | — |
| GPT-5.4 (inputs >272k tokens incur 2x input + 1.5x output for the full session) | OpenAI | $2.50 | $15.00 | 1,050k | — |
| GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | — | — |
| GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | — | — |
| Claude Opus 4.1 | Anthropic | $15.00 | $75.00 | 200k | — |
| Claude Opus 4.7 (1M context at standard pricing, no >200k surcharge; new tokenizer uses up to ~35% more tokens for the same English text) | Anthropic | $5.00 | $25.00 | 1,000k | — |
| Claude Opus 4.6 | Anthropic | $5.00 | $25.00 | 1,000k | — |
| Claude Opus 4.5 | Anthropic | $5.00 | $25.00 | 200k | — |
| Claude Sonnet 4.6 (1M context at standard pricing, no >200k surcharge) | Anthropic | $3.00 | $15.00 | 1,000k | — |
| Claude Sonnet 4.5 | Anthropic | $3.00 | $15.00 | 200k | — |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200k | — |
| Gemini 3.1 Pro (Preview) (tiered: prompts >200k tokens use 2x rates) | Google | $2.00 | $12.00 | — | — |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | — | — |
| Gemini 2.5 Pro (tiered: prompts >200k tokens use 2x rates) | Google | $1.25 | $10.00 | — | — |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | — | — |
| Gemini 3.1 Flash-Lite | Google | $0.25 | $1.50 | — | — |
| DeepSeek V4 Pro (75% promotional discount until 2026-05-31 15:59 UTC; post-promo: $1.74 in / $3.48 out per 1M) | DeepSeek | $0.43 | $0.87 | 1,000k | — |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | 1,000k | — |
| Grok 4.3 | xAI | $1.25 | $2.50 | 1,000k | — |
| Grok 4.20 Reasoning | xAI | $1.25 | $2.50 | 1,000k | — |
| Grok Build 0.1 | xAI | $1.00 | $2.00 | 256k | — |
| Llama 3.3 70B (Groq) | Groq (Llama) | $0.59 | $0.79 | 128k | — |
| Llama 4 Scout (Groq) | Groq (Llama) | $0.11 | $0.34 | 128k | — |
| Llama 3.1 8B Instant (Groq) | Groq (Llama) | $0.05 | $0.08 | 128k | — |
| Llama 3.3 70B (Together AI) | Together AI (Llama) | $0.88 | $0.88 | — | — |
How this calculator works
For each model, the estimate is (input tokens × requests × input price per 1M) + (output tokens × requests × output price per 1M), with token totals expressed in millions. When you set a cache hit rate, that fraction of the input tokens is billed at the model's cached-input price instead of the standard input price. Cached pricing is much lower for OpenAI, Anthropic, Google, and DeepSeek; it is not available for Groq, Together, or xAI.
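The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `monthly_cost` and its parameters are hypothetical, and the cached-input price used in the second call is a placeholder, since cached prices are not listed in the table.

```python
def monthly_cost(input_tokens, output_tokens, requests,
                 price_in, price_out,
                 cache_hit_rate=0.0, cached_price_in=None):
    """Estimate cost in dollars; prices are dollars per 1M tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    if cached_price_in is None:
        # Assumption: with no cached price, caching gives no discount.
        cached_price_in = price_in

    total_in = input_tokens * requests
    total_out = output_tokens * requests
    cached_in = total_in * cache_hit_rate      # billed at the cached price
    uncached_in = total_in - cached_in         # billed at the standard price

    dollars_times_1m = (uncached_in * price_in
                        + cached_in * cached_price_in
                        + total_out * price_out)
    return dollars_times_1m / 1_000_000


# Claude Sonnet 4.6 rates from the table: $3.00 in, $15.00 out.
# 2,000 input tokens and 500 output tokens per request, 10,000 requests.
print(f"{monthly_cost(2_000, 500, 10_000, 3.00, 15.00):.2f}")  # 135.00

# Same workload with a 50% cache hit rate and a placeholder cached price.
print(f"{monthly_cost(2_000, 500, 10_000, 3.00, 15.00, 0.5, 0.30):.2f}")  # 108.00
```

In the first call, 20M input tokens at $3.00 cost $60 and 5M output tokens at $15.00 cost $75, for $135 in total.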
Text pasted into the textarea is converted to a token count with a chars/4 estimate, which is fast and good enough to compare models. A v2 of this tool will run each provider's real tokenizer (OpenAI tiktoken, Anthropic tokenizer, Google Gemini tokenizer) for exact counts.
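As a rough sketch, the chars/4 heuristic looks like this in Python. The function name `estimate_tokens` is hypothetical, and rounding up is an assumption; the tool may round differently.

```python
import math


def estimate_tokens(text: str) -> int:
    """Approximate a token count as characters / 4, rounded up (assumed)."""
    return math.ceil(len(text) / 4)


print(estimate_tokens("Estimate the cost of any LLM API call."))  # 38 chars -> 10
```

The estimate ignores tokenizer differences between providers, so treat it as a comparison aid rather than a billing figure.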
What's NOT included in this calculation
- Batch API discounts (50% off for OpenAI, Anthropic, Google) — use them for non-realtime workloads.
- Anthropic cache write costs (1.25× for 5min cache, 2× for 1h cache) — only the cache read savings are modeled here.
- Gemini >200k tier surcharge (2× rates) on `gemini-2.5-pro` and `gemini-3.1-pro-preview`. This MVP applies the base rate; the v2 calculator will branch on prompt size.
- OpenAI GPT-5.4 session surcharge >272k tokens (2× input + 1.5× output for the entire session).
- Fast mode (Anthropic Opus 4.6/4.7): 6× multiplier when enabled.
- Data residency uplift (1.1× for region-locked endpoints).
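If you want to account for some of these by hand, the size-based surcharges and the batch discount from the list above could be applied to the per-1M prices roughly as follows. This is a sketch under assumptions: the function name `adjust_prices` is hypothetical, `gpt-5.4` is an assumed model identifier, and stacking the batch discount on top of a surcharge is a guess rather than documented provider behavior.

```python
# Model IDs named in the list above for the Gemini >200k tier.
GEMINI_TIERED = {"gemini-2.5-pro", "gemini-3.1-pro-preview"}


def adjust_prices(model_id, prompt_tokens, price_in, price_out, batch=False):
    """Return (input, output) prices per 1M tokens after adjustments."""
    in_mult, out_mult = 1.0, 1.0

    if model_id in GEMINI_TIERED and prompt_tokens > 200_000:
        # Gemini tiered pricing: 2x rates for prompts over 200k tokens.
        in_mult, out_mult = 2.0, 2.0
    elif model_id == "gpt-5.4" and prompt_tokens > 272_000:
        # GPT-5.4 session surcharge: 2x input, 1.5x output.
        in_mult, out_mult = 2.0, 1.5

    if batch:
        # Batch API discount of 50%; combining it with a surcharge
        # multiplicatively is an assumption.
        in_mult *= 0.5
        out_mult *= 0.5

    return price_in * in_mult, price_out * out_mult


print(adjust_prices("gemini-2.5-pro", 250_000, 1.25, 10.00))  # (2.5, 20.0)
print(adjust_prices("gpt-5.4", 300_000, 2.50, 15.00))         # (5.0, 22.5)
```

Cache write costs, fast mode, and data residency uplift are left out of this sketch.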
Pricing freshness
Last verified: 2026-05-27. Sources: the official provider docs page for each row. The DeepSeek V4 Pro row currently shows a 75% promotional price that ends 2026-05-31 15:59 UTC; after that, the rates return to $1.74 in / $3.48 out per 1M, roughly 4× the displayed values.
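One way to handle the DeepSeek V4 Pro promo in your own estimates is to pick the rate by date. The sketch below uses the promo and post-promo prices stated on this page; the function name is hypothetical, and treating the end timestamp as inclusive is an assumption.

```python
from datetime import datetime, timezone

PROMO_END = datetime(2026, 5, 31, 15, 59, tzinfo=timezone.utc)
PROMO_PRICES = (0.43, 0.87)       # $ per 1M tokens (input, output)
POST_PROMO_PRICES = (1.74, 3.48)  # $ per 1M tokens (input, output)


def deepseek_v4_pro_prices(now: datetime):
    """Return (input, output) prices per 1M tokens for a timezone-aware time."""
    return PROMO_PRICES if now <= PROMO_END else POST_PROMO_PRICES


print(deepseek_v4_pro_prices(datetime(2026, 5, 27, tzinfo=timezone.utc)))  # (0.43, 0.87)
print(deepseek_v4_pro_prices(datetime(2026, 6, 1, tzinfo=timezone.utc)))   # (1.74, 3.48)
```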