LLM Pricing in February 2026: What Every Model Actually Costs
TL;DR: Cheapest production option is GPT-5 nano at $0.05/M input. Best value is GPT-5 mini at $0.25/M. DeepSeek V3.2 just dropped to $0.28/M input with cache hits at $0.028. Most expensive is GPT-5.2 pro at $21/$168. xAI massively cut Grok pricing since launch. Full table with 30+ models below.
Pricing moves fast enough that whatever you assumed last month is probably wrong. OpenAI added three new GPT-5 variants. xAI slashed Grok-4 from its launch pricing. DeepSeek released V3.2 at less than half what V3.1 cost. Google shipped Gemini 3. And a wave of Chinese labs (Kimi, Qwen, GLM) showed up on hosted APIs at prices that make the big labs look expensive.
Here's the full picture as of February 22, 2026.
The full pricing table
All prices per million tokens. Grouped by provider.
OpenAI
| Model | Input | Output | Notes |
|---|---|---|---|
| GPT-5.2 | $1.75 | $14.00 | Flagship. Best overall for coding + agents |
| GPT-5.2 pro | $21.00 | $168.00 | "Smartest and most precise." Overkill for most use cases |
| GPT-5.1 | $1.25 | $10.00 | Previous flagship, still strong |
| GPT-5 | $1.25 | $10.00 | Base GPT-5, same price as 5.1 |
| GPT-5 mini | $0.25 | $2.00 | Best price/performance for most tasks |
| GPT-5 nano | $0.05 | $0.40 | Dirt cheap. Classification, extraction, routing |
| GPT-4.1 | $2.00 | $8.00 | Still widely deployed. Costs more than GPT-5.1 on input |
| GPT-4.1 mini | $0.40 | $1.60 | Solid mid-tier |
| GPT-4.1 nano | $0.10 | $0.40 | Budget tier |
| o4-mini | $1.10 | $4.40 | Reasoning model |
| o3 | $2.00 | $8.00 | Reasoning, same price as GPT-4.1 |
| o3-pro | $20.00 | $80.00 | Premium reasoning |
| GPT-OSS-120B | $0.15 | $0.60 | Open-weight, via hosted APIs (Together.ai) |
| GPT-OSS-20B | $0.05 | $0.20 | Smallest open-weight. Cheapest OpenAI option on this list |
Anthropic
| Model | Input | Output | Notes |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | Top-tier reasoning + coding. 200K context (1M in beta) |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Workhorse. Training data through Jan 2026 |
| Claude Haiku 4.5 | $1.00 | $5.00 | Fast and cheap. Extended thinking support |
Anthropic's lineup hasn't changed since last month. Opus 4.6 at $5/$25 is still a solid deal compared to the old Opus 4 pricing ($15/$75). Legacy models (Sonnet 4.5, Opus 4.5, Opus 4.1, Sonnet 4) are still available at various price points.
Google
| Model | Input | Output | Notes |
|---|---|---|---|
| Gemini 3.1 Pro Preview | $2.00 | $12.00 | Latest. Image output support ($120/M tokens) |
| Gemini 3 Pro Preview | $2.00 | $12.00 | Same pricing as 3.1 |
| Gemini 3 Flash Preview | $0.50 | $3.00 | Audio input at $1/M |
| Gemini 2.5 Pro | $1.25 | $10.00 | Mature, stable. Computer use preview available |
| Gemini 2.5 Flash | $0.30 | $2.50 | Strong on long context |
Google doubled its lineup. Gemini 3 Pro sits at $2/$12, which puts it between GPT-5.1 and Claude Sonnet 4.6 on output cost. The bigger story is Gemini 3 Flash at $0.50/$3: pricier than Gemini 2.5 Flash but presumably better quality. Google also charges double for long context (over 200K tokens) on Pro models.
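To see what that long-context surcharge does to a bill, here's a rough estimator using Gemini 3 Pro's rates. Whether Google applies the doubled rate to the whole request or only to tokens past the threshold is an assumption here, so check the pricing page before relying on it:

```python
def gemini_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate Gemini 3 Pro cost in dollars for one request.

    Uses the listed rates ($2/M input, $12/M output) and assumes both
    rates double for any request whose prompt exceeds 200K tokens.
    The exact billing mechanics are an assumption for illustration.
    """
    input_rate, output_rate = 2.00, 12.00
    if input_tokens > 200_000:  # long-context surcharge kicks in
        input_rate, output_rate = input_rate * 2, output_rate * 2
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# A 300K-token prompt costs 4x what two 150K prompts would:
# gemini_pro_cost(300_000, 2_000) -> 1.248 vs 2 * gemini_pro_cost(150_000, 1_000) -> 0.624
```

Under that assumption, splitting one 300K-token job into two sub-200K calls halves the bill, which is worth knowing before you reach for the big window.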
xAI
| Model | Input | Output | Notes |
|---|---|---|---|
| Grok-4 | $3.00 | $15.00 | Flagship reasoning. 256K context |
| Grok-4.1 fast (reasoning) | $0.20 | $0.50 | 2M context. Absurdly cheap for the context window |
| Grok-4.1 fast (non-reasoning) | $0.20 | $0.50 | Same price, no chain-of-thought |
| Grok Code Fast 1 | $0.20 | $1.50 | Code specialist. 256K context |
| Grok-3 | $3.00 | $15.00 | Previous gen, same price as Grok-4 |
| Grok-3-mini | $0.30 | $0.50 | Budget reasoning |
xAI's pricing strategy flipped completely. Grok-4 flagship at $3/$15 puts it on par with Claude Sonnet 4.6. But the real play is Grok-4.1 fast at $0.20/$0.50 with a 2 million token context window. That's 10x the context of most competitors at a fraction of the price. If your workload is context-heavy, this might be the best deal on the market.
DeepSeek
| Model | Input | Output | Notes |
|---|---|---|---|
| DeepSeek V3.2 (non-thinking) | $0.28 | $0.42 | 128K context. Cache hits at $0.028 |
| DeepSeek V3.2 (thinking) | $0.28 | $0.42 | Same model, reasoning mode |
DeepSeek V3.2 replaced V3.1 and the pricing dropped hard. Input went from $0.60 to $0.28, output from $1.70 to $0.42. Cache hits at $0.028/M input make repeat queries almost free. Chat and reasoning modes are now the same model, toggled via the API. At this price, V3.2's output ($0.42/M) is within pennies of GPT-5 nano's ($0.40), and its cached input is cheaper than everything on this list except Gemma 3n.
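With cache pricing this lopsided, the blended cost depends almost entirely on your hit ratio. A rough per-request estimator using the V3.2 rates from the table (the hit ratio itself is whatever your workload actually produces):

```python
def deepseek_v32_cost(input_tokens: int, output_tokens: int,
                      cache_hit_ratio: float = 0.0) -> float:
    """Blended DeepSeek V3.2 cost per request, in dollars.

    Rates from the table: $0.28/M input on cache misses, $0.028/M on
    cache hits, $0.42/M output. cache_hit_ratio is the fraction of
    input tokens served from the prompt cache.
    """
    miss_cost = input_tokens * (1 - cache_hit_ratio) * 0.28
    hit_cost = input_tokens * cache_hit_ratio * 0.028
    output_cost = output_tokens * 0.42
    return (miss_cost + hit_cost + output_cost) / 1_000_000

# 50K-token prompt, 1K-token reply, 90% of the prompt cached:
# deepseek_v32_cost(50_000, 1_000, 0.9) -> ~$0.0031 per request
```

At a 90% hit ratio the input side of a 50K-token prompt costs about a quarter of what it would uncached, which is why repetitive, long-prompt workloads are where DeepSeek shines.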
Open-weight and newer labs (via Together.ai, Groq)
| Model | Input | Output | Notes |
|---|---|---|---|
| Llama 4 Maverick | $0.27 | $0.85 | Meta. Self-hostable |
| Kimi K2.5 | $0.50 | $2.80 | Moonshot AI. Strong on Chinese + English |
| Qwen3.5 397B | $0.60 | $3.60 | Alibaba. Massive MoE |
| Qwen3 Coder Next | $0.50 | $1.20 | Code specialist |
| GLM-5 | $1.00 | $3.20 | Zhipu AI |
| MiniMax M2.5 | $0.30 | $1.20 | Competitive pricing |
| DeepSeek R1 | $3.00 | $7.00 | Reasoning model (via Together) |
| Gemma 3n E4B | $0.02 | $0.04 | Google's open-weight. Cheapest option period |
The Chinese model ecosystem got crowded. Kimi K2.5, Qwen3.5, and GLM-5 are all available through Together.ai at prices that undercut the major US labs. Whether the quality matches for English-language workloads depends on the task, but for bilingual or Chinese-market applications, these are worth testing.
What changed since last month
OpenAI expanded aggressively. GPT-5.1, GPT-5, GPT-5 nano, GPT-5.2 pro, and o3/o3-pro all appeared. The GPT-5 family now covers every price point from $0.05 to $21 per million input tokens. GPT-4.1 still exists, but there's little reason to use it when GPT-5 and 5.1 cost $1.25/M input to its $2.00.
xAI stopped being the expensive outlier. Grok-4 at $3/$15 is competitive with Claude Sonnet. The Grok-4.1 fast tier at $0.20/$0.50 is one of the cheapest options from any major lab, and it comes with 2M token context. That's a significant shift from their earlier premium positioning.
DeepSeek undercut itself. V3.2 at $0.28/$0.42 is less than half what V3.1 cost. With cache hits at $0.028, high-volume users with repetitive prompts pay almost nothing.
Google went multi-generational. You can now run Gemini 2.5 Flash ($0.30/$2.50), Gemini 2.5 Pro ($1.25/$10), Gemini 3 Flash ($0.50/$3), or Gemini 3 Pro ($2/$12). Pick based on quality vs cost requirements.
Chinese labs are everywhere. Kimi, Qwen, GLM, MiniMax all have models on hosted APIs. Pricing is competitive. Quality varies by task and language.
Beyond the price tag
Output tokens still cost more than input across every provider, anywhere from 1.5x (DeepSeek V3.2) to 8x (GPT-5.2 and GPT-5 nano). If your app generates long responses, output cost dominates your bill. Trim your outputs.
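One way to see where the money goes: split a request's bill into input and output shares. The rates below are GPT-5.2's published prices; the token counts are made up for illustration.

```python
def cost_split(input_tokens: int, output_tokens: int,
               in_rate: float, out_rate: float) -> tuple[float, float, float]:
    """Return (input_cost, output_cost, output_share_of_bill) in dollars."""
    input_cost = input_tokens * in_rate / 1_000_000
    output_cost = output_tokens * out_rate / 1_000_000
    return input_cost, output_cost, output_cost / (input_cost + output_cost)

# GPT-5.2 at $1.75/$14: a reply half the length of the prompt
# still accounts for 80% of the bill.
_, _, output_share = cost_split(2_000, 1_000, 1.75, 14.00)
```

The asymmetry means a one-line "be concise" instruction can save more money than switching to a cheaper model.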
Caching is now standard. OpenAI, Anthropic, DeepSeek, xAI, and Google all offer prompt caching that cuts repeat-context costs by 50-90%. DeepSeek's cache-hit price of $0.028/M input (10x cheaper than its base rate) is the most aggressive discount.
Context windows got massive. Grok-4.1 fast at 2M tokens, Claude at 1M (beta), large windows across the Gemini line. If you were truncating context to save money, recalculate: per-token cost dropped enough that paying for longer context might be cheaper than the engineering effort to work around it.
Reasoning models are worth the premium for the right tasks. o3 at $2/$8, o4-mini at $1.10/$4.40, and Grok-4 at $3/$15 all perform reasoning before responding. For multi-step problems, they often get it right in one shot where a cheaper model needs 3-4 retries.
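The retry math is easy to sketch. Assume a task takes 3K input and 2K output tokens per attempt (made-up numbers), the cheap model needs four tries, and the reasoner needs one. Real reasoning models also bill their thinking tokens as output, which pushes their per-attempt cost up; this sketch ignores that.

```python
def expected_cost(in_tok: int, out_tok: int,
                  in_rate: float, out_rate: float, attempts: int = 1) -> float:
    """Expected dollars for a task retried `attempts` times at the given rates."""
    return attempts * (in_tok * in_rate + out_tok * out_rate) / 1_000_000

mini_4_tries = expected_cost(3_000, 2_000, 0.25, 2.00, attempts=4)   # GPT-5 mini: $0.019
o4_mini_once = expected_cost(3_000, 2_000, 1.10, 4.40, attempts=1)   # o4-mini: $0.0121
```

Under these assumptions the reasoner comes out cheaper despite a 4x higher sticker price, and that's before counting the latency and engineering cost of the retry loop itself.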
Who should use what
High-volume production (chatbots, classification, extraction): GPT-5 nano ($0.05/$0.40) or DeepSeek V3.2 ($0.28/$0.42). Both handle 80%+ of tasks fine. DeepSeek's cache hits make it essentially free for repetitive workloads.
Code generation: Claude Sonnet 4.6, GPT-5.2, or Grok Code Fast 1 ($0.20/$1.50). Sonnet handles complex instructions well. Grok Code is surprisingly good for the price.
Long-context workloads: Grok-4.1 fast. 2M tokens at $0.20/$0.50 is unbeatable. No one else comes close on context-per-dollar.
Research and analysis: Claude Opus 4.6 if budget allows. Gemini 3 Pro or GPT-5.2 if not.
Reasoning-heavy tasks: o3 ($2/$8) or o4-mini ($1.10/$4.40) for chain-of-thought. o3-pro ($20/$80) for hard problems where accuracy matters more than cost.
Cost-sensitive startups: Gemma 3n ($0.02/$0.04) or GPT-OSS-20B ($0.05/$0.20) via hosted APIs, with self-hosting as an option once volume justifies it. Get to market first, pick the right model later.
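The recommendations above collapse into a simple routing table. The workload keys, the 200K threshold, and the model identifiers here are illustrative assumptions, not real API model names; swap in your provider's actual identifiers.

```python
# Hypothetical routing table built from the picks above: (name, $/M in, $/M out).
ROUTES = {
    "classification": ("gpt-5-nano", 0.05, 0.40),
    "code":           ("grok-code-fast-1", 0.20, 1.50),
    "long_context":   ("grok-4.1-fast", 0.20, 0.50),
    "reasoning":      ("o4-mini", 1.10, 4.40),
    "research":       ("claude-opus-4.6", 5.00, 25.00),
}

def pick_model(workload: str, context_tokens: int = 0) -> tuple[str, float, float]:
    """Route to the long-context tier when the prompt won't fit elsewhere,
    otherwise route by workload type, defaulting to the cheapest tier."""
    if context_tokens > 200_000:
        return ROUTES["long_context"]
    return ROUTES.get(workload, ROUTES["classification"])
```

Even a lookup this crude captures the main insight of the table: the right model is a function of the workload, not a single choice you make once.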
What's next
The trend from last year held: equivalent quality keeps getting cheaper, often by an order of magnitude or more. GPT-5 nano gives you GPT-4-class performance at $0.05/M input. A year ago that cost $2.50, a 50x drop.
The bigger shift is competition. OpenAI, Anthropic, and Google used to set the pace. Now xAI, DeepSeek, and a half-dozen Chinese labs are forcing prices down faster. Grok-4.1 fast at $0.20/$0.50 with 2M context would've been unthinkable six months ago.
By Q4 2026, expect GPT-5 mini-equivalent quality below $0.05/M input. Self-hosting economics are getting tighter as hosted APIs race to the bottom.
We'll keep this comparison updated as pricing changes. Subscribe to get updates.
Sources: OpenAI pricing, Anthropic models, Google Vertex AI pricing, xAI models, DeepSeek pricing, Together.ai. All checked February 22, 2026.
This analysis is part of Kael Research's ongoing coverage of AI market economics. We track pricing, adoption, and competition across the AI industry. See our full research briefs for deeper analysis on specific markets.
Kael Tiwari
AI market intelligence for investors and founders