
LLM Pricing in February 2026: What Every Model Actually Costs

Kael Tiwari · 9 min read · Updated monthly

TL;DR: Cheapest production option is GPT-5 nano at $0.05/M input. Best value is GPT-5 mini at $0.25/M. DeepSeek V3.2 just dropped to $0.28/M input with cache hits at $0.028. Most expensive is GPT-5.2 pro at $21/$168. xAI massively cut Grok pricing since launch. Full table with 30+ models below.


Pricing moves fast enough that whatever you assumed last month is probably wrong. OpenAI added three new GPT-5 variants. xAI slashed Grok-4 from its launch pricing. DeepSeek released V3.2 at less than half what V3.1 cost. Google shipped Gemini 3. And a wave of Chinese labs (Kimi, Qwen, GLM) showed up on hosted APIs at prices that make the big labs look expensive.

Here's the full picture as of February 22, 2026.

The full pricing table

All prices per million tokens. Sorted by provider.
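Per-million pricing converts to per-request cost with one line of arithmetic. A minimal helper (the token counts below are illustrative):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Dollar cost of one request, given $/M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# A 2,000-token prompt with an 800-token reply on GPT-5 mini ($0.25/$2.00):
print(f"${request_cost(2_000, 800, 0.25, 2.00):.6f}")  # $0.002100
```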

OpenAI

| Model | Input | Output | Notes |
|---|---|---|---|
| GPT-5.2 | $1.75 | $14.00 | Flagship. Best overall for coding + agents |
| GPT-5.2 pro | $21.00 | $168.00 | "Smartest and most precise." Overkill for most use cases |
| GPT-5.1 | $1.25 | $10.00 | Previous flagship, still strong |
| GPT-5 | $1.25 | $10.00 | Base GPT-5, same price as 5.1 |
| GPT-5 mini | $0.25 | $2.00 | Best price/performance for most tasks |
| GPT-5 nano | $0.05 | $0.40 | Dirt cheap. Classification, extraction, routing |
| GPT-4.1 | $2.00 | $8.00 | Still widely deployed. Costs more than GPT-5.1 on input |
| GPT-4.1 mini | $0.40 | $1.60 | Solid mid-tier |
| GPT-4.1 nano | $0.10 | $0.40 | Budget tier |
| o4-mini | $1.10 | $4.40 | Reasoning model |
| o3 | $2.00 | $8.00 | Reasoning, same price as GPT-4.1 |
| o3-pro | $20.00 | $80.00 | Premium reasoning |
| GPT-OSS-120B | $0.15 | $0.60 | Open-weight, via hosted APIs (Together.ai) |
| GPT-OSS-20B | $0.05 | $0.20 | Smallest open-weight. Cheapest model on this list |

Anthropic

| Model | Input | Output | Notes |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | Top-tier reasoning + coding. 200K context (1M in beta) |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Workhorse. Training data through Jan 2026 |
| Claude Haiku 4.5 | $1.00 | $5.00 | Fast and cheap. Extended thinking support |

Anthropic's lineup hasn't changed since last month. Opus 4.6 at $5/$25 is still a solid deal compared to the old Opus 4 pricing ($15/$75). Legacy models (Sonnet 4.5, Opus 4.5, Opus 4.1, Sonnet 4) are still available at various price points.

Google

| Model | Input | Output | Notes |
|---|---|---|---|
| Gemini 3.1 Pro Preview | $2.00 | $12.00 | Latest. Image output support ($120/M tokens) |
| Gemini 3 Pro Preview | $2.00 | $12.00 | Same pricing as 3.1 |
| Gemini 3 Flash Preview | $0.50 | $3.00 | Audio input at $1/M |
| Gemini 2.5 Pro | $1.25 | $10.00 | Mature, stable. Computer use preview available |
| Gemini 2.5 Flash | $0.30 | $2.50 | Strong on long context |

Google doubled its lineup. Gemini 3 Pro sits at $2/$12, which puts its output cost right between GPT-5.1 ($10) and Claude Sonnet 4.6 ($15). The bigger story is Gemini 3 Flash at $0.50/$3, which is pricier than Gemini 2.5 Flash but presumably better quality. Note that Google also charges double for long context (>200K tokens) on Pro models.

xAI

| Model | Input | Output | Notes |
|---|---|---|---|
| Grok-4 | $3.00 | $15.00 | Flagship reasoning. 256K context |
| Grok-4.1 fast (reasoning) | $0.20 | $0.50 | 2M context. Absurdly cheap for the context window |
| Grok-4.1 fast (non-reasoning) | $0.20 | $0.50 | Same price, no chain-of-thought |
| Grok Code Fast 1 | $0.20 | $1.50 | Code specialist. 256K context |
| Grok-3 | $3.00 | $15.00 | Previous gen, same price as Grok-4 |
| Grok-3-mini | $0.30 | $0.50 | Budget reasoning |

xAI's pricing strategy flipped completely. Grok-4 flagship at $3/$15 puts it on par with Claude Sonnet 4.6. But the real play is Grok-4.1 fast at $0.20/$0.50 with a 2 million token context window. That's 10x the context of most competitors at a fraction of the price. If your workload is context-heavy, this might be the best deal on the market.
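To put context-per-dollar in concrete terms, here's the arithmetic using input prices from the table above (the request sizes are hypothetical):

```python
INPUT_PRICE = {  # $/M input tokens, from the xAI table
    "grok-4.1-fast": 0.20,
    "grok-4": 3.00,
}

def input_cost(tokens: int, model: str) -> float:
    """Dollar cost of the input side of one request."""
    return tokens * INPUT_PRICE[model] / 1_000_000

# Filling the entire 2M-token window on Grok-4.1 fast...
full_window = input_cost(2_000_000, "grok-4.1-fast")  # ~$0.40
# ...costs less than a 200K-token prompt on the Grok-4 flagship
small_prompt = input_cost(200_000, "grok-4")          # ~$0.60
print(full_window, small_prompt)
```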

DeepSeek

| Model | Input | Output | Notes |
|---|---|---|---|
| DeepSeek V3.2 (non-thinking) | $0.28 | $0.42 | 128K context. Cache hits at $0.028 |
| DeepSeek V3.2 (thinking) | $0.28 | $0.42 | Same model, reasoning mode |

DeepSeek V3.2 replaced V3.1 and the pricing dropped hard: input went from $0.60 to $0.28, output from $1.70 to $0.42. Cache hits at $0.028/M input make repeat queries almost free. Chat and reasoning modes are now the same model, toggled via API. At this price, DeepSeek's output rate is within two cents of GPT-5 nano's ($0.42 vs $0.40), at much higher claimed quality.
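The practical input price on DeepSeek depends mostly on your cache hit rate. A rough blended-rate model (the hit rates below are assumptions; measure your own workload):

```python
def effective_input_price(base: float, cached: float, hit_rate: float) -> float:
    """Blended $/M input price for a given prompt-cache hit rate (0..1)."""
    return hit_rate * cached + (1.0 - hit_rate) * base

# DeepSeek V3.2: $0.28 base input, $0.028 on cache hits
for hit_rate in (0.0, 0.5, 0.9):
    price = effective_input_price(0.28, 0.028, hit_rate)
    print(f"{hit_rate:.0%} hits -> ${price:.4f}/M")
```

At a 90% hit rate the blended rate lands near $0.05/M, i.e. GPT-5 nano territory on input.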

Open-weight and newer labs (via Together.ai, Groq)

| Model | Input | Output | Notes |
|---|---|---|---|
| Llama 4 Maverick | $0.27 | $0.85 | Meta. Self-hostable |
| Kimi K2.5 | $0.50 | $2.80 | Moonshot AI. Strong on Chinese + English |
| Qwen3.5 397B | $0.60 | $3.60 | Alibaba. Massive MoE |
| Qwen3 Coder Next | $0.50 | $1.20 | Code specialist |
| GLM-5 | $1.00 | $3.20 | Zhipu AI |
| MiniMax M2.5 | $0.30 | $1.20 | Competitive pricing |
| DeepSeek R1 | $3.00 | $7.00 | Reasoning model (via Together) |
| Gemma 3n E4B | $0.02 | $0.04 | Google's open-weight. Cheapest option period |

The Chinese model ecosystem got crowded. Kimi K2.5, Qwen3.5, and GLM-5 are all available through Together.ai at prices that undercut the major US labs. Whether the quality matches for English-language workloads depends on the task, but for bilingual or Chinese-market applications, these are worth testing.

What changed since last month

OpenAI expanded aggressively. GPT-5.1, GPT-5, GPT-5 nano, GPT-5.2 pro, and o3/o3-pro all appeared. The GPT-5 family now covers every price point from $0.05 to $21 per million input tokens. GPT-4.1 still exists, but there's little reason to use it when GPT-5 and GPT-5.1 cost $1.25/M on input against GPT-4.1's $2.00.

xAI stopped being the expensive outlier. Grok-4 at $3/$15 is competitive with Claude Sonnet. The Grok-4.1 fast tier at $0.20/$0.50 is one of the cheapest options from any major lab, and it comes with 2M token context. That's a significant shift from their earlier premium positioning.

DeepSeek undercut itself. V3.2 at $0.28/$0.42 is less than half what V3.1 cost. With cache hits at $0.028, high-volume users with repetitive prompts pay almost nothing.

Google went multi-generational. You can now run Gemini 2.5 Flash ($0.30/$2.50), Gemini 2.5 Pro ($1.25/$10), Gemini 3 Flash ($0.50/$3), or Gemini 3 Pro ($2/$12). Pick based on quality vs cost requirements.

Chinese labs are everywhere. Kimi, Qwen, GLM, MiniMax all have models on hosted APIs. Pricing is competitive. Quality varies by task and language.

Beyond the price tag

Output tokens still cost 3-8x more than input at almost every provider. If your app generates long responses, output cost dominates your bill. Trim your outputs.
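To see how fast output dominates, split a request's bill into its input and output components; GPT-5.1's $1.25/$10.00 rates serve as the example here, and the token counts are made up:

```python
def cost_split(input_tokens: int, output_tokens: int,
               in_price: float, out_price: float) -> tuple[float, float]:
    """Return (input_cost, output_cost) in dollars for one request."""
    return (input_tokens * in_price / 1_000_000,
            output_tokens * out_price / 1_000_000)

# Generation-heavy request on GPT-5.1: 500 tokens in, 2,000 tokens out
inp, out = cost_split(500, 2_000, 1.25, 10.00)
print(f"input ${inp:.6f} vs output ${out:.6f}")  # output is 32x the input cost
```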

Caching is now standard. OpenAI, Anthropic, DeepSeek, xAI, and Google all offer prompt caching that cuts repeat-context costs by 50-90%. DeepSeek's cache-hit price of $0.028/M input (10x below its base rate) is the most aggressive discount.

Context windows got massive. Grok-4.1 fast at 2M tokens, Claude at 1M (beta), Gemini 3 at large windows. If you were truncating context to save money, recalculate. The per-token cost dropped enough that longer context might be cheaper than the engineering effort to work around it.

Reasoning models are worth the premium for the right tasks. o3 at $2/$8, o4-mini at $1.10/$4.40, and Grok-4 at $3/$15 all perform reasoning before responding. For multi-step problems, they often get it right in one shot where a cheaper model needs 3-4 retries.
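Whether the premium pays off is a quick expected-cost check. The sketch below compares one o4-mini call against four retries on GPT-5 mini for the same job; the token counts and retry count are assumptions, not benchmarks:

```python
def job_cost(calls: int, input_tokens: int, output_tokens: int,
             in_price: float, out_price: float) -> float:
    """Total dollars for `calls` attempts at the same request."""
    return calls * (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Same 1,500-in / 1,000-out task
one_shot = job_cost(1, 1_500, 1_000, 1.10, 4.40)  # o4-mini, first try
retries  = job_cost(4, 1_500, 1_000, 0.25, 2.00)  # GPT-5 mini, four attempts
print(one_shot, retries)  # the reasoning model comes out cheaper here
```

With these numbers the one-shot reasoning call wins ($0.0061 vs $0.0095), and that's before counting the latency and failure risk of the retry loop.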

Who should use what

High-volume production (chatbots, classification, extraction): GPT-5 nano ($0.05/$0.40) or DeepSeek V3.2 ($0.28/$0.42). Both handle 80%+ of tasks fine. DeepSeek's cache hits make it essentially free for repetitive workloads.

Code generation: Claude Sonnet 4.6, GPT-5.2, or Grok Code Fast 1 ($0.20/$1.50). Sonnet handles complex instructions well. Grok Code is surprisingly good for the price.

Long-context workloads: Grok-4.1 fast. 2M tokens at $0.20/$0.50 is unbeatable. No one else comes close on context-per-dollar.

Research and analysis: Claude Opus 4.6 if budget allows. Gemini 3 Pro or GPT-5.2 if not.

Reasoning-heavy tasks: o3 ($2/$8) or o4-mini ($1.10/$4.40) for chain-of-thought. o3-pro ($20/$80) for hard problems where accuracy matters more than cost.

Cost-sensitive startups: Gemma 3n ($0.02/$0.04) or GPT-OSS-20B ($0.05/$0.20) via hosted APIs, with the option to self-host since both are open-weight. Get to market first; optimize model choice later.
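At production volume these picks diverge quickly. A rough monthly estimator using prices from the tables above (the traffic figures are placeholders):

```python
PRICES = {  # (input, output) in $/M tokens, from the tables above
    "gpt-5-nano": (0.05, 0.40),
    "deepseek-v3.2": (0.28, 0.42),
    "claude-sonnet-4.6": (3.00, 15.00),
}

def monthly_cost(model: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    """Monthly bill in dollars, ignoring caching discounts."""
    in_price, out_price = PRICES[model]
    return requests * (in_tokens * in_price + out_tokens * out_price) / 1_000_000

# 1M requests/month at 1,000 tokens in / 300 tokens out
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 1_000_000, 1_000, 300):,.0f}/mo")
```

Cache hits (ignored here) would pull DeepSeek's number down further; the point is that the spread between budget and flagship tiers is more than an order of magnitude.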

What's next

The trend from last year held: equivalent quality got roughly an order of magnitude cheaper, sometimes much more. GPT-5 nano gives you GPT-4-class performance at $0.05/M input; a year ago that class of model cost $2.50.

The bigger shift is competition. OpenAI, Anthropic, and Google used to set the pace. Now xAI, DeepSeek, and a half-dozen Chinese labs are forcing prices down faster. Grok-4.1 fast at $0.20/$0.50 with 2M context would've been unthinkable six months ago.

By Q4 2026, expect GPT-5 mini-equivalent quality below $0.05/M input. Self-hosting economics are getting tighter as hosted APIs race to the bottom.

We'll keep this comparison updated as pricing changes. Subscribe to get updates.


Sources: OpenAI pricing, Anthropic models, Google Vertex AI pricing, xAI models, DeepSeek pricing, Together.ai. All checked February 22, 2026.

This analysis is part of Kael Research's ongoing coverage of AI market economics. We track pricing, adoption, and competition across the AI industry. See our full research briefs for deeper analysis on specific markets.


Kael Tiwari

AI market intelligence for investors and founders
