LLM API Pricing Comparator
Find the best AI model for your budget. Updated pricing for all major LLM providers.
| Model | Provider | Input ($ per 1M tokens) | Output ($ per 1M tokens) | Context (tokens) | Features |
|---|---|---|---|---|---|
| GPT-4o: Most capable GPT-4 model with multimodal inputs | OpenAI | $5.00 | $15.00 | 128K | Vision, Function calling, JSON mode |
| GPT-4o Mini: Smarter, faster, cheaper than GPT-4 | OpenAI | $0.15 | $0.60 | 128K | Fast, Vision, Cost-effective |
| GPT-4 Turbo: Previous generation, still powerful | OpenAI | $10.00 | $30.00 | 128K | Vision, JSON mode |
| Claude Opus 4.6: Anthropic's flagship model. Cache writes: $6.25 (5m) / $10 (1h); cache hits: $0.50 | Anthropic | $5.00 | $25.00 | 200K | Vision, Computer use, Artifacts |
| Claude Sonnet 4.6: Best balance of capability and speed. Cache writes: $3.75 (5m) / $6 (1h); cache hits: $0.30 | Anthropic | $3.00 | $15.00 | 200K | Vision, Computer use, Artifacts |
| Claude Haiku 4.5: Fast and efficient for simple tasks. Cache writes: $1.25 (5m) / $2 (1h); cache hits: $0.10 | Anthropic | $1.00 | $5.00 | 200K | Fast, Vision, Prompt caching |
| Gemini 2.5 Pro: Flagship production model (cached: $0.13). Context caching: 90% discount | Google | $1.25 | $10.00 | 2M | 2M context, Video understanding, Code execution |
| Gemini 2.5 Flash: Excellent mid-tier (cached: $0.04) | Google | $0.30 | $2.50 | 1M | 1M context, Fast, Production |
| Gemini 2.5 Flash-Lite: Cheapest mainstream model, 1K requests/day free | Google | $0.10 | $0.40 | 1M | 1M context, Budget, Free tier |
| Gemini 2.0 Flash: Budget option | Google | $0.10 | $0.70 | 1M | 1M context, Fast |
| Gemini 3.1 Pro: Latest preview flagship | Google | $2.00 | $12.00 | 2M | 2M context, Preview, Latest flagship |
| Gemini 3.1 Flash-Lite: Budget preview model | Google | $0.25 | $1.50 | 1M | 1M context, Preview, Budget |
| Gemini 3 Flash: Fast and affordable preview | Google | $0.50 | $3.00 | 1M | 1M context, Preview, Fast |
| Mistral Large: European AI, strong on coding | Mistral | $2.00 | $6.00 | 128K | Coding, Reasoning |
| Mistral Small: Cheap and fast alternative | Mistral | $0.20 | $0.60 | 128K | Fast, Cost-effective |
| Llama 3.1 405B: Open source; run it yourself or via hosted providers | Meta | Free | Free | 128K | Open source, Run locally |
| Command R+: Built for enterprise AI applications | Cohere | $3.00 | $15.00 | 128K | Enterprise, RAG |
| Command R: Affordable enterprise option | Cohere | $0.50 | $1.50 | 128K | Enterprise, RAG |
| Qwen3-Max: Qwen 3 series flagship | Alibaba (Qwen) | $1.20 | $6.00 | 33K | Latest flagship, Reasoning |
| Qwen2.5-Max: Qwen 2.5 flagship | Alibaba (Qwen) | $1.04 | $4.16 | 33K | Flagship, Complex reasoning |
| Qwen2.5-Plus: Mid-tier Qwen model | Alibaba (Qwen) | $0.26 | $1.56 | 131K | 131K context, Balanced |
| Qwen3.5-Flash: Best value, 1M context for $0.07 per 1M input tokens | Alibaba (Qwen) | $0.07 | $0.26 | 1M | 1M context, Fast, Cheap |
| Qwen-Turbo: Budget option at $0.033 per 1M input tokens | Alibaba (Qwen) | $0.03 | $0.13 | 131K | 131K context, Ultra-cheap |
| Qwen2.5-72B: Best price/performance for coding | Alibaba (Qwen) | $0.12 | $0.39 | 33K | Coding, 72B params |
| Qwen3.5-9B: Cheap 9B model with huge context | Alibaba (Qwen) | $0.04 | $0.15 | 262K | 262K context, Small |
| Qwen3-14B: Qwen 3 reasoning model | Alibaba (Qwen) | $0.06 | $0.20 | 41K | Reasoning, 14B |
| Qwen3-8B: Compact Qwen 3 model | Alibaba (Qwen) | $0.05 | $0.20 | 41K | Reasoning, 8B |
| Qwen3-32B: Mid-size Qwen 3 reasoning model | Alibaba (Qwen) | $0.08 | $0.24 | 41K | Reasoning, 32B |
| Qwen3.5-27B: Large Qwen 3.5 model | Alibaba (Qwen) | $0.20 | $0.90 | 262K | 262K context, 27B |
| Qwen3.5-35B-A3B: Mixture of Experts model | Alibaba (Qwen) | $0.16 | $0.90 | 262K | 262K context, MoE |
| Grok-4.20 Reasoning: Grok 4 with advanced reasoning (cached: $0.20) | xAI | $2.00 | $6.00 | 2M | Reasoning, Functions, 2M context |
| Grok-4.20 Non-Reasoning: Grok 4 standard (cached: $0.20) | xAI | $2.00 | $6.00 | 2M | Functions, 2M context |
| Grok-4.1 Fast Reasoning: Fast Grok 4 reasoning (cached: $0.05) | xAI | $0.20 | $0.50 | 2M | Reasoning, Fast, 2M context |
| Grok-4.1 Fast Non-Reasoning: Fast Grok 4 standard (cached: $0.05) | xAI | $0.20 | $0.50 | 2M | Fast, 2M context |
| Grok Imagine (Image) | xAI | n/a | $0.02/image | n/a | Image generation |
| Grok Imagine Pro (Image) | xAI | n/a | $0.07/image | n/a | Pro image generation |
| Grok Imagine (Video) | xAI | n/a | $0.050/sec | n/a | Video generation |
| Grok Voice Agent | xAI | n/a | $0.05/min (Voice Agent API) | n/a | Real-time voice |
| GLM-5.1: Latest Z.AI flagship (cached: $0.26) | Z.AI | $1.40 | $4.40 | 128K | Latest GLM, Cached input |
| GLM-5: Z.AI main model (cached: $0.20) | Z.AI | $1.00 | $3.20 | 128K | Cached input |
| GLM-5-Turbo: Z.AI fast variant (cached: $0.24) | Z.AI | $1.20 | $4.00 | 128K | Fast, Cached input |
| GLM-4.7: Z.AI previous generation (cached: $0.11) | Z.AI | $0.60 | $2.20 | 128K | Cached input |
| GLM-4.7-FlashX: Budget option (cached: $0.01) | Z.AI | $0.07 | $0.40 | 128K | Ultra-fast |
| GLM-4.5-X: High-capability variant (cached: $0.45) | Z.AI | $2.20 | $8.90 | 128K | Extended |
| GLM-Image | Z.AI | n/a | $0.015/image | n/a | Image generation |
| CogView-4 | Z.AI | n/a | $0.01/image | n/a | Image generation |
| DeepSeek Chat: Chinese AI, strong on coding | DeepSeek | $0.14 | $0.28 | 64K | Code, Math |
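
To turn the per-1M-token prices in the table into a budget figure, multiply each token count by its rate and divide by one million. The sketch below is one possible way to do that in Python. The names `Price`, `PRICES`, `estimate_cost`, and `cheapest` are hypothetical helpers, not part of any provider's SDK, and only four rows are copied from the table for illustration. It ignores cached-input discounts and free tiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD prices per 1M tokens, as listed in the table."""
    input_per_1m: float
    output_per_1m: float


# Illustrative subset of the table; extend with any other rows you need.
PRICES = {
    "GPT-4o Mini": Price(0.15, 0.60),
    "Claude Sonnet 4.6": Price(3.00, 15.00),
    "Gemini 2.5 Flash": Price(0.30, 2.50),
    "DeepSeek Chat": Price(0.14, 0.28),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one workload, without caching discounts."""
    p = PRICES[model]
    total = input_tokens * p.input_per_1m + output_tokens * p.output_per_1m
    return total / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Name of the lowest-cost model in PRICES for the given token counts."""
    return min(PRICES, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500K output tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
    print("Cheapest:", cheapest(2_000_000, 500_000))
```

Because output tokens usually cost several times more than input tokens, the ranking can change with the input/output mix, so it is worth running the comparison with token counts that resemble your real traffic.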