OpenAI API Pricing Calculator for GPT Costs
Estimate OpenAI API spend by model, input tokens, output tokens, request volume, cached input, batch usage, and optional Responses API tool calls. Use live AI Pricing Hub rows for planning, then verify final billing on OpenAI's official API pricing page.
Quick answer · Pricing data refreshed 2026-03-13 12:45:29
OpenAI API cost is driven by output length, model tier, caching, batch eligibility, and tool calls.
The base estimate is input tokens multiplied by the selected model's input price plus output tokens multiplied by output price. For real products, also separate cached context, batch jobs, web search calls, file search calls, code interpreter or hosted shell sessions, image generation, transcription, and regional processing rules because those line items can change the final bill.
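The base formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name `base_cost_usd`, the token counts, and the request volume are assumptions, while the rates are the GPT-4o-mini row listed on this page ($0.15 input and $0.60 output per 1M tokens).

```python
def base_cost_usd(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Base token cost in USD: input tokens x input price + output tokens x output price."""
    return (
        input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    ) / 1_000_000


# Hypothetical request: 2,000 input tokens and 400 output tokens.
per_request = base_cost_usd(2_000, 400, input_price_per_1m=0.15, output_price_per_1m=0.60)

# Assumed volume of 50,000 requests per month.
monthly = per_request * 50_000

print(f"per request: ${per_request:.6f}")
print(f"monthly:     ${monthly:.2f}")
```

Cached input, batch jobs, and tool calls are deliberately left out here; they are separate line items on top of this base figure.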
OpenAI API cost estimator
Model a monthly GPT workload with base token pricing, optional cached input, optional batch pricing, and optional tool-call charges. This is a planning estimate, not an invoice.
How to use the OpenAI API pricing calculator
1. Pick the model row
Choose the GPT, o-series, embedding, or provider row that matches your deployment path. Direct OpenAI, Azure OpenAI, and aggregator rows can have different pricing and limits.
2. Use real token logs
Estimate with prompt samples first, then replace those sample averages with logged input and output tokens. Track output tokens separately because they often dominate spend: every priced chat row in the table below lists an output rate three to five times its input rate.
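One way to turn token logs into calculator inputs is to average input and output tokens per use case rather than across all traffic. The sketch below assumes a hypothetical log format of `(use_case, input_tokens, output_tokens)` tuples; the records and the function name `average_tokens_by_use_case` are illustrative only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical log records: (use_case, input_tokens, output_tokens).
# Real values would come from your own request logs.
logged_requests = [
    ("chat", 1_200, 300),
    ("chat", 1_800, 500),
    ("extraction", 9_000, 150),
    ("extraction", 11_000, 250),
    ("agent", 4_000, 2_000),
]


def average_tokens_by_use_case(records):
    """Return {use_case: (mean input tokens, mean output tokens)}."""
    grouped = defaultdict(lambda: ([], []))
    for use_case, input_tokens, output_tokens in records:
        grouped[use_case][0].append(input_tokens)
        grouped[use_case][1].append(output_tokens)
    return {
        use_case: (mean(inputs), mean(outputs))
        for use_case, (inputs, outputs) in grouped.items()
    }


averages = average_tokens_by_use_case(logged_requests)
for use_case, (avg_in, avg_out) in averages.items():
    print(f"{use_case}: {avg_in:.0f} input, {avg_out:.0f} output tokens per request")
```

Keeping the groups separate avoids one blended average hiding an output-heavy route such as the hypothetical `agent` traffic above.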
3. Add modifiers only when used
Set cached input, batch share, tool calls, and container sessions only for workloads that actually use them. Leave those fields at zero for a plain chat estimate.
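The modifiers can be modelled as optional parameters that default to zero, so a plain chat estimate is unchanged unless a field is set. This is a sketch under stated assumptions, not the calculator's implementation: the function name, the way the batch discount is applied, and every cached, batch, and tool-call rate below are hypothetical placeholders, since the table on this page lists no cached or batch prices. The base rates are the GPT-4.1 Mini row.

```python
def monthly_estimate_usd(
    requests,
    input_tokens,
    output_tokens,
    input_price,             # USD per 1M input tokens
    output_price,            # USD per 1M output tokens
    cached_share=0.0,        # fraction of input tokens billed at the cached rate
    cached_input_price=0.0,  # USD per 1M cached input tokens (placeholder)
    batch_share=0.0,         # fraction of requests sent as batch jobs
    batch_discount=0.0,      # fractional discount on batch token cost (placeholder)
    tool_calls=0,            # paid tool calls per month
    tool_call_price=0.0,     # USD per tool call (placeholder)
):
    """Monthly planning estimate in USD; all modifiers default to 'not used'."""
    cached_tokens = input_tokens * cached_share
    fresh_tokens = input_tokens - cached_tokens
    per_request = (
        fresh_tokens * input_price
        + cached_tokens * cached_input_price
        + output_tokens * output_price
    ) / 1_000_000
    token_cost = requests * per_request
    # Only the batch share of the token cost receives the batch discount.
    token_cost *= 1 - batch_share * batch_discount
    return token_cost + tool_calls * tool_call_price


# Plain chat estimate: every modifier left at zero.
plain = monthly_estimate_usd(100_000, 1_500, 500, input_price=0.40, output_price=1.60)

# Same workload with hypothetical modifier values filled in.
modified = monthly_estimate_usd(
    100_000, 1_500, 500, input_price=0.40, output_price=1.60,
    cached_share=0.5, cached_input_price=0.10,
    batch_share=0.2, batch_discount=0.5,
    tool_calls=10_000, tool_call_price=0.01,
)

print(f"plain chat:     ${plain:.2f}")
print(f"with modifiers: ${modified:.2f}")
```

In this made-up scenario the tool-call line adds more than caching and batch save, which is why each modifier is worth entering only when the workload really uses it.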
Live OpenAI API pricing rows
Rows come from the AI Pricing Hub model database. Use official OpenAI pricing as the final billing reference before launch.
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Cached input | Batch | Context | Capabilities · input · output |
|---|---|---|---|---|---|---|---|
| GPT-4o-mini Search Preview | OpenAI | $0.15 | $0.60 | - | - | 128k | chat,tool_use · text · text |
| GPT-4o-mini | OpenAI | $0.15 | $0.60 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| GPT-4o-mini (2024-07-18) | OpenAI | $0.15 | $0.60 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| GPT-4.1 Mini | OpenAI | $0.40 | $1.60 | - | - | 1.0M | chat,vision,tool_use · image,text,file · text |
| GPT-4o Audio | OpenAI | $2.50 | $10.00 | - | - | 128k | chat,audio,tts,tool_use · audio,text · text,audio |
| GPT-4o Search Preview | OpenAI | $2.50 | $10.00 | - | - | 128k | chat,tool_use · text · text |
| GPT-4o (2024-11-20) | OpenAI | $2.50 | $10.00 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| GPT-4o (2024-08-06) | Azure | $2.50 | $10.00 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| ChatGPT-4o Free row | OpenAI | $5.00 | $15.00 | - | - | 128k | chat,vision · text,image · text |
| GPT-4o (2024-05-13) | OpenAI | $5.00 | $15.00 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| GPT-4o (extended) | OpenAI | $6.00 | $18.00 | - | - | 128k | chat,vision,tool_use · text,image,file · text |
| ChatGPT-4o Free row | OpenRouter | Free | Free | - | - | 128k | chat,vision · text,image · text |
| GPT-4.1 Nano | OpenAI | $0.10 | $0.40 | - | - | 1.0M | chat,vision,tool_use · image,text,file · text |
| GPT-4.1 | OpenAI | $2.00 | $8.00 | - | - | 1.0M | chat,vision,tool_use · image,text,file · text |
| o4 Mini High | OpenAI | $1.10 | $4.40 | - | - | 200k | chat,vision,reasoning,tool_use · image,text,file · text |
| o4 Mini | OpenAI | $1.10 | $4.40 | - | - | 200k | chat,vision,reasoning,tool_use · image,text,file · text |
| o3 Mini High | OpenAI | $1.10 | $4.40 | - | - | 200k | chat,reasoning,tool_use · text,file · text |
| o3 Mini | OpenAI | $1.10 | $4.40 | - | - | 200k | chat,tool_use,reasoning · text,file · text |
| o4 Mini Deep Research | OpenAI | $2.00 | $8.00 | - | - | 200k | chat,vision,reasoning,tool_use · file,image,text · text |
| o3 | OpenAI | $2.00 | $8.00 | - | - | 200k | chat,vision,reasoning,tool_use · image,text,file · text |
| o3 Deep Research | OpenAI | $10.00 | $40.00 | - | - | 200k | chat,vision,reasoning,tool_use · image,text,file · text |
| o1 | OpenAI | $15.00 | $60.00 | - | - | 200k | chat,vision,tool_use,reasoning · text,image,file · text |
| o3 Pro | OpenAI | $20.00 | $80.00 | - | - | 200k | chat,vision,reasoning,tool_use · text,file,image · text |
| o1-pro | OpenAI | $150.00 | $600.00 | - | - | 200k | chat,vision,reasoning,tool_use · text,image,file · text |
| o1-mini Free row | OpenRouter | Free | Free | - | - | 128k | chat,reasoning · text · text |
| o1-preview Free row | OpenRouter | Free | Free | - | - | 128k | chat,reasoning · text · text |
| o1-mini (2024-09-12) Free row | OpenRouter | Free | Free | - | - | 128k | chat,reasoning · text · text |
| o1-preview (2024-09-12) Free row | OpenRouter | Free | Free | - | - | 128k | chat,reasoning · text · text |
| Text Embedding 3 Small | OpenAI | $0.02 | Free | - | - | 8k | chat,tool_use · text · embeddings |
| Text Embedding Ada 002 | OpenAI | $0.10 | Free | - | - | 8k | chat,tool_use · text · embeddings |
| gpt-oss-20b | Chutes | $0.02 | $0.10 | - | - | 131k | chat,reasoning,tool_use · text · text |
| Text Embedding 3 Large | OpenAI | $0.13 | Free | - | - | 8k | chat,tool_use · text · embeddings |
OpenAI pricing factors to check before production
| Cost factor | Why it matters | Planning rule |
|---|---|---|
| Output tokens | Generated answers, code, and structured JSON can be longer and more expensive than prompts. | Track output length by use case instead of applying one average to chat, agents, and extraction. |
| Cached input | Reusable system prompts, policy text, schemas, tool definitions, and long files may qualify for lower cached input rates. | Cache stable context with high reuse; do not assume one-off prompts receive cache savings. |
| Batch API | Offline jobs can often use lower rates, but they are not suitable for latency-sensitive product flows. | Separate nightly extraction, evaluation, tagging, and backfills from real-time chat traffic. |
| Responses API tools | Web search, file search, and code execution can add per-call, storage, or session charges beyond model tokens. | Budget tool calls separately and cap them by route, user tier, or task type. |
| Provider path | Direct OpenAI, Azure OpenAI, OpenRouter, and enterprise agreements can expose different rows, limits, and billing terms. | Use this page for OpenAI model planning, then compare deployment paths before committing traffic. |
Example OpenAI API cost scenarios
Support chatbot
A support bot using GPT-4o-mini Search Preview might average 1,500 input tokens and 500 output tokens per turn. The biggest budget risk is long conversation history, retries, and tool calls on every answer.
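The support bot figures can be worked through as a short calculation. The token averages and the $0.15 / $0.60 rates come from this page; the monthly turn count, the retry rate, and the doubled-history case are assumptions chosen only to show how the budget risks move the total. Tool-call charges are left out.

```python
INPUT_PRICE = 0.15   # USD per 1M input tokens (GPT-4o-mini Search Preview row)
OUTPUT_PRICE = 0.60  # USD per 1M output tokens (same row)


def turn_cost_usd(input_tokens, output_tokens):
    """Token cost in USD for one chatbot turn."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


MONTHLY_TURNS = 200_000  # assumed traffic
RETRY_RATE = 0.10        # assumed share of turns that are sent twice

baseline = turn_cost_usd(1_500, 500) * MONTHLY_TURNS
with_retries = baseline * (1 + RETRY_RATE)
# Longer conversation history: input doubles to 3,000 tokens per turn.
long_history = turn_cost_usd(3_000, 500) * MONTHLY_TURNS

print(f"baseline:      ${baseline:.2f}")
print(f"with retries:  ${with_retries:.2f}")
print(f"long history:  ${long_history:.2f}")
```

Under these assumptions, doubling the history resent on each turn raises the bill more than a 10% retry rate does.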
Coding assistant
Coding workflows often generate longer outputs than normal chat. Track code output separately and compare GPT rows against Claude, Gemini, DeepSeek, and other code-capable models.
Document extraction
Extraction jobs may send large input documents and short JSON output. Cached input and batch processing usually matter more than the headline chat price.
Agent with tools
Agents that call search, file retrieval, or code execution need a separate tool budget. A cheap model can still become expensive if every step triggers paid tools.
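A rough way to see this effect is to total token cost and tool cost separately for one agent run. The sketch below is illustrative: the `AgentStep` structure, the five-step run, and the flat $0.01 per tool call are hypothetical, and real tool pricing may be per call, per session, or storage-based. The token rates are the GPT-4.1 Nano row from this page.

```python
from dataclasses import dataclass


@dataclass
class AgentStep:
    input_tokens: int
    output_tokens: int
    tool_calls: int  # paid tool calls triggered by this step


def agent_run_cost_usd(steps, input_price, output_price, tool_call_price):
    """Return (token_cost, tool_cost) in USD for one agent run."""
    token_cost = sum(
        s.input_tokens * input_price + s.output_tokens * output_price for s in steps
    ) / 1_000_000
    tool_cost = sum(s.tool_calls for s in steps) * tool_call_price
    return token_cost, tool_cost


# Hypothetical five-step run with one paid tool call per step.
steps = [AgentStep(input_tokens=3_000, output_tokens=400, tool_calls=1) for _ in range(5)]

token_cost, tool_cost = agent_run_cost_usd(
    steps,
    input_price=0.10,      # USD per 1M input tokens (GPT-4.1 Nano row)
    output_price=0.40,     # USD per 1M output tokens (GPT-4.1 Nano row)
    tool_call_price=0.01,  # hypothetical flat fee per tool call
)
tool_share = tool_cost / (token_cost + tool_cost)

print(f"tokens: ${token_cost:.4f}, tools: ${tool_cost:.4f}, tool share: {tool_share:.0%}")
```

With these placeholder numbers the tool fees are most of the run's cost even though the model row is the cheapest priced chat row on the page.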
Alternatives to compare with OpenAI pricing
OpenAI has broad API coverage, but the cheapest production choice depends on quality, latency, context length, output size, tool support, and provider routing.
| Alternative | Brand | Input / 1M | Output / 1M | Best comparison use |
|---|---|---|---|---|
| GTE-Base | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| E5-Base-v2 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| paraphrase-MiniLM-L6-v2 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| all-MiniLM-L12-v2 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| bge-base-en-v1.5 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| multi-qa-mpnet-base-dot-v1 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| all-mpnet-base-v2 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| all-MiniLM-L6-v2 | Other | $0.0050 | Free | Baseline price and latency comparison for chat or extraction workloads. |
| Qwen3 Embedding 8B | Alibaba Qwen | $0.01 | Free | Coding, agentic development, or output-heavy workloads. |
Limitations and billing notes
- The calculator uses per-1M token rows from this site's database. Official OpenAI API pricing remains the final billing reference.
- Tool calls, file storage, code execution sessions, image generation, transcription, video generation, fine-tuning, and enterprise commitments may use separate pricing rules.
- Cached input savings depend on the exact model, endpoint, provider row, and whether the reused context qualifies for caching.
- Batch pricing is useful for delayed work, not for interactive chat where the user expects an immediate answer.
- Regional processing, data residency, Azure OpenAI, OpenRouter, and negotiated enterprise terms can change the effective price.
Official references to verify
Use these sources before deploying a budget-sensitive OpenAI workload.
- OpenAI API pricing for current model, batch, multimodal, tool, and specialized-model rates.
- OpenAI prompt caching guide for cached input behavior and cache eligibility.
- OpenAI Batch API guide for delayed workload planning.