ChatGPT API costs are calculated by multiplying your token usage by the per-token rate for your chosen model: Cost = (Input Tokens x Input Rate) + (Output Tokens x Output Rate). GPT-4o currently runs $2.50 per million input tokens and $10.00 per million output tokens, while GPT-4o-mini costs roughly 6% of that. This guide walks you through the full pricing structure, provides formulas for estimating monthly spend, and includes ready-to-use Python code for building your own token pricing calculator.
Understanding ChatGPT API Token Pricing
OpenAI charges based on the number of tokens processed—both input tokens (your prompts) and output tokens (the model’s responses). Each model has different pricing rates, and prices vary between the preview/older models and the latest GPT-4 variants.
Current Pricing Structure (2026)
The pricing below reflects standard rates for the most commonly used models. Always verify current rates on OpenAI’s pricing page, as rates occasionally change.
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4o-mini | $0.15 | $0.60 |
| GPT-4 Turbo | $10.00 | $30.00 |
| GPT-4 | $30.00 | $60.00 |
| GPT-3.5 Turbo | $0.50 | $1.50 |
The pricing follows a simple formula:
Cost = (Input Tokens × Input Rate) + (Output Tokens × Output Rate)
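As a quick worked example of the formula, consider a single request with 500 input tokens and 1,500 output tokens on GPT-4o at the rates above:

```python
# Worked example: 500 input tokens and 1,500 output tokens on GPT-4o
# ($2.50 / $10.00 per 1M tokens, from the table above)
input_cost = (500 / 1_000_000) * 2.50      # $0.00125
output_cost = (1_500 / 1_000_000) * 10.00  # $0.01500
total = input_cost + output_cost
print(f"${total:.5f}")  # $0.01625
```

Individual requests cost fractions of a cent; the totals only become meaningful at volume, which is what the calculator below is for.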
Building a Token Pricing Calculator
Create a Python function that calculates costs based on your expected token usage:
```python
def calculate_chatgpt_cost(
    input_tokens: int,
    output_tokens: int,
    model: str = "gpt-4o"
) -> dict:
    """
    Calculate ChatGPT API cost based on token usage.

    Args:
        input_tokens: Number of tokens in the input prompt
        output_tokens: Number of tokens in the model's response
        model: The model identifier

    Returns:
        Dictionary with cost breakdown
    """
    # Pricing rates per 1 million tokens
    pricing = {
        "gpt-4o": {"input": 2.50, "output": 10.00},
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},
        "gpt-4-turbo": {"input": 10.00, "output": 30.00},
        "gpt-4": {"input": 30.00, "output": 60.00},
        "gpt-3.5-turbo": {"input": 0.50, "output": 1.50},
    }

    if model not in pricing:
        raise ValueError(f"Unknown model: {model}")

    rates = pricing[model]
    input_cost = (input_tokens / 1_000_000) * rates["input"]
    output_cost = (output_tokens / 1_000_000) * rates["output"]
    total_cost = input_cost + output_cost

    return {
        "model": model,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "input_cost": round(input_cost, 4),
        "output_cost": round(output_cost, 4),
        "total_cost": round(total_cost, 4),
    }


# Example usage
result = calculate_chatgpt_cost(
    input_tokens=500,
    output_tokens=1500,
    model="gpt-4o"
)
print(f"Cost: ${result['total_cost']}")
```
Estimating Monthly Usage
To estimate monthly costs, you need to project your usage patterns. Consider these factors:
1. Requests Per Day
Determine how many API calls your application makes daily. If you’re building a chatbot that handles customer support, estimate the average number of conversations and the number of message exchanges per conversation.
2. Tokens Per Request
Token usage varies based on prompt length and response requirements. A good rule of thumb: 1 token equals approximately 4 characters of English text. For more accurate estimates, use OpenAI’s tokenizer library:
```python
# Requires: pip install tiktoken
import tiktoken

def count_tokens(text: str, model: str = "gpt-4o") -> int:
    """Count tokens using the tiktoken library."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

# Example: count tokens in a prompt
prompt = "Explain how OAuth 2.0 authentication works in 3 sentences."
token_count = count_tokens(prompt)
print(f"Token count: {token_count}")
```
3. Average Response Length
Different use cases require different response lengths. A code completion tool might need 500+ output tokens, while a simple Q&A bot might only need 100 tokens. Profile your actual usage to get accurate averages.
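Profiling can be as simple as averaging logged token counts. The sketch below assumes you record the per-request input and output token counts your application actually sees (the API reports them in each response's `usage` object); the numbers here are made up for illustration:

```python
# Hypothetical (input_tokens, output_tokens) pairs logged from real requests
logged_requests = [
    (120, 180), (95, 140), (110, 200), (130, 90), (100, 160),
]

avg_input = sum(i for i, _ in logged_requests) / len(logged_requests)
avg_output = sum(o for _, o in logged_requests) / len(logged_requests)
print(f"Average input tokens: {avg_input:.0f}")    # 111
print(f"Average output tokens: {avg_output:.0f}")  # 154
```

Feed these averages into the calculator instead of guesses and your projections track reality much more closely.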
Practical Examples
Example 1: Customer Support Chatbot
A small business runs a chatbot that handles 100 conversations per day, with an average of 6 message exchanges per conversation.
- Input per message: ~100 tokens (user question + context)
- Output per message: ~150 tokens (AI response)
- Daily requests: 100 × 6 = 600
```python
# Daily cost calculation
daily_requests = 600
tokens_per_request = {"input": 100, "output": 150}

daily_input_tokens = daily_requests * tokens_per_request["input"]
daily_output_tokens = daily_requests * tokens_per_request["output"]

result = calculate_chatgpt_cost(daily_input_tokens, daily_output_tokens, "gpt-4o-mini")
print(f"Daily cost: ${result['total_cost']}")
print(f"Monthly cost: ${result['total_cost'] * 30:.2f}")
```
Using GPT-4o-mini (the most cost-effective option for this use case):
- Daily cost: approximately $0.06 (60,000 input tokens at $0.15/1M plus 90,000 output tokens at $0.60/1M)
- Monthly cost: approximately $1.89
Example 2: Content Generation API
A SaaS product generates blog post outlines for 50 users, with each user making 10 requests per day.
- Input per request: ~200 tokens (topic description + instructions)
- Output per request: ~800 tokens (detailed outline)
- Daily requests: 50 × 10 = 500
```python
# Monthly cost for content generation
daily_requests = 500
tokens_per_request = {"input": 200, "output": 800}

daily_input = daily_requests * tokens_per_request["input"]
daily_output = daily_requests * tokens_per_request["output"]

result = calculate_chatgpt_cost(daily_input, daily_output, "gpt-4o")
print(f"Monthly cost: ${result['total_cost'] * 30:.2f}")
```
Using GPT-4o:
- Daily cost: $4.25 (100,000 input tokens at $2.50/1M plus 400,000 output tokens at $10.00/1M)
- Monthly cost: approximately $127.50
Example 3: Code Review Assistant
A development team integrates an AI code review tool that processes 200 pull requests daily, averaging 3,000 input tokens and 500 output tokens per review.
```python
# Annual cost projection for code review tool
daily_requests = 200
tokens_per_request = {"input": 3000, "output": 500}

daily_input = daily_requests * tokens_per_request["input"]
daily_output = daily_requests * tokens_per_request["output"]

result = calculate_chatgpt_cost(daily_input, daily_output, "gpt-4o-mini")
print(f"Annual cost: ${result['total_cost'] * 365:.2f}")
```
Using GPT-4o-mini:
- Daily cost: $0.15 (600,000 input tokens at $0.15/1M plus 100,000 output tokens at $0.60/1M)
- Annual cost: approximately $54.75
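Model choice dominates these projections. Plugging the same daily workload from Example 3 (600,000 input and 100,000 output tokens) into the rate table shows the spread:

```python
# Daily cost of Example 3's workload under each model's rates (per 1M tokens)
pricing = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "gpt-4-turbo": (10.00, 30.00),
}

daily_input, daily_output = 600_000, 100_000
costs = {}
for model, (in_rate, out_rate) in pricing.items():
    costs[model] = (daily_input / 1_000_000) * in_rate \
                 + (daily_output / 1_000_000) * out_rate
    print(f"{model}: ${costs[model]:.2f}/day")
# gpt-4o-mini: $0.15/day, gpt-4o: $2.50/day, gpt-4-turbo: $9.00/day
```

A roughly 60x gap between the cheapest and most expensive option makes model selection the first lever to pull, which leads into the strategies below.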
Cost Optimization Strategies
Once you have a calculator running, use it to identify optimization opportunities:
- Choose the right model: GPT-4o-mini costs roughly 6% of GPT-4o for many tasks. Use the most capable model only when necessary.
- Implement caching: Cache frequent requests to avoid redundant API calls. A simple TTL cache can significantly reduce costs for repeated queries.
- Trim prompts: Remove unnecessary context from prompts. Every token saved directly reduces costs.
- Set output token limits: Use the `max_tokens` parameter to cap response length and prevent runaway costs.
- Monitor with alerts: Set up budget alerts using OpenAI's usage dashboard or build custom monitoring that tracks daily spend.
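A minimal sketch of the TTL-cache idea from the list above. Here `fetch` is a stand-in for whatever function actually calls the API, and the 5-minute TTL is an arbitrary illustration:

```python
import hashlib
import time

_cache: dict = {}
TTL_SECONDS = 300  # illustrative: entries expire after 5 minutes

def cached_completion(prompt: str, fetch) -> str:
    """Serve repeated prompts from cache; call fetch(prompt) on a miss."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    entry = _cache.get(key)
    if entry is not None and time.time() - entry[0] < TTL_SECONDS:
        return entry[1]  # cache hit: no API call, no token cost
    response = fetch(prompt)
    _cache[key] = (time.time(), response)
    return response

# Usage: the second identical prompt never reaches the API
calls = []
def fake_fetch(prompt):
    calls.append(prompt)
    return f"answer to: {prompt}"

cached_completion("What are your hours?", fake_fetch)
cached_completion("What are your hours?", fake_fetch)
print(f"API calls made: {len(calls)}")  # 1
```

For a support bot where a handful of questions dominate traffic, even a naive cache like this can eliminate a large share of billable tokens.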
Using the Calculator for Budget Planning
Create a spreadsheet or dashboard that tracks:
- Expected requests per day/week/month
- Average tokens per request (input and output)
- Model selection per use case
- Total projected cost
Add a buffer of 20-30% for unexpected usage spikes. For example, if your calculated monthly cost is $500, budget $600-$650 to avoid surprises.
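That buffer rule is trivial to encode alongside the calculator:

```python
def budget_with_buffer(estimated_monthly: float, buffer: float = 0.25) -> float:
    """Add a safety margin (default 25%) to an estimated monthly cost."""
    return round(estimated_monthly * (1 + buffer), 2)

print(budget_with_buffer(500))        # 625.0
print(budget_with_buffer(500, 0.30))  # 650.0
```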
Building a token pricing calculator into your application also helps with client-side cost estimation if you charge users based on their usage. You can pass through OpenAI costs with a margin while giving users transparent pricing.
Start with conservative estimates, measure actual usage after deployment, and refine your calculator based on real-world data. This approach gives you predictable costs and the confidence to scale your AI-powered features.