Gemini API Pricing Calculator for Google AI Costs

Estimate Gemini API spend by model, input tokens, output tokens, request volume, cache usage, batch processing, audio share, and optional Google Search grounding. Use live AI Pricing Hub rows for planning, then verify production rates in Google's official Gemini API pricing table.

Quick answer · Pricing data refreshed 2026-03-13 12:45:29

Gemini API cost is usually driven by output length, audio input, cache strategy, and grounding.

For a text chat or extraction workload, the basic estimate is input tokens multiplied by the model input price plus output tokens multiplied by output price. Gemini workloads become harder to budget when you add audio, long context, Search grounding, or cached documents. This calculator keeps those knobs visible so you can model a realistic monthly bill before moving traffic to Google AI Studio, Vertex AI, OpenRouter, or another provider row.
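The basic text estimate described above can be sketched as a short function. This is a minimal illustration, not the calculator's own code: the function name, the 1,000 / 300 token request size, and the 50,000 requests/month volume are assumptions, while the $0.10 / $0.40 per-1M prices are the default calculator row listed on this page.

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_price_per_1m: float, output_price_per_1m: float) -> float:
    """USD cost of one request, given prices quoted per 1M tokens."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000


# Assumed workload: 1,000 input and 300 output tokens per request.
per_request = token_cost(1_000, 300, 0.10, 0.40)
monthly = per_request * 50_000  # assumed monthly request volume
print(f"per request: ${per_request:.6f}, monthly: ${monthly:.2f}")
```

Audio, caching, batch, and grounding are deliberately left out here; they change the input price or add separate line items.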

Tracked Gemini rows: 18 free or zero-price rows found in the pricing database

Default calculator model: Gemini 2.5 Flash Lite Preview 09-2025 ($0.10 input / $0.40 output per 1M tokens)

Lowest paid blended row: Gemini 2.0 Flash Lite ($0.38 combined per 1M tokens)

Gemini API cost estimator

Model token spend for a monthly workload. The estimate covers model token cost plus optional grounding prompts; it does not include Cloud logging, storage, network, fine-tuning, or enterprise support charges.

Gemini model or provider row

Input tokens/request

Output tokens/request

Requests/month

Cached input share: Percent of input tokens charged at a cache-read price when the row has one.

Batch share: Percent of requests eligible for batch prices when available.

Audio input share: Use when audio tokens are priced above text/image/video input.

Audio input price multiplier: Official Gemini rows often price audio input higher than text input; adjust per model.

Grounded prompts/month: Only count prompts using Google Search grounding.

Grounding price per 1K: Google's public tables vary by model family; edit this value.

How to use the Gemini API pricing calculator

1. Pick the model row

Choose the Gemini model and provider row that matches your deployment path, such as Google AI Studio, Vertex AI, OpenRouter, or a zero-price testing route.

2. Estimate tokens

Use request logs, the Gemini countTokens API, or a rough text estimate. Google's token guide puts one Gemini token at roughly four characters, but production budgets should use measured token counts.
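For a first planning pass, the four-characters-per-token rule of thumb can be applied directly to sample prompts. The sketch below is only that rough estimate; the function name and the sample prompt are illustrative, and it does not call the countTokens API, which is the better source once real requests exist.

```python
import math

CHARS_PER_TOKEN = 4  # rough rule of thumb, not an exact ratio


def estimate_tokens(text: str) -> int:
    """Rough planning estimate: about one token per four characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Summarize the attached support ticket in two sentences."
print(estimate_tokens(prompt))
```

Rounding up keeps the estimate slightly conservative for short prompts.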

3. Add real modifiers

Set cache share, batch share, audio share, and grounded prompt count only when your application actually uses those features. Leave them at zero for a plain chat estimate.
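The three steps above can be combined into one estimate. The following is a sketch under stated assumptions, not the calculator's actual formula, which this page does not publish: the class and field names are hypothetical, the audio multiplier is applied only to uncached input, the batch price is modeled as a flat fraction of the normal price, and the 100,000 requests/month in the example is an assumed volume.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    input_price: float                        # USD per 1M input tokens
    output_price: float                       # USD per 1M output tokens
    cache_read_price: Optional[float] = None  # USD per 1M cached tokens, if the row has one
    batch_factor: float = 1.0                 # assumed: batch price as a fraction of normal price
    audio_multiplier: float = 1.0             # audio input price relative to text input
    grounding_per_1k: float = 0.0             # USD per 1,000 grounded prompts


@dataclass
class Workload:
    input_tokens: int            # per request
    output_tokens: int           # per request
    requests: int                # per month
    cached_share: float = 0.0    # 0..1, input tokens billed at the cache-read price
    batch_share: float = 0.0     # 0..1, requests billed at the batch price
    audio_share: float = 0.0     # 0..1, uncached input tokens that are audio
    grounded_prompts: int = 0    # per month


def monthly_estimate(w: Workload, r: Row) -> dict:
    # Fall back to the normal input price when the row has no cache-read price.
    cache_price = r.input_price if r.cache_read_price is None else r.cache_read_price
    cached = w.input_tokens * w.cached_share * cache_price
    # Blend text and audio input prices for the uncached part of the prompt.
    audio_blend = (1 - w.audio_share) + w.audio_share * r.audio_multiplier
    uncached = w.input_tokens * (1 - w.cached_share) * r.input_price * audio_blend
    output = w.output_tokens * r.output_price
    per_request = (cached + uncached + output) / 1_000_000
    # Apply the batch price to the share of requests that can wait.
    batch_blend = (1 - w.batch_share) + w.batch_share * r.batch_factor
    token_cost = per_request * w.requests * batch_blend
    grounding_cost = w.grounded_prompts / 1_000 * r.grounding_per_1k
    return {"token_cost": token_cost, "grounding_cost": grounding_cost,
            "monthly": token_cost + grounding_cost}


# Plain chat estimate: every modifier left at zero.
plain = monthly_estimate(Workload(1_200, 450, 100_000), Row(0.10, 0.40))
print(f"${plain['monthly']:.2f}")
```

Keeping the row prices and the workload shape in separate objects makes it easy to re-run the same workload against several provider rows.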

Live Gemini API pricing rows

Rows come from the AI Pricing Hub model database. Use official Google pricing as the source of truth before launch.

Model	Provider	Input / 1M	Output / 1M	Cached input	Batch	Context	Modalities
Gemini 2.5 Flash Preview 09-2025 [Free row]	OpenRouter	Free	Free	-	-	1.0M	image,file,text,audio,video → text
Gemini 2.5 Flash Image Preview (Nano Banana) [Free row]	OpenRouter	Free	Free	-	-	33k	image,text → image,text
Gemini 2.5 Flash Lite Preview 09-2025	Google AI Studio	$0.10	$0.40	-	-	1.0M	text,image,file,audio,video → text
Gemini 2.5 Flash Lite	Google	$0.10	$0.40	-	-	1.0M	text,image,file,audio,video → text
Gemini 2.5 Flash Preview 09-2025 [Free row]	Google AI Studio	$0.30	$2.50	-	-	1.0M	image,file,text,audio,video → text
Gemini 2.5 Flash	Google	$0.30	$2.50	-	-	1.0M	file,image,text,audio,video → text
Gemini 2.5 Flash Image Preview (Nano Banana) [Free row]	Google AI Studio	$0.30	$2.50	-	-	33k	image,text → image,text
Gemini 2.5 Flash Image (Nano Banana)	Google AI Studio	$0.30	$2.50	-	-	33k	image,text → image,text
Gemini 2.0 Flash Experimental (free) [Free row]	Google	Free	Free	-	-	1.0M	text,image → text
Gemini 2.0 Flash Experimental (free) [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash 8B [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash 8B [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Flash [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 2.0 Flash Lite	Google	$0.07	$0.30	-	-	1.0M	text,image,file,audio,video → text
Gemini 2.0 Flash	Google AI Studio	$0.10	$0.40	-	-	1.0M	text,image,file,audio,video → text
Gemini 3 Flash Preview	Google	$0.50	$3.00	-	-	1.0M	text,image,file,audio,video → text
Gemini 3 Flash Preview	Google AI Studio	$0.50	$3.00	-	-	1.0M	text,image,file,audio,video → text
Gemini 1.5 Pro [Free row]	OpenRouter	Free	Free	-	-	2.0M	text,image → text
Gemini 1.5 Pro [Free row]	OpenRouter	Free	Free	-	-	2.0M	text,image → text
Gemini 2.5 Pro Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image,file → text
Gemini 2.5 Pro Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image,file → text
Gemini 1.5 Pro Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 1.5 Pro Experimental [Free row]	OpenRouter	Free	Free	-	-	1.0M	text,image → text
Gemini 2.5 Pro	Google	$1.25	$10.00	-	-	1.0M	text,image,file,audio,video → text
Gemini 2.5 Pro Preview 06-05	Google	$1.25	$10.00	-	-	1.0M	file,image,text,audio → text
Gemini 2.5 Pro Preview 05-06	Google	$1.25	$10.00	-	-	1.0M	text,image,file,audio,video → text
Gemini 3 Pro Preview	Google	$2.00	$12.00	-	-	1.0M	text,image,file,audio,video → text
Nano Banana Pro (Gemini 3 Pro Image Preview)	Google	$2.00	$12.00	-	-	66k	image,text → image,text
Nano Banana Pro (Gemini 3 Pro Image Preview)	Google AI Studio	$2.00	$12.00	-	-	66k	image,text → image,text

Gemini pricing factors to check before production

Cost factor	Why it matters	Planning rule
Output tokens	Generated text can be longer than the prompt and is often priced higher than input.	Track output length separately for chat, coding, extraction, and summarization workloads.
Audio input	Google's public Gemini tables commonly distinguish text/image/video input from audio input.	Use a separate audio share estimate instead of applying a pure text token budget to voice products.
Context caching	Caching can reduce repeated-context token cost, but explicit cache storage duration can add another line item.	Cache stable system prompts, manuals, policies, and tool definitions only when reuse is high enough.
Batch processing	Delayed workloads may qualify for lower batch token prices on supported providers.	Route offline evaluation, tagging, and nightly extraction jobs to batch instead of real-time endpoints.
Grounding	Google Search grounding can be priced per grounded prompt or search query after a free allowance.	Only ground prompts that need fresh web context; avoid grounding every chat turn by default.
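The context caching row in the table above implies a break-even point: cached reads save money per reuse, while storage costs accrue per hour. A rough way to check this is sketched below. All prices are hypothetical placeholders rather than Google's rates, the function name is illustrative, and the model ignores the cost of the initial cache write.

```python
def cache_net_savings(context_tokens: int, reuses: int, hours: float,
                      input_price: float, cache_read_price: float,
                      storage_price_per_hour: float) -> float:
    """Net USD saved by caching one context.

    Token prices are per 1M tokens; storage is per 1M tokens per hour.
    """
    saved = context_tokens * reuses * (input_price - cache_read_price) / 1_000_000
    storage = context_tokens * hours * storage_price_per_hour / 1_000_000
    return saved - storage


# Hypothetical prices: $0.30 input, $0.075 cache read, $1.00 storage per 1M tokens per hour.
for reuses in (50, 500):
    net = cache_net_savings(50_000, reuses, hours=24, input_price=0.30,
                            cache_read_price=0.075, storage_price_per_hour=1.00)
    print(f"{reuses} reuses in 24h: net ${net:+.2f}")
```

With these placeholder numbers the cache loses money at 50 reuses per day and pays off at 500, which is the "reuse is high enough" condition in the planning rule.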

Example Gemini API cost scenarios

Support chatbot

A support bot using Gemini 2.5 Flash Lite Preview 09-2025 might average 1,200 input tokens and 450 output tokens per turn. The biggest budget risk is not the prompt; it is retries, long conversation history, and grounding every answer when only a fraction of turns need fresh web context.
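To see why history and retries matter more than the base prompt, the scenario can be worked through numerically. This is a sketch with assumed values: the 600 tokens of history added per turn and the 10% retry rate are illustrative, 1,200 input tokens is treated as the first-turn prompt, and the prices are the $0.10 / $0.40 default row on this page.

```python
INPUT_PRICE, OUTPUT_PRICE = 0.10, 0.40  # USD per 1M tokens (default row on this page)


def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single chat turn."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000


def conversation_cost(turns: int, history_growth: int = 600,
                      retry_rate: float = 0.0) -> float:
    """Cost of one conversation when every turn resends more history.

    history_growth and retry_rate are assumed values, not measurements.
    """
    total = sum(turn_cost(1_200 + i * history_growth, 450) for i in range(turns))
    return total * (1 + retry_rate)


print(f"single turn: ${turn_cost(1_200, 450):.6f}")
print(f"10 turns, flat prompt: ${10 * turn_cost(1_200, 450):.4f}")
print(f"10 turns, growing history, 10% retries: ${conversation_cost(10, retry_rate=0.10):.4f}")
```

Under these assumptions a ten-turn conversation costs roughly twice the flat-prompt estimate, before any grounding charges.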

Document extraction

A document extraction workflow may send 20,000 input tokens and only 700 output tokens per request. For this pattern, input price, context window, cache hits, and batch eligibility usually matter more than headline output price.
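The claim that input price dominates this pattern can be checked against a few rows from the pricing table on this page. The sketch below splits the per-request cost into input and output parts; the function name is illustrative and no caching or batch discount is applied.

```python
# (input, output) USD per 1M tokens, copied from the pricing rows on this page.
ROWS = {
    "Gemini 2.5 Flash Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
}


def cost_split(input_tokens: int, output_tokens: int,
               input_price: float, output_price: float):
    """Return (total USD per request, fraction of that total due to input)."""
    input_cost = input_tokens * input_price / 1_000_000
    output_cost = output_tokens * output_price / 1_000_000
    total = input_cost + output_cost
    return total, input_cost / total


for name, (inp, out) in ROWS.items():
    total, share = cost_split(20_000, 700, inp, out)
    print(f"{name}: ${total:.5f} per request, {share:.0%} from input")
```

For all three rows, more than three quarters of the per-request cost comes from input tokens, so cache hits and batch eligibility have more leverage than output price.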

Voice or meeting analysis

Voice workloads should not copy a text-only estimate. Audio tokens, transcript length, summarization output, and whether you keep cached meeting context can move the bill even when request count is stable.
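A small sketch shows how far a text-only estimate can drift for a voice workload. The numbers are assumptions chosen for illustration: a 60,000-token meeting input that is 90% audio, a 3x audio price multiplier, and $0.30 / $2.50 per-1M text and output prices. The function name is hypothetical.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 text_price: float, output_price: float,
                 audio_share: float = 0.0, audio_multiplier: float = 1.0) -> float:
    """USD per request with input priced as a blend of text and audio tokens."""
    blended_input = text_price * ((1 - audio_share) + audio_share * audio_multiplier)
    return (input_tokens * blended_input + output_tokens * output_price) / 1_000_000


text_only = request_cost(60_000, 1_500, 0.30, 2.50)
audio_aware = request_cost(60_000, 1_500, 0.30, 2.50,
                           audio_share=0.9, audio_multiplier=3.0)
print(f"text-only estimate: ${text_only:.5f}")
print(f"audio-aware estimate: ${audio_aware:.5f}")
```

With these placeholder values the audio-aware figure is about 2.5 times the text-only one at the same request count.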

Grounded research assistant

Research assistants often need Search grounding, but not every step needs it. Split grounded prompts from normal reasoning prompts and give each one a separate usage target.
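Splitting grounded prompts from the rest can be modeled as a separate line item. In the sketch below the price per 1,000 grounded prompts, the free allowance, and the 15% grounded share are hypothetical placeholders, and the allowance is treated as a monthly figure for simplicity; check the official table for how it is actually defined.

```python
def grounding_cost(grounded_prompts: int, price_per_1k: float,
                   free_prompts: int = 0) -> float:
    """USD for grounded prompts beyond an assumed free allowance."""
    billable = max(0, grounded_prompts - free_prompts)
    return billable / 1_000 * price_per_1k


total_prompts = 200_000   # assumed monthly prompt volume
grounded_share = 0.15     # assumed fraction that needs fresh web context
grounded = round(total_prompts * grounded_share)

selective = grounding_cost(grounded, price_per_1k=35.0, free_prompts=5_000)
everything = grounding_cost(total_prompts, price_per_1k=35.0, free_prompts=5_000)
print(f"ground 15% of prompts: ${selective:,.2f}")
print(f"ground every prompt:   ${everything:,.2f}")
```

Giving grounded prompts their own usage target keeps this line item from scaling with total traffic.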

Alternatives to compare with Gemini pricing

Gemini is often strong for multimodal and long-context work, but the cheapest production choice depends on quality, latency, output length, and provider routing.

Alternative	Brand	Input / 1M	Output / 1M	Best comparison use
GTE-Base	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
E5-Base-v2	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
paraphrase-MiniLM-L6-v2	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
all-MiniLM-L12-v2	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
bge-base-en-v1.5	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
multi-qa-mpnet-base-dot-v1	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
all-mpnet-base-v2	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
all-MiniLM-L6-v2	Other	$0.0050	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
Qwen3 Embedding 8B	Alibaba Qwen	$0.01	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.
Qwen3 Embedding 8B	Alibaba Qwen	$0.01	Free	Embedding model (returns vectors, not generated text); compare only for retrieval or similarity workloads.

Limitations and billing notes

The calculator uses per-1M token rows from this site's database. Official Google Gemini pricing remains the final billing reference.
Free tier availability is not the same as unlimited free usage. Rate limits, regional availability, paid-tier setup, and feature restrictions can apply.
Search grounding, image generation, live audio, embeddings, fine-tuning, cache storage, and cloud infrastructure may have separate pricing rules.
Token estimates based on characters are useful for planning, but production budgets should use actual token counts from request logs or the API.
Provider rows can differ. A Gemini model through Google AI Studio, Vertex AI, OpenRouter, or another provider may have different limits, prices, and terms.

Official references to verify

Use these sources before deploying a budget-sensitive Gemini workload.

Gemini Developer API pricing for current model rates, free tier notes, caching, and grounding charges.
Gemini token counting guide for token estimation and countTokens guidance.
Gemini API rate limits for quota and traffic planning.

Gemini API pricing FAQ

How do I calculate Gemini API cost?

Multiply input tokens by the model input price, output tokens by the output price, and then add feature-specific costs such as Search grounding or cache storage when used. For monthly spend, multiply the per-request estimate by expected request volume.

Is Gemini API free?

Some Gemini models or tiers may include free usage, but free tier access is constrained by rate limits, region, product tier, and feature availability. Treat free rows as testing capacity, not as a production budget guarantee.

Does Gemini charge more for audio?

Google pricing tables often separate text/image/video input from audio input. If your app processes calls, meetings, or voice notes, add an audio input share instead of using a text-only estimate.

When does Gemini context caching save money?

Caching helps when many requests reuse the same long prompt, document, policy, or tool context. It is less useful for one-off prompts, and explicit cache storage can add cost if the cached context is large or kept for too long.

Should I use Gemini batch pricing?

Use batch pricing for offline work that can wait, such as nightly extraction, evaluation, tagging, or backfills. Do not route latency-sensitive chat turns through batch just to reduce token cost.

Is Gemini cheaper than OpenAI or Claude?

It depends on the task. Gemini Flash rows can be cost-effective for high-volume and multimodal workloads, but OpenAI, Claude, GPT-OSS, or other models may be cheaper after quality, retries, output length, and routing constraints are included.