Google DeepMind

Gemini Context Cost Calculator

Gemini 2.5 Flash offers a 1M token context window at the same price as GPT-4o-mini. Calculate your Google Gemini API costs at scale.

LLM Context Cost Calculator

Enter your usage parameters below. The results that follow assume an example workload: 5,000 prompt tokens per request, 1,000 response tokens, and 100 API calls per day against GPT-4o.

Prompt tokens: number of tokens in your prompt per request

Response tokens: number of tokens in the model response

Requests per day: average number of API calls per day

GPT-4o: $2.50/1M input · $10.00/1M output · 128K context

Cost Per Request: $0.022 (GPT-4o · OpenAI)

Daily Cost: $2.25

Monthly Cost: $67.50

Annual Cost: $810.00

Cost Per 1K Tokens: $0.0037 (blended input + output rate)

Input vs Output Split: 56% / 44% (input cost vs output cost)

Cheapest Alternative: GPT-4.1-nano (OpenAI · budget) at $2.70/mo, saving 96% ($64.80/mo)
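The figures above follow from simple per-token arithmetic. A minimal Python sketch, assuming the example workload implied by the displayed totals (5,000 input tokens, 1,000 output tokens, 100 calls/day at GPT-4o rates):

```python
# Sketch of the calculator's math. Pricing and usage values are the
# page's example, not fetched from any API.
INPUT_PRICE_PER_M = 2.50    # GPT-4o, $ per 1M input tokens
OUTPUT_PRICE_PER_M = 10.00  # GPT-4o, $ per 1M output tokens

def request_cost(input_tokens, output_tokens):
    """Cost of a single API call in dollars."""
    return (input_tokens * INPUT_PRICE_PER_M +
            output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

per_request = request_cost(5_000, 1_000)  # $0.0225
daily = per_request * 100                 # 100 calls/day -> $2.25
monthly = daily * 30                      # $67.50
annual = monthly * 12                     # $810.00
blended_per_1k = per_request / 6          # 6K tokens/request -> ~$0.0037

print(f"per request: ${per_request:.4f}, monthly: ${monthly:.2f}, annual: ${annual:.2f}")
```

The displayed $0.022 per request is simply the $0.0225 figure truncated for display.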

Optimize your LLM spend

Platforms like OpenRouter, Together AI, and Groq offer competitive pricing and unified APIs across multiple models. Fireworks AI and Anyscale specialize in high-throughput inference at lower cost.

OpenRouter · Together AI · Groq · Fireworks AI · Anyscale

Need help optimizing LLM costs?

Digital Signet builds AI-powered systems and provides fractional CTO leadership. 20+ years shipping software.

This costs you ~$810/year

We'll identify the top 3 drivers and give you a 90-day mitigation plan.

Get a Free Exposure Teardown →

Or email Oliver directly → [email protected]

Gemini Model Pricing

Model             Input $/1M      Output $/1M   Context   Category   Notes
Gemini 2.5 Pro    $1.25 / $2.50*  $10.00        1M        flagship   *>200K tokens
Gemini 2.5 Flash  $0.15           $0.60         1M        fast       Best value/context

1M Context Window Advantage

Gemini's 1M token context window is the largest available from a major provider. At $0.15/1M input tokens, Gemini 2.5 Flash is the most cost-effective model for large-context tasks like full codebase analysis or long document processing.
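To put that rate in concrete terms, here is a small sketch; the 800K-token codebase below is a hypothetical size chosen for illustration, not a benchmark:

```python
FLASH_INPUT_PER_M = 0.15  # Gemini 2.5 Flash, $ per 1M input tokens

def context_cost(prompt_tokens, price_per_m=FLASH_INPUT_PER_M):
    """Input-side cost of sending a large context in one request."""
    return prompt_tokens * price_per_m / 1_000_000

# A hypothetical 800K-token codebase analyzed in a single call:
print(f"${context_cost(800_000):.3f}")  # prints "$0.120"
```

Even a maxed-out 1M-token prompt costs $0.15 of input on Flash, which is what makes whole-repository or long-document prompts economically viable.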

Frequently Asked Questions

How much does the Gemini API cost?

Gemini 2.5 Pro costs $1.25 per 1M input tokens (up to 200K) and $10.00 per 1M output tokens. Gemini 2.5 Flash is much cheaper at $0.15/1M input and $0.60/1M output - matching GPT-4o-mini pricing but with a 1M token context window. Both models offer a free tier through Google AI Studio for development and testing.

What is Gemini's context window size?

Both Gemini 2.5 Pro and Gemini 2.5 Flash support a 1,000,000 (1M) token context window - the largest among major frontier model providers alongside GPT-4.1. This enables processing of entire codebases, long legal documents, or hours of meeting transcripts in a single API call. Note that Gemini 2.5 Pro charges $2.50/1M for prompts exceeding 200K tokens.
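That pricing tier can be sketched as a small helper. This assumes the higher rate applies to the entire prompt once it crosses 200K tokens (tiered, not marginal, pricing); check Google's current pricing page before relying on it:

```python
def pro_input_cost(prompt_tokens):
    """Gemini 2.5 Pro input cost with the 200K pricing tier.

    Assumption: the $2.50 rate applies to the whole prompt once it
    exceeds 200K tokens, not just the tokens above the threshold.
    """
    rate = 1.25 if prompt_tokens <= 200_000 else 2.50  # $ per 1M input tokens
    return prompt_tokens * rate / 1_000_000

print(pro_input_cost(150_000))  # prints 0.1875
print(pro_input_cost(500_000))  # prints 1.25
```

Note the step at the threshold: a 200,001-token prompt costs roughly twice as much per token as a 200,000-token one, so trimming prompts to stay under 200K can matter.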

How does Gemini compare to GPT-4 and Claude on cost?

Gemini 2.5 Pro at $1.25/1M input is cheaper than both GPT-4.1 ($2.00) and Claude Sonnet 4 ($3.00), with a 1M token context window. Gemini 2.5 Flash at $0.15/1M matches GPT-4o-mini pricing with significantly larger context. For cost-sensitive applications requiring large context windows, Gemini 2.5 Flash offers exceptional value.
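A quick way to sanity-check that comparison, using only the input rates quoted above (the monthly token volume is a hypothetical figure; output rates for the non-Gemini models are not listed on this page, so they are omitted):

```python
# Input-price comparison, $ per 1M tokens, as quoted in the FAQ above
input_rates = {
    "Gemini 2.5 Flash": 0.15,
    "Gemini 2.5 Pro": 1.25,
    "GPT-4.1": 2.00,
    "Claude Sonnet 4": 3.00,
}

tokens = 10_000_000  # hypothetical monthly input volume
for model, rate in sorted(input_rates.items(), key=lambda kv: kv[1]):
    print(f"{model:18s} ${tokens * rate / 1_000_000:7.2f}/mo input")
```

At any fixed volume, the ordering is the same: Flash, then Pro, then GPT-4.1, then Claude Sonnet 4.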

Does Google Gemini offer a free tier?

Yes. Google AI Studio offers free access to Gemini models for development and testing, with rate limits. Gemini 2.5 Flash has a free tier of 1,500 requests per day and 1M tokens per minute. Production applications use the Vertex AI API or Google AI API with pay-as-you-go pricing. Free tier requests may be used to improve Google's models.