Pricing intelligence

DeepSeek V4 Flash pricing is the lead story

DeepSeek prices are listed per 1M tokens in USD. Start with V4 Flash: $0.14 input, $0.028 cache-hit input, and $0.28 output. V4 Pro remains documented as an escalation route at $0.145 cache hit, $1.74 input, and $3.48 output. The $12/M output value belongs to Gemini 3.1 Pro Preview, not DeepSeek V4 Pro.

Model	Input / 1M	Cached input	Output / 1M	Context	Positioning
DeepSeek V4 Flash Cost baseline	$0.14	$0.028	$0.28	1M	Primary DSFlashHub route for OpenClaw adaptation, search, summarization, batch code assistance, and routine agent traffic. DeepSeek API Docs
DeepSeek V4 Pro Frontier	$1.74	$0.145	$3.48	1M	Official V4 Pro rate: $0.145 cache hit, $1.74 cache miss input, $3.48 output per 1M tokens. Use it as an escalation route, not the default story. DeepSeek API Docs
OpenAI GPT-5.4 Frontier	$2.5	$0.25	$15	Short / long tiers	Strong closed-model quality baseline. Use it when you need broad ecosystem compatibility or client-requested GPT output. OpenAI Pricing
Anthropic Claude Opus 4.7 / 4.6 Enterprise	$5	$0.5	$25	1M	Enterprise-grade review, analysis, and code-agent reference route. Expensive enough to reserve for high-value tasks. Anthropic Pricing
Anthropic Claude Sonnet 4.6 Enterprise	$3	$0.3	$15	1M	Claude's more practical default route when a team wants Anthropic behavior without Opus-level spend. Anthropic Pricing
Google Gemini 3.1 Pro Preview Multimodal	$2	$0.2	$12	1M	$12/M is Gemini output pricing, not DeepSeek V4 Pro pricing. Best used as a multimodal and Google-ecosystem comparison point. Google AI Pricing
xAI Grok 4.20 Realtime	$2	N/A	$6	2M	Useful reference for realtime, agentic, and large-context Grok workflows. xAI pricing tables are dynamic, so verify in the xAI console before launch. xAI Models and Pricing

Cost simulator

Monthly token budget simulator

ModelInput tokens / MOutput tokens / MCache hit

Selected model

$16.24

V4 Flash route

$16.24

Flash saving

Budget routes

Three practical monthly budget scenarios

These examples compare Flash and Pro with fixed token volumes. Real systems also need retry cost, tool-call cost, safety filters, and cache hit assumptions.

Search and summary workload

80M input / 12M output / 45% cache hit

V4 Flash

$10.53

V4 Pro

$123.54

Code-agent monthly traffic

300M input / 90M output / 25% cache hit

V4 Flash

$58.8

V4 Pro

$715.57

Enterprise long-context knowledge base

900M input / 160M output / 60% cache hit

V4 Flash

$110.32

V4 Pro

$1,261.5

Recommended routing policy

Use V4 Flash for 70% to 90% of routine traffic. Upgrade hard, high-value, low-frequency requests to V4 Pro. OpenClaw agent traffic should start on Flash, with GPT, Claude, Gemini, or Grok kept as explicit comparison or fallback routes.

Pricing pages are not contracts

Provider pricing can change. Before production launch, re-check official pricing, rate limits, region availability, and model aliases, then keep those values configurable.