Pricing intelligence

DeepSeek V4 Flash pricing is the lead story

DeepSeek prices are listed per 1M tokens in USD. Start with V4 Flash: $0.14 input, $0.028 cache-hit input, and $0.28 output. V4 Pro remains documented as an escalation route at $0.145 cache hit, $1.74 input, and $3.48 output. The $12/M output value belongs to Gemini 3.1 Pro Preview, not DeepSeek V4 Pro.

ModelInput / 1MCached inputOutput / 1MContextPositioning
DeepSeek V4 Flash
Cost baseline
$0.14$0.028$0.281M

Primary DSFlashHub route for OpenClaw adaptation, search, summarization, batch code assistance, and routine agent traffic.

DeepSeek API Docs

DeepSeek V4 Pro
Frontier
$1.74$0.145$3.481M

Official V4 Pro rate: $0.145 cache hit, $1.74 cache miss input, $3.48 output per 1M tokens. Use it as an escalation route, not the default story.

DeepSeek API Docs

OpenAI GPT-5.4
Frontier
$2.5$0.25$15Short / long tiers

Strong closed-model quality baseline. Use it when you need broad ecosystem compatibility or client-requested GPT output.

OpenAI Pricing

Anthropic Claude Opus 4.7 / 4.6
Enterprise
$5$0.5$251M

Enterprise-grade review, analysis, and code-agent reference route. Expensive enough to reserve for high-value tasks.

Anthropic Pricing

Anthropic Claude Sonnet 4.6
Enterprise
$3$0.3$151M

Claude's more practical default route when a team wants Anthropic behavior without Opus-level spend.

Anthropic Pricing

Google Gemini 3.1 Pro Preview
Multimodal
$2$0.2$121M

$12/M is Gemini output pricing, not DeepSeek V4 Pro pricing. Best used as a multimodal and Google-ecosystem comparison point.

Google AI Pricing

xAI Grok 4.20
Realtime
$2N/A$62M

Useful reference for realtime, agentic, and large-context Grok workflows. xAI pricing tables are dynamic, so verify in the xAI console before launch.

xAI Models and Pricing

Cost simulator

Monthly token budget simulator

Selected model

$16.24

V4 Flash route

$16.24

Flash saving

0%

Budget routes

Three practical monthly budget scenarios

These examples compare Flash and Pro with fixed token volumes. Real systems also need retry cost, tool-call cost, safety filters, and cache hit assumptions.

Search and summary workload

80M input / 12M output / 45% cache hit

V4 Flash

$10.53

V4 Pro

$123.54

Code-agent monthly traffic

300M input / 90M output / 25% cache hit

V4 Flash

$58.8

V4 Pro

$715.57

Enterprise long-context knowledge base

900M input / 160M output / 60% cache hit

V4 Flash

$110.32

V4 Pro

$1,261.5

Recommended routing policy

Use V4 Flash for 70% to 90% of routine traffic. Upgrade hard, high-value, low-frequency requests to V4 Pro. OpenClaw agent traffic should start on Flash, with GPT, Claude, Gemini, or Grok kept as explicit comparison or fallback routes.

Pricing pages are not contracts

Provider pricing can change. Before production launch, re-check official pricing, rate limits, region availability, and model aliases, then keep those values configurable.