Search and summary workload
80M input / 12M output / 45% cache hit
V4 Flash
$10.53
V4 Pro
$123.54
Pricing intelligence
DeepSeek prices are listed per 1M tokens in USD. Start with V4 Flash: $0.14 input, $0.028 cache-hit input, and $0.28 output. V4 Pro remains documented as an escalation route at $0.145 cache hit, $1.74 input, and $3.48 output. The $12/M output value belongs to Gemini 3.1 Pro Preview, not DeepSeek V4 Pro.
| Model | Input / 1M | Cached input | Output / 1M | Context | Positioning |
|---|---|---|---|---|---|
DeepSeek V4 Flash Cost baseline | $0.14 | $0.028 | $0.28 | 1M | Primary DSFlashHub route for OpenClaw adaptation, search, summarization, batch code assistance, and routine agent traffic. DeepSeek API Docs |
DeepSeek V4 Pro Frontier | $1.74 | $0.145 | $3.48 | 1M | Official V4 Pro rate: $0.145 cache hit, $1.74 cache miss input, $3.48 output per 1M tokens. Use it as an escalation route, not the default story. DeepSeek API Docs |
OpenAI GPT-5.4 Frontier | $2.5 | $0.25 | $15 | Short / long tiers | Strong closed-model quality baseline. Use it when you need broad ecosystem compatibility or client-requested GPT output. OpenAI Pricing |
Anthropic Claude Opus 4.7 / 4.6 Enterprise | $5 | $0.5 | $25 | 1M | Enterprise-grade review, analysis, and code-agent reference route. Expensive enough to reserve for high-value tasks. Anthropic Pricing |
Anthropic Claude Sonnet 4.6 Enterprise | $3 | $0.3 | $15 | 1M | Claude's more practical default route when a team wants Anthropic behavior without Opus-level spend. Anthropic Pricing |
Google Gemini 3.1 Pro Preview Multimodal | $2 | $0.2 | $12 | 1M | $12/M is Gemini output pricing, not DeepSeek V4 Pro pricing. Best used as a multimodal and Google-ecosystem comparison point. Google AI Pricing |
xAI Grok 4.20 Realtime | $2 | N/A | $6 | 2M | Useful reference for realtime, agentic, and large-context Grok workflows. xAI pricing tables are dynamic, so verify in the xAI console before launch. xAI Models and Pricing |
Cost simulator
Selected model
$16.24
V4 Flash route
$16.24
Flash saving
0%
Budget routes
These examples compare Flash and Pro with fixed token volumes. Real systems also need retry cost, tool-call cost, safety filters, and cache hit assumptions.
80M input / 12M output / 45% cache hit
V4 Flash
$10.53
V4 Pro
$123.54
300M input / 90M output / 25% cache hit
V4 Flash
$58.8
V4 Pro
$715.57
900M input / 160M output / 60% cache hit
V4 Flash
$110.32
V4 Pro
$1,261.5
Use V4 Flash for 70% to 90% of routine traffic. Upgrade hard, high-value, low-frequency requests to V4 Pro. OpenClaw agent traffic should start on Flash, with GPT, Claude, Gemini, or Grok kept as explicit comparison or fallback routes.
Provider pricing can change. Before production launch, re-check official pricing, rate limits, region availability, and model aliases, then keep those values configurable.