Flash first

DeepSeek V4 Flash is the main route

V4 Flash should carry the DSFlashHub story: low input cost, very low cache-hit input cost, 1M context, and a clean default path for OpenClaw agent workflows. V4 Pro stays available, but only as an escalation model.

Price signal

Flash economics are the headline

The pricing story should start with V4 Flash: $0.14/M input, $0.028/M cache-hit input, and $0.28/M output.
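Those three rates can be turned into a per-request cost estimate. A minimal sketch, using only the V4 Flash figures quoted above (function and parameter names are ours, not from any DeepSeek SDK):

```python
# Per-request cost at the V4 Flash rates quoted above:
# $0.14/M fresh input, $0.028/M cache-hit input, $0.28/M output.
FLASH_RATES = {"input": 0.14, "cache_hit": 0.028, "output": 0.28}

def request_cost_usd(input_tokens, cached_tokens, output_tokens,
                     rates=FLASH_RATES):
    """Cost in USD for one request; cached_tokens is the cache-hit share of input."""
    fresh = input_tokens - cached_tokens
    return (fresh * rates["input"]
            + cached_tokens * rates["cache_hit"]
            + output_tokens * rates["output"]) / 1_000_000

# An 8K-token prompt with a 6K cached prefix and a 1K reply:
print(round(request_cost_usd(8_000, 6_000, 1_000), 6))  # 0.000728
```

At these rates a typical cached agent turn lands well under a tenth of a cent, which is the number the pricing story should lead with.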

Flash is the primary route

$0.14 / $0.28

Use V4 Flash as the default model for OpenClaw agent traffic, search, summaries, retrieval responses, and high-volume API jobs.

Cache hits matter

$0.028

Cache-hit input pricing makes repeated OpenClaw planning context and retrieval scaffolds much cheaper when prompts are structured consistently.
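The savings depend on keeping the stable scaffold as an identical prefix on every call and appending only the per-task suffix. A sketch of the arithmetic, using the Flash rates above (the prefix/suffix split is an assumption about how OpenClaw prompts would be laid out, not a documented requirement):

```python
# Stable prefix (planning scaffold) billed at the cache-hit rate once warm;
# only the per-task suffix pays the full input rate.
FULL_RATE, CACHE_RATE = 0.14, 0.028  # USD per 1M input tokens

def input_cost(prefix_tokens, suffix_tokens, prefix_cached):
    prefix_rate = CACHE_RATE if prefix_cached else FULL_RATE
    return (prefix_tokens * prefix_rate + suffix_tokens * FULL_RATE) / 1_000_000

cold = input_cost(20_000, 2_000, prefix_cached=False)
warm = input_cost(20_000, 2_000, prefix_cached=True)
print(f"cold ${cold:.6f}  warm ${warm:.6f}  saving {1 - warm/cold:.0%}")
```

With a 20K scaffold and a 2K suffix, warm calls cut input cost by roughly 73%, which is why consistent prompt structure is worth enforcing.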

Pro is escalation only

$1.74 / $3.48

V4 Pro remains useful, but DSFlashHub should present it as a hard-task fallback rather than the lead product route.

Model details

What the Flash page should emphasize

Keep the model specification factual, but make the editorial emphasis clear: Flash is the default, Pro is the exception.

API model IDs

deepseek-v4-flash / deepseek-v4-pro

Use deepseek-v4-flash as the default model ID in the DSFlashHub content stack; reserve deepseek-v4-pro for explicit escalation.
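A sketch of how the content stack could hard-code that default. Earlier DeepSeek APIs were OpenAI-compatible chat-completions endpoints; this payload builder assumes V4 keeps that shape, and only the two model IDs come from the doc:

```python
# Payload builder: default to deepseek-v4-flash, swap in deepseek-v4-pro
# only on explicit escalation. The chat-completions payload shape is an
# assumption carried over from earlier OpenAI-compatible DeepSeek APIs.
DEFAULT_MODEL = "deepseek-v4-flash"
ESCALATION_MODEL = "deepseek-v4-pro"

def build_chat_request(messages, escalate=False):
    """Return a chat-completions payload; escalation swaps in V4 Pro."""
    return {
        "model": ESCALATION_MODEL if escalate else DEFAULT_MODEL,
        "messages": messages,
    }

req = build_chat_request([{"role": "user", "content": "Summarize this page."}])
print(req["model"])  # deepseek-v4-flash
```

Keeping the escalation switch in one place makes the "Flash by default" policy auditable in code review.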

Context length

1M tokens

V4 Flash and V4 Pro both publish a 1M-token context window.

Max output

384K tokens

The model card lists a 384K maximum output length.
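The two published limits combine into a simple pre-flight budget check. Whether requested output counts against the 1M context window is an assumption here; the official docs should settle the exact accounting:

```python
# Budget guard from the published limits above: 1M-token context window,
# 384K maximum output. Counting output against the window is an assumption.
CONTEXT_LIMIT = 1_000_000
MAX_OUTPUT = 384_000

def fits(input_tokens, requested_output):
    if requested_output > MAX_OUTPUT:
        return False  # exceeds the model card's max output length
    return input_tokens + requested_output <= CONTEXT_LIMIT

print(fits(600_000, 384_000))  # True
print(fits(700_000, 384_000))  # False: would exceed the 1M window
```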

Reasoning modes

Non-think / Think / Think Max

Both Flash and Pro expose non-thinking and thinking modes; Think Max is reserved for the hardest reasoning cases.
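A routing sketch for the three modes. Only the three-way split comes from the model card; the string values and the difficulty labels are placeholders, not confirmed API parameters:

```python
# Mode picker for the three published reasoning modes. The mode strings
# ("non-think", "think", "think-max") are placeholder values, not a
# confirmed DeepSeek API parameter.
def pick_mode(difficulty):
    return {
        "routine": "non-think",  # default: cheapest, lowest latency
        "hard": "think",         # multi-step reasoning
        "hardest": "think-max",  # reserved for the hardest cases
    }[difficulty]

print(pick_mode("routine"))  # non-think
```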

Open weights

MIT licensed weights

The Hugging Face model card lists open weights and links to the V4 technical report.

Architecture

MoE + hybrid attention

V4 Pro is listed as 1.6T total / 49B active; V4 Flash is listed as 284B total / 13B active.
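Those MoE figures are worth spelling out, since the small active fraction is the architectural basis for Flash's pricing:

```python
# Active-parameter fraction implied by the listed MoE sizes: only a few
# percent of total weights fire per token.
def active_fraction(total_b, active_b):
    return active_b / total_b

print(f"Flash: {active_fraction(284, 13):.1%}")   # ~4.6% of 284B active
print(f"Pro:   {active_fraction(1600, 49):.1%}")  # ~3.1% of 1.6T active
```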

Routing

Default V4 Flash routes

The Flash routing row should read: Primary DSFlashHub route for OpenClaw adaptation, search, summarization, batch code assistance, and routine agent traffic.

| Task | Default model | Escalation | Metric |
| --- | --- | --- | --- |
| OpenClaw routine agent turns | DeepSeek V4 Flash | V4 Pro after repeated failed checks | Task completion cost, latency, retry rate |
| Search and retrieval answers | DeepSeek V4 Flash | Claude/GPT sample audit for regulated pages | Citation quality, hallucination rate, cache hit ratio |
| Code explanation and batch fixes | DeepSeek V4 Flash | V4 Pro for complex debugging plans | Accepted patch rate, follow-up turns, token cost |
| Long-form strategy or policy review | V4 Pro or Claude/GPT audit | Human review | Review accuracy, missed risks, edit distance |
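The routing policy in the table above reduces to a small function: default everything to V4 Flash, send long-form review straight to Pro, and escalate otherwise only after repeated failed checks. The failure threshold of 2 is an assumption for illustration, not a published number:

```python
# Router sketch for the table above. FAIL_THRESHOLD is an assumed value.
FAIL_THRESHOLD = 2

def route(task, failed_checks=0):
    if task == "long_form_review":
        return "deepseek-v4-pro"   # Pro-first row in the table
    if failed_checks >= FAIL_THRESHOLD:
        return "deepseek-v4-pro"   # escalation after repeated failed checks
    return "deepseek-v4-flash"     # default route for everything else

print(route("agent_turn"))                   # deepseek-v4-flash
print(route("agent_turn", failed_checks=2))  # deepseek-v4-pro
```

Tracking the table's metrics (retry rate, token cost) per route is what justifies or tunes the threshold over time.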

Full table

Flash still sits inside the broader comparison

The site can compare GPT, Claude, Gemini, and Grok, but the DeepSeek column should lead with V4 Flash.

| Model | Tier | Input / 1M | Cached input | Output / 1M | Context | Positioning | Source |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DeepSeek V4 Flash | Cost baseline | $0.14 | $0.028 | $0.28 | 1M | Primary DSFlashHub route for OpenClaw adaptation, search, summarization, batch code assistance, and routine agent traffic. | DeepSeek API Docs |
| DeepSeek V4 Pro | Frontier | $1.74 | $0.145 | $3.48 | 1M | Official V4 Pro rate: $0.145/M cache-hit input, $1.74/M cache-miss input, $3.48/M output. Present it as an escalation route, not the default story. | DeepSeek API Docs |
| OpenAI GPT-5.4 | Frontier | $2.50 | $0.25 | $15 | Short / long tiers | Strong closed-model quality baseline. Use it when you need broad ecosystem compatibility or client-requested GPT output. | OpenAI Pricing |
| Anthropic Claude Opus 4.7 / 4.6 | Enterprise | $5 | $0.50 | $25 | 1M | Enterprise-grade review, analysis, and code-agent reference route. Expensive enough to reserve for high-value tasks. | Anthropic Pricing |
| Anthropic Claude Sonnet 4.6 | Enterprise | $3 | $0.30 | $15 | 1M | Claude's more practical default route when a team wants Anthropic behavior without Opus-level spend. | Anthropic Pricing |
| Google Gemini 3.1 Pro Preview | Multimodal | $2 | $0.20 | $12 | 1M | Best used as a multimodal and Google-ecosystem comparison point. Note: $12/M is Gemini's output price, not a DeepSeek V4 Pro figure. | Google AI Pricing |
| xAI Grok 4.20 | Realtime | $2 | N/A | $6 | 2M | Useful reference for realtime, agentic, and large-context Grok workflows. xAI pricing tables are dynamic, so verify in the xAI console before launch. | xAI Models and Pricing |
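To make the "lead with Flash" framing concrete, the table's rates collapse into a single cost multiple per model. The 3:1 input:output token mix is an assumed workload shape, and cache hits are ignored for simplicity:

```python
# Blended cost multiple vs V4 Flash, using the table's input/output rates.
# Assumes a 3:1 input:output token mix and ignores cache-hit discounts.
RATES = {  # (input $/M, output $/M) from the comparison table above
    "deepseek-v4-flash": (0.14, 0.28),
    "deepseek-v4-pro": (1.74, 3.48),
    "gpt-5.4": (2.50, 15.0),
    "claude-opus-4.7": (5.0, 25.0),
}

def blended(model, in_share=0.75):
    i, o = RATES[model]
    return i * in_share + o * (1 - in_share)

flash = blended("deepseek-v4-flash")
for m in RATES:
    print(f"{m}: {blended(m) / flash:.1f}x")
```

Under this mix, V4 Pro lands around 12x Flash and the closed frontier models at 30x to 57x, which is the spread the DSFlashHub comparison page should surface.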