API routing

DeepSeek V4 Flash API routing playbook

Route by job type, not by brand preference. V4 Flash handles default OpenClaw and API traffic; V4 Pro and other providers enter only when a clear trigger is met.

Routing table

Default model, escalation, and metric

This page adds a crawlable operational table that turns the Flash-first story into a concrete policy.

Task	Default	Escalation	Metric
OpenClaw routine agent turns	DeepSeek V4 Flash	V4 Pro after repeated failed checks	Task completion cost, latency, retry rate
Search and retrieval answers	DeepSeek V4 Flash	Claude/GPT sample audit for regulated pages	Citation quality, hallucination rate, cache hit ratio
Code explanation and batch fixes	DeepSeek V4 Flash	V4 Pro for complex debugging plans	Accepted patch rate, follow-up turns, token cost
Long-form strategy or policy review	V4 Pro or Claude/GPT audit	Human review	Review accuracy, missed risks, edit distance

OpenClaw

Adapter-aware routing rules

OpenClaw needs dedicated model-routing language because agent traffic repeats context, tools, and planning scaffolds.

Default model route

Map routine OpenClaw chat, planning, and tool narration to deepseek-v4-flash.

This keeps high-frequency agent turns on the lowest DeepSeek V4 route while preserving the same model-family positioning.

Escalation policy

Escalate to deepseek-v4-pro only for long reasoning, failed self-checks, or high-risk review.

Pro should improve difficult tasks without turning every OpenClaw request into a premium request.

Prompt cache design

Keep OpenClaw system prompts, tool schemas, and retrieval wrappers stable across runs.

Stable prompt scaffolding is the direct path to cache-hit economics on repeated agent workflows.

Comparison fallback

Keep GPT, Claude, Gemini, and Grok as explicit audit or customer-required routes.

The comparison layer should support routing decisions without weakening the Flash-first message.