API routing

DeepSeek V4 Flash API routing playbook

Route by job type, not by brand preference. V4 Flash handles default OpenClaw and API traffic; V4 Pro and other providers enter only when a clear trigger is met.

Routing table

Default model, escalation, and metric

The table below turns the Flash-first policy into concrete defaults, escalation triggers, and metrics for each task type.

Task | Default | Escalation | Metric
OpenClaw routine agent turns | DeepSeek V4 Flash | V4 Pro after repeated failed checks | Task completion cost, latency, retry rate
Search and retrieval answers | DeepSeek V4 Flash | Claude/GPT sample audit for regulated pages | Citation quality, hallucination rate, cache hit ratio
Code explanation and batch fixes | DeepSeek V4 Flash | V4 Pro for complex debugging plans | Accepted patch rate, follow-up turns, token cost
Long-form strategy or policy review | V4 Pro or Claude/GPT audit | Human review | Review accuracy, missed risks, edit distance
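
As a rough illustration, the rows above could be held as a lookup table in code. This is a minimal sketch, not an OpenClaw or DeepSeek API: the task keys, the `Route` type, and the `route_for` helper are hypothetical names, and falling back to the routine Flash row for unknown task types is an assumption based on Flash being the stated default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    default: str              # model or provider used first
    escalation: str           # what happens when the default is not enough
    metrics: tuple[str, ...]  # what to measure for this task type


# Hypothetical task keys; the values mirror the routing table rows.
ROUTING_TABLE = {
    "openclaw_routine": Route(
        "DeepSeek V4 Flash",
        "V4 Pro after repeated failed checks",
        ("task completion cost", "latency", "retry rate"),
    ),
    "search_retrieval": Route(
        "DeepSeek V4 Flash",
        "Claude/GPT sample audit for regulated pages",
        ("citation quality", "hallucination rate", "cache hit ratio"),
    ),
    "code_batch_fixes": Route(
        "DeepSeek V4 Flash",
        "V4 Pro for complex debugging plans",
        ("accepted patch rate", "follow-up turns", "token cost"),
    ),
    "long_form_review": Route(
        "V4 Pro or Claude/GPT audit",
        "Human review",
        ("review accuracy", "missed risks", "edit distance"),
    ),
}


def route_for(task: str) -> Route:
    """Return the route for a task type.

    Assumption: unknown task types use the routine Flash row.
    """
    return ROUTING_TABLE.get(task, ROUTING_TABLE["openclaw_routine"])
```

Keeping the metrics next to each route makes it harder to add a new task type without also deciding how it will be measured.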

OpenClaw

Adapter-aware routing rules

OpenClaw needs dedicated model-routing rules because agent traffic repeats context, tools, and planning scaffolds.

Default model route

Map routine OpenClaw chat, planning, and tool narration to deepseek-v4-flash.

This keeps high-frequency agent turns on the lowest-cost DeepSeek V4 route while staying within the same model family.

Escalation policy

Escalate to deepseek-v4-pro only for long reasoning, failed self-checks, or high-risk review.

Pro should improve difficult tasks without turning every OpenClaw request into a premium request.
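
The default and escalation rules can be sketched as a single decision function. This is illustrative only: `TurnContext`, `choose_model`, and the threshold of two failed self-checks are assumptions, not values from OpenClaw or DeepSeek. Only the two model identifiers come from the text above.

```python
from dataclasses import dataclass

FLASH = "deepseek-v4-flash"
PRO = "deepseek-v4-pro"


@dataclass
class TurnContext:
    """Hypothetical per-turn signals an agent runtime might track."""
    failed_self_checks: int = 0
    needs_long_reasoning: bool = False
    high_risk_review: bool = False


def choose_model(ctx: TurnContext, max_failed_checks: int = 2) -> str:
    """Stay on Flash unless one of the three escalation triggers fires.

    max_failed_checks is an assumed threshold; tune it against retry-rate data.
    """
    if ctx.needs_long_reasoning or ctx.high_risk_review:
        return PRO
    if ctx.failed_self_checks >= max_failed_checks:
        return PRO
    return FLASH
```

For example, `choose_model(TurnContext(failed_self_checks=1))` stays on Flash, while a second failed check or a high-risk flag moves the turn to Pro. Because escalation is decided per turn, a later routine turn returns to Flash.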

Prompt cache design

Keep OpenClaw system prompts, tool schemas, and retrieval wrappers stable across runs.

Stable prompt scaffolding lets repeated agent workflows reuse the same cached prompt prefix, which is where the cache-hit savings come from.
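
One way to keep the scaffold stable is to serialize it deterministically and put per-request content last. The sketch below assumes that caching keys on the leading portion of the prompt and that each tool schema is a dict with a `name` field. `build_prefix`, `prefix_fingerprint`, and `build_prompt` are hypothetical helpers, not part of any OpenClaw or DeepSeek SDK.

```python
import hashlib
import json


def build_prefix(system_prompt: str, tool_schemas: list[dict]) -> str:
    """Serialize the stable scaffold so it is byte-identical across runs."""
    # Sort tools by name and sort keys so ordering differences do not change the bytes.
    tools = sorted(tool_schemas, key=lambda t: t["name"])
    tools_json = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return f"{system_prompt.strip()}\n\nTOOLS:\n{tools_json}"


def prefix_fingerprint(prefix: str) -> str:
    """Short hash to log per request; a changing value signals a broken cache prefix."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()[:16]


def build_prompt(prefix: str, user_turn: str) -> str:
    """Put the variable content after the stable prefix, never before it."""
    return f"{prefix}\n\n{user_turn}"
```

Logging the fingerprint alongside the cache hit ratio makes it easy to spot a deployment that accidentally changed the system prompt or tool order.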

Comparison fallback

Keep GPT, Claude, Gemini, and Grok as explicit audit or customer-required routes.

The comparison layer should support routing decisions without weakening the Flash-first message.
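
A sketch of what "explicit" could mean in practice: a comparison provider is returned only when a customer requires it or when a regulated request falls into an audit sample, and otherwise the caller stays on its DeepSeek route. The function names, the provider labels, and the 5% sample rate are assumptions for illustration.

```python
import hashlib
from typing import Optional


def in_audit_sample(request_id: str, rate: float = 0.05) -> bool:
    """Deterministically place a request in the audit sample by hashing its ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big")  # integer in [0, 2**32)
    return bucket < rate * 2**32


def comparison_route(
    request_id: str,
    customer_required: Optional[str] = None,
    regulated: bool = False,
    audit_provider: str = "claude",  # hypothetical label for the audit route
    rate: float = 0.05,              # assumed sample rate
) -> Optional[str]:
    """Return a comparison provider only for explicit triggers, else None."""
    if customer_required is not None:
        return customer_required
    if regulated and in_audit_sample(request_id, rate):
        return audit_provider
    return None  # caller keeps the Flash-first route
```

Hashing the request ID rather than calling a random generator means the same request always gets the same audit decision, which keeps retries and replays consistent.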