Model Router¶
Don't memorize CLI commands for each model. Teach your agent to route tasks to the right model automatically.
The Model Router is an intelligent dispatch layer for multi-model Agent Teams. It maintains a registry of model capabilities, matches sub-tasks to the best model, generates CLI commands in the correct protocol (claude/codex/gemini), and learns from dispatch history via the perception feedback loop.
How It Works¶
graph LR
A[Sub-task] --> B[Model Router]
B --> C{Match Task Type}
C -->|code-review| D[Claude Opus]
C -->|implementation| E[DeepSeek-V4]
C -->|browser| F[GPT-5.5]
C -->|research| G[Gemini-3-Pro]
C -->|general| H[GPT-5.5]
D --> I[CLI Command]
E --> I
F --> I
G --> I
H --> I
I --> J[Execute & Record Outcome]
- Analyze — the agent reads the sub-task and matches it to a task type (code review, implementation, research, etc.)
- Route — the Model Router selects the primary model by capability match, with a cost-ascending fallback chain
- Dispatch — a CLI command is generated in the correct protocol based on model provider
- Learn — dispatch outcomes are recorded to the perception layer; future routing considers historical success rates
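The Analyze step can be pictured as a simple keyword match. This is only a minimal sketch: the keyword table and the `detect_task_type` helper are hypothetical, and the real matcher in `scripts/lib/model-router.mjs` may work quite differently.

```python
# Hypothetical keyword table (an assumption for illustration only).
# Order matters: more specific task types are checked first.
TASK_KEYWORDS = {
    "security-review": ("security", "vulnerability", "exploit"),
    "code-review": ("review", "audit"),
    "implementation": ("implement", "refactor", "build"),
    "research": ("research", "survey", "compare"),
}


def detect_task_type(description: str) -> str:
    """Return the first task type whose keyword appears in the description."""
    text = description.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(word in text for word in keywords):
            return task_type
    return "general"  # catch-all task type from the routing rules


print(detect_task_type("Refactor the database connection"))
```

A description that matches no keyword falls through to `general`, which mirrors the catch-all row in the routing rules.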
Agent Team Runtime¶
Both `aios team` and `aios orchestrate --dispatch local --execute live` now apply the Model Router per phase by default, instead of using one outer worker client for every role.
- Phase jobs expose `launchSpec.requiresModel=true` and `launchSpec.modelRouting` with `role`, `taskType`, `modelId`, `provider`, `clientId`, `cliCommand`, `reason`, and `fallback`.
- Merge gates stay deterministic control jobs with `requiresModel=false`.
- Live subagent and GroupChat workers switch to the routed CLI client (`codex-cli`, `claude-code`, or `gemini-cli`) and append the correct model argument for that protocol.
- Worker prompts include a `## Model Router` section so the selected model/protocol is visible in prompt logs and handoffs.
- Each phase or speaker writes a ContextDB `kind=model.dispatch` event with `turn.environment=model-router`; refs include the routed model, task type, and role for `model-router stats`.
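To make the launch spec fields listed above concrete, here is a rough sketch of how a phase job and a merge gate might differ. The field names come from the list; the value types, the `ModelRouting` class, the helper functions, and the example values (including the `glm-5.1` identifier) are assumptions, not the actual schema.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ModelRouting:
    """Field names follow the docs; the types are assumed for illustration."""
    role: str
    taskType: str
    modelId: str
    provider: str
    clientId: str
    cliCommand: str
    reason: str
    fallback: list = field(default_factory=list)


def phase_launch_spec(routing: ModelRouting) -> dict:
    # Phase jobs need a model, so they carry the routing metadata.
    return {"requiresModel": True, "modelRouting": asdict(routing)}


def merge_gate_launch_spec() -> dict:
    # Merge gates are deterministic control jobs and carry no routing.
    return {"requiresModel": False}


example = ModelRouting(
    role="reviewer",
    taskType="code-review",
    modelId="claude-opus",
    provider="claude",
    clientId="claude-code",
    cliCommand='claude --model claude-opus -p "<prompt>"',
    reason="capability match",      # hypothetical reason string
    fallback=["gpt-5.5", "glm-5.1"],  # hypothetical model ids
)
print(phase_launch_spec(example)["modelRouting"]["clientId"])
```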
Disable execution-time CLI switching when you need a fixed worker client:
AIOS_MODEL_ROUTER=0 aios team "implement the feature"
# also accepted: false, off, no
Dry-runs still include planned routing metadata where safe, so previews remain auditable without invoking models.
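The disable switch described above could be interpreted roughly as follows. This is a sketch under assumptions: the `router_enabled` helper is hypothetical, and treating values case-insensitively and ignoring surrounding whitespace are guesses rather than documented behavior.

```python
import os

# Values the docs list as disabling execution-time CLI switching.
DISABLED_VALUES = {"0", "false", "off", "no"}


def router_enabled(env=None) -> bool:
    """Return False only when AIOS_MODEL_ROUTER holds a disabling value."""
    env = os.environ if env is None else env
    value = env.get("AIOS_MODEL_ROUTER", "")
    # Case-insensitive comparison is an assumption, not documented behavior.
    return value.strip().lower() not in DISABLED_VALUES


print(router_enabled({"AIOS_MODEL_ROUTER": "off"}))
```

An unset variable leaves the router enabled, matching the "by default" behavior described for Agent Team runs.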
Model Capability Registry¶
The registry (memory/specs/model-registry.json) defines 8 models with structured capabilities:
| Model | Provider | Strengths | Cost |
|---|---|---|---|
| Claude Opus 4.7 | claude | Code review, architecture, security audit | Highest |
| Claude Sonnet 4.6 | claude | Daily dev, RAG, rapid prototyping | Medium |
| GPT-5.5 | codex | All-rounder: automation, reasoning, code execution | Highest |
| DeepSeek-V4-Pro | claude | Algorithm, core logic, batch processing | Lowest |
| GLM-5.1 | claude | Math reasoning, autonomous loops, planning | Low |
| Kimi K2.6 | claude | Multi-agent orchestration, frontend UI, long execution | Low |
| MiniMax-M2.7 | claude | Self-healing, production recovery | Low |
| Gemini-3-Pro | gemini | Multimodal analysis, long-doc research, 1M context | Medium |
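As an illustration of how registry entries might be queried, the sketch below mirrors the table in a Python list. The entry shape, the `COST_RANK` ordering of tiers, and the helper names are assumptions; the actual schema of `memory/specs/model-registry.json` is not shown here.

```python
# Assumed ordering of the cost tiers used in the table above.
COST_RANK = {"Lowest": 0, "Low": 1, "Medium": 2, "Highest": 3}

# Hypothetical in-memory mirror of the registry table.
REGISTRY = [
    {"name": "Claude Opus 4.7", "provider": "claude", "cost": "Highest"},
    {"name": "Claude Sonnet 4.6", "provider": "claude", "cost": "Medium"},
    {"name": "GPT-5.5", "provider": "codex", "cost": "Highest"},
    {"name": "DeepSeek-V4-Pro", "provider": "claude", "cost": "Lowest"},
    {"name": "GLM-5.1", "provider": "claude", "cost": "Low"},
    {"name": "Kimi K2.6", "provider": "claude", "cost": "Low"},
    {"name": "MiniMax-M2.7", "provider": "claude", "cost": "Low"},
    {"name": "Gemini-3-Pro", "provider": "gemini", "cost": "Medium"},
]


def by_cost(models=REGISTRY):
    """Sort models from cheapest to most expensive (stable within a tier)."""
    return sorted(models, key=lambda m: COST_RANK[m["cost"]])


def cheapest(provider: str):
    """Return the cheapest model for a provider, or None if there is none."""
    candidates = [m for m in REGISTRY if m["provider"] == provider]
    return by_cost(candidates)[0] if candidates else None


print(cheapest("claude")["name"])
```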
CLI Protocol¶
Three protocols, automatically selected by provider:
| Protocol | CLI | Used By |
|---|---|---|
| codex | `codex --yolo -m <model> -p "<prompt>"` | GPT-5.5 |
| gemini | `gemini -m gemini-3-pro -p "<prompt>"` | Gemini-3-Pro |
| claude | `claude --model <model> -p "<prompt>"` | All other models |
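Protocol selection can be sketched as a small function that maps a provider to an argument list. The `build_cli_command` helper is hypothetical; the flags are copied from the table above, and routing every other provider to the claude protocol is an assumption based on the "All other models" row.

```python
import shlex


def build_cli_command(provider: str, model: str, prompt: str) -> list[str]:
    """Build an argv list for the protocol that matches the provider."""
    if provider == "codex":
        return ["codex", "--yolo", "-m", model, "-p", prompt]
    if provider == "gemini":
        return ["gemini", "-m", model, "-p", prompt]
    # Assumption: anything else speaks the claude protocol.
    return ["claude", "--model", model, "-p", prompt]


argv = build_cli_command("claude", "claude-opus", "review auth.js")
# shlex.join quotes the prompt so the string is safe to paste into a shell.
print(shlex.join(argv))
```

Returning an argument list rather than a single string keeps prompts with quotes or spaces from being split or reinterpreted by a shell.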
Routing Rules¶
| Task Type | Primary | Fallback Chain |
|---|---|---|
| code-review | Claude Opus | GPT-5.5 → GLM-5.1 |
| security-review | Claude Opus | GPT-5.5 → GLM-5.1 |
| architecture | Claude Opus | GPT-5.5 → GLM-5.1 |
| implementation | DeepSeek-V4 | GPT-5.5 → Claude Sonnet |
| browser-automation | GPT-5.5 | Kimi K2.6 → Claude Sonnet |
| research | Gemini-3-Pro | GPT-5.5 → Kimi K2.6 |
| planning | GLM-5.1 | GPT-5.5 → Claude Opus |
| testing | Claude Sonnet | GPT-5.5 → DeepSeek-V4 |
| docs | Claude Sonnet | GPT-5.5 → Kimi K2.6 |
| frontend | Kimi K2.6 | GPT-5.5 → Claude Sonnet |
| self-healing | MiniMax-M2.7 | GLM-5.1 → GPT-5.5 |
| general | GPT-5.5 | Claude Sonnet → DeepSeek-V4 |
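One way to read the routing table is as a primary model followed by an ordered list of fallbacks. The sketch below covers a few rows only. The `unavailable` set is a hypothetical stand-in for whatever condition triggers a fallback, which this page does not specify, and defaulting unknown task types to `general` is also an assumption.

```python
# A subset of the routing rules above: task type -> (primary, fallback chain).
ROUTING_RULES = {
    "code-review": ("Claude Opus", ["GPT-5.5", "GLM-5.1"]),
    "implementation": ("DeepSeek-V4", ["GPT-5.5", "Claude Sonnet"]),
    "research": ("Gemini-3-Pro", ["GPT-5.5", "Kimi K2.6"]),
    "general": ("GPT-5.5", ["Claude Sonnet", "DeepSeek-V4"]),
}


def select_model(task_type: str, unavailable=frozenset()):
    """Pick the primary model, walking the fallback chain if it is unavailable."""
    # Assumption: unknown task types are treated as "general".
    primary, fallbacks = ROUTING_RULES.get(task_type, ROUTING_RULES["general"])
    for candidate in [primary, *fallbacks]:
        if candidate not in unavailable:
            return candidate
    return None  # every model in the chain was ruled out


print(select_model("code-review", unavailable={"Claude Opus"}))
```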
Quick Start¶
View the Model Registry¶
node scripts/aios.mjs model-router list
Route a Task to the Best Model¶
# Auto-detect task type from description
node scripts/aios.mjs model-router route --task "Review auth.js for security vulnerabilities"
# Explicit task type
node scripts/aios.mjs model-router route --task "Refactor the database connection" --task-type implementation
View Dispatch Statistics¶
node scripts/aios.mjs model-router stats
Environment Variable Overrides¶
Override model selection per role without changing config files:
export AIOS_MODEL_PLANNER=claude-opus
export AIOS_MODEL_IMPLEMENTER=deepseek-v4
export AIOS_MODEL_REVIEWER=claude-opus
export AIOS_MODEL_SECURITY_REVIEWER=claude-opus
Disable live execution-time CLI switching while keeping routing metadata in previews/reports:
export AIOS_MODEL_ROUTER=0
Model selection can also be overridden by task type instead of by role:
export AIOS_MODEL_CODE_REVIEW=claude-opus
export AIOS_MODEL_RESEARCH=gemini-3-pro
export AIOS_MODEL_GENERAL=gpt-5.5
Agent Integration¶
Via Task Router Guide¶
The Model Router is injected into the agent's context via the AIOS Task Router. Any agent running through ctx-agent automatically receives model routing guidance. When dispatching sub-tasks, the agent can invoke the model-router skill to determine the optimal model.
Via Orchestrator¶
Agent Team dispatch resolves model routing from role defaults and environment overrides. The default role mapping is:
| Role | Task Type | Default Primary |
|---|---|---|
| planner | planning | GLM-5.1 |
| implementer | implementation | DeepSeek-V4 |
| reviewer | code-review | Claude Opus |
| security-reviewer | security-review | Claude Opus |
The effective model is resolved via: role env var > task-type env var > routing rule primary > fallback chain.
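The precedence order can be sketched as a chain of lookups. This is an illustration, not the orchestrator's code. The `env_key` naming rule is inferred from the variable names shown under Environment Variable Overrides, the `glm-5.1` identifier is an assumption, and the final fallback-chain step is left out for brevity.

```python
# Role -> task type, as in the default role mapping table.
ROLE_TASK_TYPES = {
    "planner": "planning",
    "implementer": "implementation",
    "reviewer": "code-review",
    "security-reviewer": "security-review",
}

# Routing rule primaries; "glm-5.1" is an assumed identifier.
PRIMARY = {
    "planning": "glm-5.1",
    "implementation": "deepseek-v4",
    "code-review": "claude-opus",
    "security-review": "claude-opus",
}


def env_key(name: str) -> str:
    """Map a role or task type to its override variable, e.g. code-review -> AIOS_MODEL_CODE_REVIEW."""
    return "AIOS_MODEL_" + name.upper().replace("-", "_")


def resolve_model(role: str, env: dict) -> str:
    """Apply: role env var > task-type env var > routing rule primary."""
    task_type = ROLE_TASK_TYPES[role]
    return env.get(env_key(role)) or env.get(env_key(task_type)) or PRIMARY[task_type]


print(resolve_model("reviewer", {"AIOS_MODEL_CODE_REVIEW": "gpt-5.5"}))
```

Passing the environment in as a plain dictionary keeps the function easy to test; a real caller would likely pass `os.environ`.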
Agent role cards (.claude/agents/*.md) may also include a preferredModel field for compatibility with older orchestrator flows:
# .claude/agents/rex-reviewer.md
model: sonnet
preferredModel: claude-opus
In live team execution, `launchSpec.modelRouting.clientId` determines the CLI protocol unless `AIOS_MODEL_ROUTER=0` disables the execution-time override.
Perception Feedback Loop¶
Every model dispatch is recorded as a model.dispatch event in ContextDB:
{
  "kind": "model.dispatch",
  "modelId": "claude-opus",
  "taskType": "code-review",
  "success": true,
  "latencyMs": 4500,
  "costEstimate": "high"
}
The perception system can compute model success rates per task type. Future routing decisions can weight: capability match × historical success rate × cost.
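Per-task success rates could be derived from such events roughly as shown below. The `success_rates` helper and the sample events are hypothetical. Only the event fields shown above are used, and how the real perception layer aggregates or weights them is not specified here.

```python
from collections import defaultdict


def success_rates(events):
    """Return {(modelId, taskType): success fraction} over model.dispatch events."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for event in events:
        if event.get("kind") != "model.dispatch":
            continue  # ignore unrelated ContextDB events
        key = (event["modelId"], event["taskType"])
        if event["success"]:
            counts[key][0] += 1
        counts[key][1] += 1
    return {key: ok / total for key, (ok, total) in counts.items()}


# Made-up sample events for illustration.
sample = [
    {"kind": "model.dispatch", "modelId": "claude-opus", "taskType": "code-review", "success": True},
    {"kind": "model.dispatch", "modelId": "claude-opus", "taskType": "code-review", "success": False},
    {"kind": "model.dispatch", "modelId": "gpt-5.5", "taskType": "code-review", "success": True},
    {"kind": "session.note", "text": "not a dispatch"},
]
print(success_rates(sample))
```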
Configuration Files¶
| File | Purpose |
|---|---|
| `memory/specs/model-registry.json` | Model capabilities, routing rules, CLI protocol config |
| `memory/specs/orchestrator-agents.json` | Agent role → preferredModel mapping (schema v2) |
| `.claude/skills/model-router/SKILL.md` | Agent-callable skill for self-service routing |
| `.claude/agents/*.md` | Agent role cards with preferredModel frontmatter |
| `scripts/lib/model-router.mjs` | Router logic: matching, fallback, CLI building, stats |