Model Fallback Chain
Cross-cuttingThe Problem
Hardcoding a single model creates a single point of failure. Rate limits, outages, and cost overruns will halt your pipeline. In production, you need resilience: if the primary model can't handle a request, route to a fallback transparently.
The Pattern
Task submitted
│
▼
Primary model available?
│
┌───┴────┐
▼ ▼
Yes No (rate limit / outage / over budget)
│ │
▼ ▼
Execute Try fallback #1
│
┌───┴────┐
▼ ▼
Works No
│ │
▼ ▼
Execute Try fallback #2 (or fail gracefully)
When to Use
- Any production task where availability matters
- High-cost tasks where you want to downgrade model if budget is tight
- Multi-provider setups where you want provider redundancy
When NOT to Use
- Development/testing where you want deterministic model selection
- Tasks where output quality differs so significantly between models that a fallback would be misleading
Code Example
import { QueueAI } from "@queueai/sdk"
const queue = new QueueAI({ apiKey: process.env.QUEUEAI_API_KEY })
const task = await queue.submit({
name: "deep-research",
input: { question: "Summarize recent advances in vector databases." },
model: {
primary: "anthropic/claude-opus-4-8",
fallback: "anthropic/claude-sonnet-4-6",
},
budget: { maxCostUsd: 1.00 },
})
from queueai import QueueAI
queue = QueueAI(api_key=os.environ["QUEUEAI_API_KEY"])
task = queue.submit(
name="deep-research",
input={"question": "Summarize recent advances in vector databases."},
model={
"primary": "anthropic/claude-opus-4-8",
"fallback": "anthropic/claude-sonnet-4-6",
},
budget={"max_cost_usd": 1.00},
)
See it in action: The
model.fallbackfield is available on every QueueAI task.