Model Fallback Chain

Cross-cutting

The Problem

Hardcoding a single model creates a single point of failure: a rate limit, an outage, or a cost overrun on that model halts your pipeline. In production, you need resilience. If the primary model can't handle a request, the request should be routed to a fallback transparently.

The Pattern

Task submitted
      │
      ▼
Primary model available?
      │
  ┌───┴────┐
  ▼        ▼
 Yes       No (rate limit / outage / over budget)
  │         │
  ▼         ▼
Execute   Try fallback #1
            │
        ┌───┴────┐
        ▼        ▼
      Works    No
        │       │
        ▼       ▼
     Execute  Try fallback #2 (or fail gracefully)
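
The decision flow above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not how QueueAI implements it: `ModelUnavailable`, `AllModelsFailed`, `run_with_fallback`, and the `call_model` callable are hypothetical names introduced here, and the model names in the demo are placeholders.

```python
class ModelUnavailable(Exception):
    """Hypothetical error for a rate limit, an outage, or an exceeded budget."""


class AllModelsFailed(Exception):
    """Raised when every model in the chain was unavailable."""

    def __init__(self, errors):
        detail = "; ".join(f"{model}: {exc}" for model, exc in errors)
        super().__init__(f"all models failed ({detail})")
        self.errors = errors


def run_with_fallback(task, models, call_model):
    """Try each model in order and return (model_used, result).

    `call_model(model, task)` is an assumed callable that either returns a
    result or raises ModelUnavailable. Any other exception propagates
    unchanged, since a fallback will not fix a malformed task.
    """
    errors = []
    for model in models:
        try:
            return model, call_model(model, task)
        except ModelUnavailable as exc:
            # Remember why this model was skipped, then move down the chain.
            errors.append((model, exc))
    # Chain exhausted: fail gracefully with the full list of reasons.
    raise AllModelsFailed(errors)


# Demo with a stand-in for a real model call: the primary is rate limited.
def fake_call(model, task):
    if model == "primary-model":
        raise ModelUnavailable("rate limited")
    return f"{model} handled: {task}"


used, result = run_with_fallback(
    "summarize", ["primary-model", "fallback-1", "fallback-2"], fake_call
)
print(used, result)
```

Returning the model that actually served the request lets the caller log or bill it correctly. Catching only `ModelUnavailable` keeps unrelated bugs from being silently retried on a different model.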

When to Use

When NOT to Use

Code Example

// TypeScript
import { QueueAI } from "@queueai/sdk"

const queue = new QueueAI({ apiKey: process.env.QUEUEAI_API_KEY })

// Submit a task with a primary model and a fallback to use if the primary is unavailable.
const task = await queue.submit({
  name: "deep-research",
  input: { question: "Summarize recent advances in vector databases." },
  model: {
    primary: "anthropic/claude-opus-4-8",
    fallback: "anthropic/claude-sonnet-4-6",
  },
  budget: { maxCostUsd: 1.00 }, // cost cap for the task, in USD
})

# Python
import os

from queueai import QueueAI

queue = QueueAI(api_key=os.environ["QUEUEAI_API_KEY"])

# Submit a task with a primary model and a fallback to use if the primary is unavailable.
task = queue.submit(
    name="deep-research",
    input={"question": "Summarize recent advances in vector databases."},
    model={
        "primary": "anthropic/claude-opus-4-8",
        "fallback": "anthropic/claude-sonnet-4-6",
    },
    budget={"max_cost_usd": 1.00},  # cost cap for the task, in USD
)

See it in action: The model.fallback field is available on every QueueAI task.