Adversarial Verification
Codebase ReviewThe Problem
LLM agents are prone to confident hallucination — they produce findings that sound credible but are wrong, incomplete, or miscontextualized. A single reviewer can't catch its own blind spots. If you trust first-pass findings directly, you ship false positives.
The Pattern
Finding
│
├──────────────────┐
▼ ▼
[Skeptic 1] [Skeptic 2] (each tries to REFUTE the finding)
│ │
└────────┬─────────┘
▼
┌──────────┐
│ Verdict │ (majority vote: ≥2 of 3 must NOT refute)
└──────────┘
│
┌──────┴──────┐
▼ ▼
Confirmed Rejected
When to Use
- Code review, security analysis, or any task where false positives are costly
- After a fan-out phase where you need to filter noise from signal
- When a finding being wrong has meaningful consequences
When NOT to Use
- Tasks where speed matters more than accuracy
- Obvious findings where adversarial review adds cost without value
- When the finding is inherently subjective (skeptics won't converge)
Code Example
import { QueueAI } from "@queueai/sdk"
const queue = new QueueAI({ apiKey: process.env.QUEUEAI_API_KEY })
const task = await queue.submit({
name: "codebase-review",
input: {
repoUrl: "https://github.com/your-org/your-repo",
focus: "security",
verification: "adversarial",
verifierCount: 3,
},
model: { primary: "anthropic/claude-opus-4-8" },
budget: { maxCostUsd: 5.00 },
})
from queueai import QueueAI
queue = QueueAI(api_key=os.environ["QUEUEAI_API_KEY"])
task = queue.submit(
name="codebase-review",
input={
"repo_url": "https://github.com/your-org/your-repo",
"focus": "security",
"verification": "adversarial",
"verifier_count": 3,
},
model={"primary": "anthropic/claude-opus-4-8"},
budget={"max_cost_usd": 5.00},
)
See it in action: This pattern powers the Codebase Review template in QueueAI.