Argus multi-model code review

Argus

Multi-model code review, in parallel.

Argus dispatches a git diff, PR, or file set to a roster of frontier LLM reviewers at the same time, filters their findings by confidence with cross-reviewer corroboration, and merges everything into a single, de-duplicated review. One model misses bugs; two produce noise — a filtered panel of six beats any single reviewer.

Named for Argus Panoptes, the hundred-eyed giant of Greek myth. The original Linus’s Law: given enough eyeballs, all bugs are shallow. Argus is a Claude Code skill — invoke it as /argus — backed by a small set of standalone Python scripts.

15
Reviewers in the registry
8
Curated profiles
≥2
Reviewers to corroborate
80
Confidence threshold

Why a panel?

1

One model misses bugs

Even frontier models have blind spots. A single reviewer’s recall is capped by its own biases.

2

Two produce noise

Add a second voice and you get more coverage — but also more false positives competing for attention.

6

Six, filtered, win

A panel filtered by a confidence threshold and a corroboration boost surfaces real issues and suppresses the noise.

What you get

Frontier roster

GLM‑5.2, MiniMax M3, DeepSeek V4 Pro, Kimi, Qwen, Grok, plus Codex / Claude / OpenCode CLIs.

Routing preference

One knob picks OpenRouter-first (public default) or your own direct-API subscriptions first.

Corroboration merge

Confidence threshold + a +15 boost when reviewers agree, with ±3-line clustering.

Benchmark mode

Score reviewers against labeled fixtures (precision / recall / F1) to build a leaderboard.

Profiles

quick, standard, panel, security, deep, favorites, direct, leaderboard‑top5.

Privacy-aware

API keys live only in your environment — never written to disk by Argus.

Quick start

Inside Claude Code, the simplest invocation reviews your current diff with the default standard profile:

/argus
/argus --profile security
/argus --custom "glm-5.2,deepseek-v4-pro,claude"
/argus --pr https://github.com/owner/repo/pull/123
/argus --benchmark --runs 3
/argus --dry-run

Prefer the shell? The end-to-end pipeline is three scripts — estimate, dispatch, merge:

RUN_DIR="$ARGUS_HOME/runs/$(date +%Y%m%dT%H%M%S)-manual"; mkdir -p "$RUN_DIR"
git diff HEAD > "$RUN_DIR/diff.patch"
python scripts/estimate_cost.py --roster "glm-5.2,minimax-m3,gemini-or,codex" --diff "$RUN_DIR/diff.patch"
python scripts/dispatch.py --run-dir "$RUN_DIR" --roster "glm-5.2,minimax-m3,gemini-or,codex" --diff "$RUN_DIR/diff.patch"
python scripts/merge.py --run-dir "$RUN_DIR"
💡
New here?

Head to Getting Started for a beginner track (one API key, five commands) and an advanced track (direct-API routing, profiles, parallel benchmarks).