zscole/adversarial-spec

A Claude Code plugin that iteratively refines product specifications through debate among multiple LLMs until all models reach consensus.

License: MIT | Language: Python
Tags: anthropic, claude-ai, claude-code, claude-code-plugin, claude-skills, llm-orchestration

Deep Analysis

Iteratively refines product specification documents through a multi-model debate mechanism until all models reach consensus.


Core Features

Multi-Model Debate

Multiple LLMs review the spec in parallel and cross-examine one another's findings, with Claude participating actively

Interview Mode

In-depth requirements gathering up front, systematically capturing features, constraints, and risks

Focus Modes

Dedicated review across six dimensions, including security, scalability, and performance

Session Persistence

State is saved automatically to support resuming from checkpoints, with complete records of every round preserved

Technical Implementation

Architecture: a multi-model coordination engine written in Python, using litellm as a unified dispatch layer across different APIs, with Claude acting as synthesizer and independent reviewer
Execution Flow:
1. Initialization: choose the document type (PRD or tech spec) and configure the list of opponent models
2. Draft Generation: Claude produces an initial spec from the requirements or an existing document
3. Parallel Critique: multiple LLMs review the draft simultaneously and raise criticisms
4. Synthesis: Claude consolidates the feedback, contributes independent insights, and produces a revised version
5. Convergence Loop: steps 3-4 repeat until all models and Claude agree

Key Components:
  • litellm: unified LLM API call layer supporting 20+ providers
  • Session management: state persistence and checkpoint recovery
  • AWS Bedrock: model routing for enterprise compliance requirements
Highlights
  • Claude actively participates in the debate rather than passively orchestrating, providing independent critique and synthesis of viewpoints
  • Supports 9 LLM providers plus OpenRouter and OpenAI-compatible endpoints
  • Complete records of every round, automatic checkpoints, cost tracking, and Telegram notification integration
  • Tailored review perspectives for professional roles (security engineer, SRE, PM, etc.)
Use Cases
  • Startups iterating quickly on PRDs, using multi-perspective review to ensure requirements are complete
  • Engineering teams designing microservice architecture specs, validated across security, performance, and reliability dimensions
  • Product managers writing feature specs, with UX and accessibility role reviews to improve the experience
  • Enterprise compliance teams drafting system specs, with regulatory role reviews to ensure compliance coverage
Limitations
  • Relies on external API calls; cost grows linearly with model complexity and the number of rounds
  • Time to convergence across multiple models is unpredictable
Tech Stack
Python 3.10+, litellm, OpenAI API, Gemini API, Claude API, xAI Grok API, OpenRouter, AWS Bedrock, Telegram Bot API

adversarial-spec

A Claude Code plugin that iteratively refines product specifications through multi-model debate until consensus is reached.

Key insight: A single LLM reviewing a spec will miss things. Multiple LLMs debating a spec will catch gaps, challenge assumptions, and surface edge cases that any one model would overlook. The result is a document that has survived rigorous adversarial review.

Claude is an active participant, not just an orchestrator. Claude provides independent critiques, challenges opponent models, and contributes substantive improvements alongside external models.

Quick Start

# 1. Add the marketplace and install the plugin
claude plugin marketplace add zscole/adversarial-spec
claude plugin install adversarial-spec

# 2. Set at least one API key
export OPENAI_API_KEY="sk-..."
# Or use OpenRouter for access to multiple providers with one key
export OPENROUTER_API_KEY="sk-or-..."

# 3. Run it
/adversarial-spec "Build a rate limiter service with Redis backend"

How It Works

You describe product --> Claude drafts spec --> Multiple LLMs critique in parallel
        |                                              |
        |                                              v
        |                              Claude synthesizes + adds own critique
        |                                              |
        |                                              v
        |                              Revise and repeat until ALL agree
        |                                              |
        +--------------------------------------------->|
                                                       v
                                            User review period
                                                       |
                                                       v
                                            Final document output
  1. Describe your product concept or provide an existing document
  2. (Optional) Start with an in-depth interview to capture requirements
  3. Claude drafts the initial document (PRD or tech spec)
  4. Document is sent to opponent models (GPT, Gemini, Grok, etc.) for parallel critique
  5. Claude provides independent critique alongside opponent feedback
  6. Claude synthesizes all feedback and revises
  7. Loop continues until ALL models AND Claude agree
  8. User review period: request changes or run additional cycles
  9. Final converged document is output
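
Each critique round (step 4) corresponds to the critique command of the bundled debate.py script. A minimal sketch of running a single round by hand, assuming the plugin is installed at its default path and the current draft lives in spec.md:

# Send the draft to two opponent models for parallel critique
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py critique --models gpt-4o,gemini/gemini-2.0-flash < spec.md

The plugin normally drives this loop itself; invoking the script directly is mainly useful for testing a single round.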

Requirements

  • Python 3.10+
  • litellm package: pip install litellm
  • API key for at least one LLM provider

Supported Models

Provider      Env Var                 Example Models
OpenAI        OPENAI_API_KEY          gpt-4o, gpt-4-turbo, o1
Anthropic     ANTHROPIC_API_KEY       claude-sonnet-4-20250514, claude-opus-4-20250514
Google        GEMINI_API_KEY          gemini/gemini-2.0-flash, gemini/gemini-pro
xAI           XAI_API_KEY             xai/grok-3, xai/grok-beta
Mistral       MISTRAL_API_KEY         mistral/mistral-large, mistral/codestral
Groq          GROQ_API_KEY            groq/llama-3.3-70b-versatile
OpenRouter    OPENROUTER_API_KEY      openrouter/openai/gpt-4o, openrouter/anthropic/claude-3.5-sonnet
Codex CLI     ChatGPT subscription    codex/gpt-5.2-codex, codex/gpt-5.1-codex-max
Gemini CLI    Google account          gemini-cli/gemini-3-pro-preview, gemini-cli/gemini-3-flash-preview
Deepseek      DEEPSEEK_API_KEY        deepseek/deepseek-chat
Zhipu         ZHIPUAI_API_KEY         zhipu/glm-4, zhipu/glm-4-plus

Check which keys are configured:

python3 ~/.claude/skills/adversarial-spec/scripts/debate.py providers
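
Opponent models from different providers can be mixed in a single debate. A sketch of one cross-provider critique round (the model list and spec path are illustrative):

# Critique a draft with opponents from three different providers
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py critique --models gpt-4o,gemini/gemini-2.0-flash,xai/grok-3 < spec.md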

AWS Bedrock Support

For enterprise users who need to route all model calls through AWS Bedrock (e.g., for security compliance or inference gateway requirements):

# Enable Bedrock mode
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py bedrock enable --region us-east-1

# Add models enabled in your Bedrock account
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py bedrock add-model claude-3-sonnet
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py bedrock add-model claude-3-haiku

# Check configuration
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py bedrock status

# Disable Bedrock mode
python3 ~/.claude/skills/adversarial-spec/scripts/debate.py bedrock disable

When Bedrock is enabled, all model calls route through Bedrock; no direct API calls are made. Use friendly names like claude-3-sonnet, which are automatically mapped to Bedrock model IDs.

Configuration is stored at ~/.claude/adversarial-spec/config.json.
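
To inspect the stored configuration directly:

cat ~/.claude/adversarial-spec/config.json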

OpenRouter Support

OpenRouter provides unified access to multiple LLM providers through a single API. This is useful for:

  • Accessing models from multiple providers with one API key
  • Comparing models across different providers
  • Automatic fallback and load balancing
  • Cost optimization across providers

Setup:

# Get your API key from https://openrouter.ai/keys
export OPENROUTER_API_KEY="sk-or-..."

# Use OpenRouter models (prefix with openrouter/)
python3 debate.py critique --models openrouter/openai/gpt-4o,openrouter/anthropic/claude-3.5-sonnet < spec.md

Popular OpenRouter models:

  • openrouter/openai/gpt-4o - GPT-4o via OpenRouter
  • openrouter/anthropic/claude-3.5-sonnet - Claude 3.5 Sonnet
  • openrouter/google/gemini-2.0-flash - Gemini 2.0 Flash
  • openrouter/meta-llama/llama-3.3-70b-instruct - Llama 3.3 70B
  • openrouter/qwen/qwen-2.5-72b-instruct - Qwen 2.5 72B

See the full model list at openrouter.ai/models.

Codex CLI Support

Codex CLI allows ChatGPT Pro subscribers to use OpenAI models without separate API credits. Models prefixed with codex/ are routed through the Codex CLI.

Setup:

# Install Codex CLI (requires ChatGPT Pro subscription)
npm install -g @openai/codex

# Use Codex models (prefix with codex/)
python3 debate.py critique --models codex/gpt-5.2-codex,gemini/gemini-2.0-flash < spec.md

Reasoning effort:

Control how much thinking time the model uses with --codex-reasoning:

# Available levels: low, medium, high, xhigh (default: xhigh)
python3 debate.py critique --models codex/gpt-5.2-codex --codex-reasoning high < spec.md

Higher reasoning effort produces more thorough analysis but uses more tokens.

Available Codex models:

  • codex/gpt-5.2-codex - GPT-5.2 via Codex CLI
  • codex/gpt-5.1-codex-max - GPT-5.1 Max via Codex CLI

Check Codex CLI installation status:

python3 ~/.claude/skills/adversarial-spec/scripts/debate.py providers

Gemini CLI Support

Gemini CLI allows Google account holders to use Gemini models without separate API credits. Models prefixed with gemini-cli/ are routed through the Gemini CLI.

Setup:

# Install Gemini CLI
npm install -g @google/gemini-cli && gemini auth

# Use Gemini CLI models (prefix with gemini-cli/)
python3 debate.py critique --models gemini-cli/gemini-3-pro-preview < spec.md

Available Gemini CLI models:

  • gemini-cli/gemini-3-pro-preview - Gemini 3 Pro via CLI
  • gemini-cli/gemini-3-flash-preview - Gemini 3 Flash via CLI

Check Gemini CLI installation status:

python3 ~/.claude/skills/adversarial-spec/scripts/debate.py providers

OpenAI-Compatible Endpoints

For models that expose an OpenAI-compatible API (local LLMs, self-hosted models, alternative providers), set OPENAI_API_BASE:

# Point to a custom endpoint
export OPENAI_API_KEY="your-key"
export OPENAI_API_BASE="https://your-endpoint.com/v1"

# Use with any model name
python3 debate.py critique --models gpt-4o < spec.md

This works with:

  • Local LLM servers (Ollama, vLLM, text-generation-webui)
  • OpenAI-compatible providers
  • Self-hosted inference endpoints
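
For instance, Ollama exposes an OpenAI-compatible API under /v1. A minimal sketch, assuming a local Ollama server with the model already pulled (Ollama ignores the key's value, but the variable must still be set):

# Route critiques through a local Ollama instance
export OPENAI_API_KEY="ollama"
export OPENAI_API_BASE="http://localhost:11434/v1"
python3 debate.py critique --models llama3.3 < spec.md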

Usage

Start from scratch:

/adversarial-spec "Build a rate limiter service with Redis backend"

Refine an existing document:

/adversarial-spec ./docs/my-spec.md

You will be prompted for:

  1. Document type: PRD (business/product focus) or tech spec (engineering focus)
  2. Interview mode: Optional in-depth requirements gathering session
  3. Opponent models: Comma-separated list (e.g., gpt-4o,gemini/gemini-2.0-flash,xai/grok-3)

More models = more perspectives = stricter convergence.

Document Types

PRD (Product Requirements Document)

For stakeholders, PMs, and designers.

Sections: Executive Summary, Problem Statement, Target Users/Personas, User Stories, Functional Requirements, Non-Functional Requirements, Success Metrics, Scope (In/Out), Dependencies, Risks

Critique focuses on: Clear problem definition, well-defined personas, measurable success criteria, explicit scope boundaries, no technical implementation details

Technical Specification

For developers and architects.

Sections: Overview, Goals/Non-Goals, System Architecture, Component Design, API Design (full schemas), Data Models, Infrastructure, Security, Error Handling, Performance/SLAs, Observability, Testing Strategy, Deployment Strategy

Critique focuses on: Complete API contracts, data model coverage, security threat mitigation, error handling, specific performance targets, no ambiguity for engineers

Core Features

Interview Mode

Before the debate begins, opt into an in-depth interview session to capture requirements upfront.

Covers: Problem context, users/stakeholders, functional requirements, technical constraints, UI/UX, tradeoffs, risks, success criteria

The interview uses probing follow-up questions to capture requirements in depth before drafting begins.
