proffesor-for-testing/agentic-qe

Agentic QE Fleet is an open-source AI-powered quality engineering platform designed for use with Claude Code, featuring specialized agents and skills to support testing activities for a product at any stage of the SDLC. Free to use, fork, build, and contribute. Based on the Agentic QE Framework created by Dragan Spiridonov.

License: MIT | Language: TypeScript | 12125
agenticqe, agenticsfoundation, agents, quality-engineering

Deep Analysis

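AI-driven test automation platform: 48 specialized QE agents, ML-driven flaky-test detection, and intelligent multi-model routing that delivers 70-81% cost savings.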

Recommended

Core Features

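48 specialized QE agents

21 core + 11 TDD + 15 n8n workflow-testing agents

Self-learning system

4 RL algorithms targeting a 20% improvement

Intelligent multi-model routing

HybridRouter automatically matches task complexity to the best-suited model

ML-driven flaky detection

Statistical pattern detection with 90%+ accuracy

Real-time visualization

Dashboards such as MindMap and Quality Radar

100 MCP tools

Full MCP integration for test management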

Technical Implementation

Architecture: TypeScript/Node.js distributed multi-agent system, coordinated through native hooks (100-500x faster)
Execution Flow:
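Agent spawning

Install and initialize via npm, then add the MCP server

Task assignment

Dispatch test tasks through the CLI or REST API

Multi-agent coordination

Native hooks coordinate agents through the aqe/* namespace

Execution and learning

Execute tasks and learn patterns to optimize cost

Quality gate validation

Risk scoring and deployment-readiness assessment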

Key Components:
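Type-safe agent implementation in TypeScript 5.0+
better-sqlite3 for pattern storage and learning persistence
Tree-sitter semantic code parsing for 6 languages
OpenTelemetry distributed tracing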
Highlights
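  • 48 specialized agents with RED-GREEN-REFACTOR TDD coordination
  • Self-learning with a 20% improvement target per iteration
  • 70-81% cost savings through HybridRouter
  • 90%+ flaky-test detection accuracy
  • Throughput of 185 events/sec
  • Supports 10,000+ concurrent tests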
Use Cases
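  • AI-driven test generation and pattern matching
  • Cost-optimized task routing
  • Performance and load testing
  • Security scanning (SAST/DAST)
  • Accessibility (WCAG) testing
  • Chaos engineering validation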
Limitations
  • Requires Node.js 20+
  • PostgreSQL is needed for the optional self-learning feature
Tech Stack
TypeScript, Node.js 20+, Jest, Playwright, k6, Anthropic, OpenAI

Agentic Quality Engineering Fleet

Version 2.8.2 | Changelog | Contributors | Issues | Discussions

AI-powered test automation that learns from every task, switches between 300+ AI models on-the-fly, scores code testability, visualizes agent activity in real-time, and improves autonomously overnight — with built-in safety guardrails and full observability.

🎨 Real-Time Visualization | 📊 Testability Scoring | 🧠 QE Agent Learning | 🚀 QUIC Transport | 📋 Constitution System | 📚 46 QE Skills | 🎯 Flaky Detection | 💰 Multi-Model Router | 🔄 n8n Workflow Testing


⚡ Quick Start

Install & Initialize

# Install globally
npm install -g agentic-qe

# Initialize your project
cd your-project
aqe init

# Add MCP server to Claude Code (optional)
claude mcp add agentic-qe npx aqe-mcp

# Verify connection
claude mcp list

Use from Claude Code CLI

Ask Claude to use AQE agents directly from your terminal:

# Generate comprehensive tests
claude "Use qe-test-generator to create tests for src/services/user-service.ts with 95% coverage"

# Run quality pipeline
claude "Initialize AQE fleet: generate tests, execute them, analyze coverage, and run quality gate"

# Detect flaky tests
claude "Use qe-flaky-test-hunter to analyze the last 100 test runs and identify flaky tests"

What gets initialized:

  • Real-time Visualization: Dashboards, interactive graphs, WebSocket streaming
  • Observability Stack: OpenTelemetry, Event Store, Constitution System
  • HybridRouter: Intelligent LLM routing with 70-81% cost savings
  • Self-Learning System: Agents improve with every task (20% target)
  • Pattern Bank: Cross-project pattern reuse (85%+ matching)
  • ML Flaky Detection: 90%+ accuracy with root cause analysis
  • 21 QE Agents: Including Code Intelligence (80% token reduction)
  • 15 n8n Agents: Workflow testing by @fndlalit
  • 11 TDD Subagents: RED/GREEN/REFACTOR phases
  • 46 QE Skills: Including testability-scoring by @fndlalit
  • 8 Slash Commands: Quick access to common workflows

Optional Configuration (.env):

# Enable advanced features (see .env.example)
LLM_MODE=hybrid              # Cost-optimized routing
AQE_RUVECTOR_ENABLED=true    # Self-learning with PostgreSQL

🎯 Why AQE?

Problem | AQE Solution
Writing comprehensive tests is tedious and time-consuming | AI agents generate tests automatically with pattern reuse across projects
Test suites become slow and expensive at scale | Sublinear O(log n) algorithms for coverage analysis and intelligent test selection
Flaky tests waste developer time debugging false failures | ML-powered detection (90%+ accuracy) with root cause analysis and fix recommendations
AI testing tools are expensive | Multi-model routing cuts costs by 70-81% by matching task complexity to model
No memory between test runs: every analysis starts from scratch | Self-learning system remembers patterns, strategies, and what works for your codebase
Agents waste tokens reading irrelevant code | Code Intelligence provides 80% token reduction with semantic search and knowledge graphs
Tools don't understand your testing frameworks | Works with Jest, Cypress, Playwright, Vitest, Mocha, Jasmine, AVA

✨ Features

🧠 Self-Learning Agents That Get Smarter

Unlike traditional testing tools that start from scratch every run, AQE agents build institutional knowledge for your codebase:

What Gets Learned | Benefit
Test patterns that work for your framework | 85%+ pattern reuse across projects
Optimal strategies for your codebase structure | Faster, more relevant test generation
Failure patterns and how to prevent them | Proactive defect prevention
Cost-effective routing decisions | Automatic budget optimization

4 Reinforcement Learning Algorithms: Q-Learning (default), SARSA, Actor-Critic (A2C), PPO

# Check what your agents have learned
aqe learn status --agent qe-test-generator
aqe patterns list --framework jest
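
Q-Learning is listed as the default algorithm. As a rough illustration of what a tabular Q-learning update looks like, here is a minimal sketch. It is not taken from the AQE codebase: the state and action names, the reward, and the hyperparameters are all hypothetical.

```typescript
// Tabular Q-learning sketch. States, actions, and rewards are hypothetical
// and do not come from the AQE codebase.
type State = string;
type Action = string;

class QTable {
  private readonly q = new Map<string, number>();

  constructor(
    private readonly actions: Action[],
    private readonly alpha = 0.1, // learning rate
    private readonly gamma = 0.9, // discount factor
  ) {}

  get(state: State, action: Action): number {
    return this.q.get(`${state}|${action}`) ?? 0;
  }

  // Highest Q-value reachable from a state (0 for unseen states).
  maxValue(state: State): number {
    return Math.max(...this.actions.map((a) => this.get(state, a)));
  }

  // Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max_a' Q(s',a') - Q(s,a))
  update(state: State, action: Action, reward: number, next: State): void {
    const current = this.get(state, action);
    const target = reward + this.gamma * this.maxValue(next);
    this.q.set(`${state}|${action}`, current + this.alpha * (target - current));
  }

  // Greedy choice; ties resolve to the first action in the list.
  best(state: State): Action {
    return this.actions.reduce((best, a) =>
      this.get(state, a) > this.get(state, best) ? a : best,
    );
  }
}

// Hypothetical usage: reward an agent for reusing a stored test pattern.
const table = new QTable(["reuse-pattern", "generate-fresh"]);
table.update("jest-service-file", "reuse-pattern", 1, "done");
const chosen = table.best("jest-service-file");
```

After the single positive reward, the greedy choice for that state becomes `reuse-pattern`. SARSA differs only in using the value of the action actually taken in the next state instead of the maximum.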

💰 70-81% Cost Savings with Intelligent Routing

HybridRouter automatically matches task complexity to the right model:

Task Type | Model Selected | Cost
Simple (formatting, syntax) | Claude Haiku / GPT-3.5 | $0.25/M
Moderate (unit tests, refactoring) | Claude Sonnet / GPT-4 Turbo | $3/M
Complex (architecture, security) | Claude Opus 4.5 / GPT-5 | $15/M
Reasoning-heavy | DeepSeek R1 / o1-preview | Varies

Supports 25+ models current as of December 2025, including Claude Opus 4.5, DeepSeek R1 (671B), GPT-5, and Gemini 2.5 Pro
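
To make the routing idea concrete, the sketch below maps a task's complexity to one of the tiers from the table and estimates the savings against sending everything to the most expensive tier. It is an illustration only, not HybridRouter's actual logic: the function names and the workload mix are hypothetical, and "/M" is read as "per million tokens", which is an assumption.

```typescript
// Complexity-based routing sketch. Tier names and prices mirror the table
// above; the workload mix and function names are hypothetical.
type Complexity = "simple" | "moderate" | "complex";

interface Tier {
  models: string;
  costPerMillion: number; // USD, assuming "/M" means per million tokens
}

const TIERS: Record<Complexity, Tier> = {
  simple: { models: "Claude Haiku / GPT-3.5", costPerMillion: 0.25 },
  moderate: { models: "Claude Sonnet / GPT-4 Turbo", costPerMillion: 3 },
  complex: { models: "Claude Opus 4.5 / GPT-5", costPerMillion: 15 },
};

function route(complexity: Complexity): Tier {
  return TIERS[complexity];
}

// Average cost per million tokens for a workload mix whose shares sum to 1.
function blendedCost(mix: Record<Complexity, number>): number {
  return (Object.keys(mix) as Complexity[]).reduce(
    (sum, c) => sum + mix[c] * route(c).costPerMillion,
    0,
  );
}

// Savings relative to sending every task to the most expensive tier.
function savingsVsComplexOnly(mix: Record<Complexity, number>): number {
  return 1 - blendedCost(mix) / TIERS.complex.costPerMillion;
}

// Hypothetical mix: 50% simple, 30% moderate, 20% complex.
const exampleMix: Record<Complexity, number> = { simple: 0.5, moderate: 0.3, complex: 0.2 };
const savings = savingsVsComplexOnly(exampleMix);
```

With this made-up mix the blended cost is about $4.03 per million tokens against $15, a saving of roughly 73%. The actual figure depends entirely on how a project's tasks are distributed across tiers.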


🤖 48 Specialized QE Agents

Category | Agents | Highlights
Core QE | 21 agents | Test generation, coverage, security, performance, accessibility
TDD Workflow | 11 subagents | RED/GREEN/REFACTOR phases with coordination
n8n Workflow Testing | 15 agents | Chaos, compliance, security, BDD scenarios
Base | 1 template | Create custom agents

Zero external dependencies - Native hooks system runs 100-500x faster than external coordination


🎯 ML-Powered Flaky Test Detection

90%+ accuracy with automated root cause analysis:

  • Statistical pattern detection across test runs
  • Timing analysis and race condition identification
  • Auto-fix recommendations with code suggestions
  • Integration with CI/CD for continuous monitoring
claude "Use qe-flaky-test-hunter to analyze the last 100 test runs"
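
As a simplified stand-in for "statistical pattern detection across test runs", the sketch below scores a test by how often its outcome flips between consecutive runs on unchanged code. This is a plain heuristic, not the ML detector AQE describes; the function name and the threshold are hypothetical.

```typescript
// Flakiness heuristic sketch: a test that alternates between pass and fail
// on unchanged code is suspicious. The threshold is hypothetical.
interface FlakyReport {
  failureRate: number; // share of runs that failed
  flipRate: number; // share of consecutive run pairs whose outcome changed
  flaky: boolean;
}

function analyzeRuns(passed: boolean[], flipThreshold = 0.2): FlakyReport {
  // Fewer than two runs cannot show a flip, so never flag them.
  if (passed.length < 2) {
    return { failureRate: passed.some((p) => !p) ? 1 : 0, flipRate: 0, flaky: false };
  }
  const failures = passed.filter((p) => !p).length;
  let flips = 0;
  for (let i = 1; i < passed.length; i++) {
    if (passed[i] !== passed[i - 1]) flips++;
  }
  const flipRate = flips / (passed.length - 1);
  return {
    failureRate: failures / passed.length,
    flipRate,
    flaky: flipRate >= flipThreshold,
  };
}

// A consistently failing test is broken, not flaky; an alternating one is flaky.
const broken = analyzeRuns(new Array<boolean>(10).fill(false));
const intermittent = analyzeRuns([true, false, true, true, false, true, true, true, false, true]);
```

Separating the failure rate from the flip rate keeps genuinely broken tests (always failing) out of the flaky bucket, which is the distinction a root cause analysis would start from.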

🔍 Code Intelligence (80% Token Reduction)

Stop wasting tokens on irrelevant code. Semantic search + knowledge graphs deliver only what matters:

  • Tree-sitter parsing for TypeScript, Python, Go, Rust, JavaScript
  • Hybrid search: BM25 + vector similarity with <10ms latency
  • RAG context building for LLM queries
  • Mermaid visualization of code relationships
aqe kg index src/          # Index your codebase
aqe kg search "auth flow"  # Semantic search
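
One common way to combine BM25 with vector similarity is a weighted sum of the normalized lexical score and the cosine similarity. The sketch below shows that fusion step only. It assumes BM25 scores are already computed, and the 0.5 weight, the file names, and the embeddings are hypothetical; AQE's own scoring may differ.

```typescript
// Hybrid ranking sketch: fuse a lexical (BM25) score with vector cosine
// similarity. The 0.5 weighting and the sample data are hypothetical.
interface Candidate {
  id: string;
  bm25: number; // precomputed lexical score, non-negative and unbounded
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function hybridRank(query: number[], docs: Candidate[], lexicalWeight = 0.5): string[] {
  // Scale BM25 into [0, 1] so it is comparable with cosine similarity.
  const maxBm25 = Math.max(...docs.map((d) => d.bm25), 1e-9);
  return docs
    .map((d) => ({
      id: d.id,
      score:
        lexicalWeight * (d.bm25 / maxBm25) +
        (1 - lexicalWeight) * cosine(query, d.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .map((d) => d.id);
}

const ranked = hybridRank(
  [1, 0],
  [
    { id: "auth/login.ts", bm25: 4, embedding: [0.9, 0.1] },
    { id: "auth/token.ts", bm25: 8, embedding: [0.2, 0.8] },
    { id: "utils/date.ts", bm25: 0, embedding: [0, 1] },
  ],
);
```

Here `auth/login.ts` ranks first even though `auth/token.ts` has the higher BM25 score, because its embedding is much closer to the query vector.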

📊 Real-Time Visualization

See your agents work with live dashboards:

  • MindMap: 1000+ nodes, <500ms render, WebSocket updates
  • Quality Radar: 7-dimension chart (coverage, security, performance)
  • Timeline: Virtual scrolling for 1000+ events
  • Grafana: Executive, Developer, and QA dashboards

Performance: 185 events/sec throughput, <1ms query latency


🎓 46 World-Class QE Skills

95%+ coverage of modern QE practices - agents automatically apply relevant skills:

Core Testing & Methodologies

  • agentic-quality-engineering, holistic-testing-pact, context-driven-testing
  • tdd-london-chicago, xp-practices, risk-based-testing, test-automation-strategy

Specialized Testing

  • accessibility-testing, mobile-testing, database-testing, contract-testing
  • chaos-engineering-resilience, visual-testing-advanced, compliance-testing

Strategic & Communication

  • six-thinking-hats, brutal-honesty-review, sherlock-review
  • cicd-pipeline-qe-orchestrator, bug-reporting-excellence

n8n Workflow Testing (contributed by @fndlalit)

  • n8n-workflow-testing, n8n-expression-testing, n8n-security-testing

Unique Skills

  • testability-scoring - Score code testability before writing tests
  • qx-partner - QA + UX collaboration for quality experience

📦 Multi-Framework Support

Works with your existing tools:

Category | Supported
Unit Testing | Jest, Mocha, Vitest, Jasmine, AVA
E2E Testing | Cypress, Playwright
Performance | k6, JMeter, Gatling
Code Quality | ESLint, SonarQube, Lighthouse

Parallel execution: 10,000+ concurrent tests with intelligent orchestration


💻 Usage Examples

Example 1: Single Agent Execution

Ask Claude to use a specific agent:

claude "Use the qe-test-generator agent to create comprehensive tests for src/services/user-service.ts with 95% coverage"

What happens:

  1. Claude Code spawns qe-test-generator via Task tool
  2. Agent analyzes the source file
  3. Generates tests with pattern matching (Phase 2 feature)
  4. Stores results in memory at aqe/test-plan/generated

Output:

Generated 42 tests
Pattern hit rate: 67%
Time saved: 2.3s
Quality score: 96%
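
Step 4 above stores results under a path-like key. The sketch below illustrates how a namespaced key-value memory of that shape could let one agent's output be picked up by another. Only the key `aqe/test-plan/generated` comes from the text; the class, its methods, and the other key are hypothetical and not AQE's API.

```typescript
// Namespaced shared-memory sketch. Only the key aqe/test-plan/generated
// appears in the text above; the class and other keys are hypothetical.
class NamespacedMemory {
  private readonly entries = new Map<string, unknown>();

  store(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  // The caller asserts the expected shape; nothing is validated at runtime.
  retrieve<T>(key: string): T | undefined {
    return this.entries.get(key) as T | undefined;
  }

  // List keys under a namespace prefix such as "aqe/test-plan/".
  keys(prefix: string): string[] {
    return Array.from(this.entries.keys())
      .filter((k) => k.startsWith(prefix))
      .sort();
  }
}

// A generator agent writes its result; a later agent reads it back.
const memory = new NamespacedMemory();
memory.store("aqe/test-plan/generated", { tests: 42 });
memory.store("aqe/coverage/gaps", ["src/services/user-service.ts"]);
const plan = memory.retrieve<{ tests: number }>("aqe/test-plan/generated");
const planKeys = memory.keys("aqe/test-plan/");
```

Prefix-based keys keep each agent's output in its own sub-namespace while still letting a coordinator enumerate everything under `aqe/`.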

Example 2: Multi-Agent Parallel Execution

Coordinate multiple agents at once:

claude "Initialize the AQE fleet:
1. Use qe-test-generator to create tests for src/services/*.ts
2. Use qe-test-executor to run all tests in parallel
3. Use qe-coverage-analyzer to find gaps with sublinear algorithms
4. Use qe-quality-gate to validate against 95% threshold"

What happens:

  1. Claude spawns 4 agents concurrently in a single message
  2. Agents coordinate through aqe/* memory namespace
  3. Pipeline: tes