applied-artificial-intelligence/claude-code-toolkit

Production-tested commands, skills, and workflow patterns for Claude Code. Developed through 6+ months of daily use. Includes explore→plan→next→ship workflow, session handoffs, MCP integrations, and domain skills. Copy what works, adapt to your needs.

License: MIT | Language: Python | 267
agent, anthropic, claude-code, coding-agents, framework, plugin, toolkit

Deep Analysis

A production-grade toolkit for Claude Code, with 28 commands, 5 agents, and 6 domain skills, validated through 6 months of daily use

Highlights
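  • Production-validated: 6+ months of daily use
  • Fully aligned with Anthropic's official best practices
  • Runs without MCP; enhanced when MCP is available
  • Progressive disclosure pattern, saving 70%+ tokens
  • Token-limit enforcement tool (mdtoken) prevents file bloat
  • 5 specialized agents (architect, test engineer, code reviewer, etc.)
  • Automatic Git safety guards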
Use Cases
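  • Systematic task execution: a complete workflow from requirements analysis to code delivery
  • Session management: seamless continuation of work across context window boundaries
  • Code quality assurance: in-depth code analysis, review, and test creation
  • Persistent project memory: project knowledge accumulated across sessions
  • MCP integration workflows
  • Domain-specific development: quantitative finance, professional writing, etc.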
Limitations
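  • Requires Claude Code 2.0+
  • MCP servers are optional; some advanced features are limited without them
  • Requires external dependencies such as jq and Node.js
  • Cannot overcome Claude's fundamental behavioral limitations
  • Context window size remains the core constraint
  • Complex workflows have a relatively steep learning curve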
Tech Stack
Python, JavaScript/Node.js, Markdown, JSON, Git, Bash, jq, MCP, Claude Code 2.0+

Claude Code Toolkit

Production-tested commands, skills, and patterns for Claude Code


Task Execution Workflow: Explore → Plan → Next → Ship


Overview

A collection of plugins, skills, and patterns developed through 6+ months of daily Claude Code use. Copy what works, adapt to your needs.

What's included: 28 commands, 5 agents, and 6 domain skills across 6 core plugins.

Design Goals

  1. Stateless architecture - Commands execute independently, state persisted in files
  2. File-based persistence - JSON and Markdown for all state management
  3. MCP integration - Optional Model Context Protocol tools with graceful degradation
  4. Progressive disclosure - Load context incrementally to optimize token usage
  5. Self-containment - All logic inline, no external script dependencies
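The first two goals can be illustrated with a minimal Python sketch. It is not the toolkit's own code (the toolkit keeps its logic inline in its commands); the `state.json` layout, the field names, and the helper names `load_state`, `save_state`, and `add_task` are assumptions made for illustration. Each "command" loads state from disk, changes it, and writes it back, so nothing has to survive in memory between invocations.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Read state from disk; a missing file means a fresh work unit."""
    if not path.exists():
        return {"tasks": []}
    return json.loads(path.read_text(encoding="utf-8"))


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file first, then rename it over the target."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


def add_task(path: Path, title: str) -> dict:
    """A stateless 'command': load, change, save. Nothing is cached between calls."""
    state = load_state(path)
    state["tasks"].append(
        {"id": len(state["tasks"]) + 1, "title": title, "status": "pending"}
    )
    save_state(path, state)
    return state
```

Because every call starts from the file, two separate sessions (or two context windows) see the same state as long as they point at the same path.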

Key Capabilities

  • Workflow management: explore → plan → next → ship pattern
  • Memory persistence: Cross-session context with automatic reflection
  • Quality automation: Git safety, pre/post hooks, compliance auditing
  • MCP integration: Optional tools (Serena, Sequential Thinking) with graceful degradation
  • Domain adaptation: Examples showing how to extend Claude Code beyond software (quant, writing)

Reality Check: The Limits of Customization

Before diving into the toolkit, understand what you're working with.


The Instruction Training Boundary

Claude's behavior comes from two layers:

| Layer | What It Contains | Can You Change It? |
| --- | --- | --- |
| Instruction training | Core personality, instincts, behavioral drivers | No (immutable) |
| Customization | System prompts, CLAUDE.md, agent definitions | Yes, but limited |

No matter how clever your prompting, you're working with Claude's pre-trained personality. You can nudge it, structure it, guide it—but the underlying instincts remain.


Behaviors You'll Encounter

Sycophancy

The tendency to agree, validate, and praise. We developed elaborate anti-sycophancy protocols—they failed. Give contradictory prompts and you'll still get "You're absolutely right!"


Completion Bias

The urge to deliver results regardless of completeness. Claude will proceed without full specifications, filling gaps with assumptions rather than asking. It's trained to deliver, not to pause and question.


Action Over Reflection

Bias toward doing, not critically examining. Tunnel vision on declared goals. Big picture thinking and honest pushback require deliberate prompting.


Context Limitations

The precious context window determines what Claude can "see." Important details get forgotten, overlooked, or deprioritized. Specify too much and things get lost; too little and Claude fills gaps incorrectly.


What This Means for You

Thorough inspection is non-negotiable. Claude can make mistakes in every respect. Never assume output matches expectations—everything requires review, especially things that look correct.


Proper testing is essential. Validate behavior, not just structure. Test edge cases Claude may not have considered. Don't trust "it should work"—verify it does.


The toolkit helps, but doesn't eliminate these behaviors. We provide structure, workflows, and guardrails. Claude's base personality operates within that structure—sometimes it will frustrate you because it just wants to deliver.


Why We're Honest About This

We removed dubious metrics from this toolkit ("80% token reduction," "zero hallucinations") because they weren't validated—and overclaiming doesn't help anyone.

Claude Code is genuinely powerful:

  • Amazing at navigating complex codebases
  • Excellent at using tools to solve problems
  • Incredibly convenient in the terminal environment

But it has real limitations you'll encounter repeatedly. This toolkit provides patterns for working with those constraints rather than pretending they don't exist.



Anthropic Best Practices Alignment

This toolkit implements patterns and recommendations from Anthropic's official Claude Code documentation. It represents "what Anthropic says you should be doing, here implemented."

Multi-Context Window Workflows

Anthropic Recommendation (Claude 4 Best Practices):

"For complex, multi-step projects, we recommend starting with a first context window devoted to designing a structured representation of the project—tests, code stubs, documentation—that can be passed into subsequent context windows... Claude is adept at working across context window resets as long as clear instructions and todo lists are provided."

Toolkit Implementation:

  • /workflow:explore → /workflow:plan → /workflow:next → /workflow:ship workflow
  • state.json tracks task status across sessions (equivalent to Anthropic's tests.json)
  • /transition:handoff + /transition:continue for explicit context window transitions
  • TodoWrite integration for progress tracking across resets

State Tracking with JSON + Unstructured Notes + Git

Anthropic Recommendation:

"JSON provides structured data about what still needs to be done... Unstructured progress notes can be easier for Claude to incrementally update... Git checkpoints let Claude 'save progress' so it has a safe state to roll back to."

Toolkit Implementation:

  • JSON: state.json (task tracking), metadata.json (work unit metadata)
  • Unstructured: exploration.md (analysis notes), implementation-plan.md (task breakdown)
  • Git checkpoints: Automatic commits after each /workflow:next task completion
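As a rough sketch of how structured state and Git checkpoints can fit together, the Python below picks the next runnable task from a state dictionary and builds the commit command for a checkpoint. The schema (`id`, `title`, `status`, `depends_on`) and the function names are hypothetical; the real `state.json` format may differ, and the sketch only returns the `git` command rather than running it.

```python
from typing import List, Optional


def next_task(state: dict) -> Optional[dict]:
    """Return the first pending task whose dependencies are all completed."""
    done = {t["id"] for t in state["tasks"] if t["status"] == "completed"}
    for task in state["tasks"]:
        if task["status"] == "pending" and set(task.get("depends_on", [])) <= done:
            return task
    return None


def complete_task(state: dict, task_id: str) -> List[str]:
    """Mark a task completed and return the git command for a checkpoint commit."""
    for task in state["tasks"]:
        if task["id"] == task_id:
            task["status"] = "completed"
            message = f"Complete {task_id}: {task['title']}"
            return ["git", "commit", "--all", "--message", message]
    raise KeyError(task_id)
```

Keeping the selection logic a pure function of the JSON means a fresh context window can resume from the file alone, without any memory of the previous session.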

Progressive Disclosure Pattern

Anthropic Recommendation (Agent Skills):

"Metadata Layer: Skill name and description are pre-loaded... Core Documentation: The full SKILL.md content loads when Claude determines relevance... Reference Materials: Additional linked files load only as needed."

Toolkit Implementation:

  • Skills (skills/) use Anthropic's exact 3-layer disclosure pattern
  • Plugins load command metadata first, full instructions only when invoked
  • Memory (/memory:*) stores context for retrieval, not constant presence
  • Result: 70%+ token savings vs. loading everything upfront
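The three layers can be sketched as three separate read functions, each touching more of the skill directory than the last. This is an illustrative Python sketch, not how Claude Code loads skills internally: it assumes a `SKILL.md` whose front matter is simple `key: value` lines between `---` markers, and the function names are invented for this example.

```python
from pathlib import Path


def read_metadata(skill_dir: Path) -> dict:
    """Layer 1: parse only the 'key: value' front matter at the top of SKILL.md."""
    meta = {}
    with (skill_dir / "SKILL.md").open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            return meta
        for line in fh:
            if line.strip() == "---":
                break
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta


def read_body(skill_dir: Path) -> str:
    """Layer 2: the full instructions, read only once the skill looks relevant."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    parts = text.split("---", 2)
    if text.startswith("---") and len(parts) == 3:
        return parts[2].strip()
    return text


def read_reference(skill_dir: Path, name: str) -> str:
    """Layer 3: a linked reference file, read only when explicitly needed."""
    return (skill_dir / name).read_text(encoding="utf-8")
```

Only `read_metadata` needs to run for every skill up front; the other two calls are deferred until a request actually needs them, which is where any token savings would come from.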

Model-Invoked vs User-Invoked Actions

Anthropic Recommendation:

"Skills are model-invoked—Claude autonomously decides when to use them based on your request and the Skill's description. This is different from slash commands, which are user-invoked."

Toolkit Implementation:

  • Skills (skills/): Model-invoked, Claude loads when domain-relevant (RAG, Docker, SQL, etc.)
  • Commands (commands/): User-invoked via /command syntax
  • Agents (agents/): Claude-selected via Task tool based on task complexity
  • Clear separation between automatic (skills) and explicit (commands) invocation

Memory Tool Patterns

Anthropic Recommendation (Memory Tool Beta):

"ALWAYS VIEW YOUR MEMORY DIRECTORY BEFORE DOING ANYTHING ELSE... Record status/progress/thoughts in your memory. ASSUME INTERRUPTION: Your context window might be reset at any moment."

Toolkit Implementation:

  • .claude/memory/ directory for persistent project knowledge
  • /memory:index creates persistent project understanding
  • /memory:memory-review displays current memory state
  • /transition:handoff saves critical context before session boundaries
  • Memory philosophy: Write down key decisions, discard outdated information
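A minimal sketch of the "view memory first, record decisions as you go" habit, written in Python for illustration. The `decisions.md` file name and both function names are assumptions; the toolkit's `/memory:*` commands may organize `.claude/memory/` differently.

```python
from datetime import date
from pathlib import Path


def view_memory(memory_dir: Path) -> dict:
    """Map each Markdown memory file to its first line, for a cheap overview."""
    overview = {}
    if not memory_dir.is_dir():
        return overview
    for path in sorted(memory_dir.glob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        overview[path.name] = lines[0] if lines else ""
    return overview


def record_decision(memory_dir: Path, text: str) -> None:
    """Append a dated decision so it survives a context reset."""
    memory_dir.mkdir(parents=True, exist_ok=True)
    with (memory_dir / "decisions.md").open("a", encoding="utf-8") as fh:
        fh.write(f"- {date.today().isoformat()}: {text}\n")
```

Reading only the first line of each file keeps the initial overview small; the full file is opened later, if at all.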

Quality Gates and Hooks

Anthropic Recommendation:

"Use pre-commit hooks to enforce code quality standards."

Toolkit Implementation:

  • git-safe-commit wrapper blocks --no-verify (no bypassing quality checks)
  • hooks/ directory for event handlers (pre/post tool use)
  • Example ruff-check-hook.sh demonstrates actionable feedback patterns
  • /system:audit validates framework compliance
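The idea behind a commit wrapper that refuses to skip hooks can be sketched in a few lines of Python. This is an assumption about the general approach, not the actual git-safe-commit implementation: it rejects `--no-verify` and its short form `-n` wherever they appear as a standalone argument, and otherwise returns the `git commit` command to run. It errs on the side of blocking (a commit message that is literally `-n` would be rejected) and does not inspect combined short flags.

```python
from typing import List

# Flags that make `git commit` skip the pre-commit and commit-msg hooks.
BLOCKED_FLAGS = {"--no-verify", "-n"}


def find_blocked_flags(args: List[str]) -> List[str]:
    """Return any arguments that would bypass commit hooks."""
    return [arg for arg in args if arg in BLOCKED_FLAGS]


def safe_commit_command(args: List[str]) -> List[str]:
    """Build the git commit command, refusing hook-bypassing flags."""
    blocked = find_blocked_flags(args)
    if blocked:
        raise ValueError(
            "Refusing to bypass quality checks: " + ", ".join(blocked)
        )
    return ["git", "commit", *args]
```

For example, `safe_commit_command(["-m", "Fix parser"])` returns the command unchanged, while adding `--no-verify` raises a `ValueError` with an explanation the caller can surface.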

Subagent Architecture

Anthropic Recommendation (Subagents):

"Use subagents for complex tasks that benefit from specialized focus... Each agent has its own context window, custom prompt, and can have restricted tool access."

Toolkit Implementation:

  • 5 specialized agents: architect, test-engineer, code-reviewer, auditor, reasoning-specialist
  • Agents invoked via Claude Code's native Task tool
  • Each agent has focused responsibility and appropriate tool permissions
  • Pattern: Match agent specialization to task complexity

Graceful Degradation

Anthropic Recommendation (implicit in MCP documentation):

"All features should work without MCP tools, enhanced when available."

Toolkit Implementation:

  • Every command works without MCP servers
  • MCP enhances core functionality but is never required for it
  • Commands auto-detect MCP availability and adapt
  • Clear documentation of MCP benefits vs. baseline behavior
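One way to picture the fallback logic is a small selection function: prefer an MCP-backed tool when it is present, and otherwise fall back to a baseline that is always available. The Python below is a sketch under assumptions; the tool names in the usage note are illustrative, and the toolkit's commands perform this detection in their own way.

```python
from typing import Iterable, Sequence


def pick_tool(available: Iterable[str], preferred: Sequence[str], baseline: str) -> str:
    """Return the first preferred tool that is available, else the baseline."""
    available_set = set(available)
    for name in preferred:
        if name in available_set:
            return name
    return baseline
```

With a hypothetical Serena tool name, `pick_tool({"Grep", "Read"}, ["mcp__serena__find_symbol"], "Grep")` falls back to `"Grep"`, while the same call with `"mcp__serena__find_symbol"` in the available set selects the MCP tool. The baseline is returned unconditionally, which is what keeps the core workflow usable without any MCP server.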

Why This Alignment Matters

This toolkit evolved through 6+ months of daily Claude Code use, facing the same pain points that Anthropic has been addressing in their platform improvements. Context limits, session boundaries, state persistence, quality degradation—these challenges are inherent to working with LLMs in development workflows.

It's no surprise that our solutions align with Anthropic's recommendations: **we're solving the same
