applied-artificial-intelligence/claude-code-toolkit
Production-tested commands, skills, and workflow patterns for Claude Code. Developed through 6+ months of daily use. Includes explore→plan→next→ship workflow, session handoffs, MCP integrations, and domain skills. Copy what works, adapt to your needs.
Deep Analysis
针对 Claude Code 的生产级工具包,包含 28 条命令、5 个智能体和 6 个领域技能,通过 6 个月的日常使用经验验证
Core Features
Technical Implementation
- 生产验证:6 个月+日常使用经验
- 与 Anthropic 官方最佳实践完全对齐
- 无需 MCP 即可运行,有 MCP 时功能增强
- 渐进式信息披露模式,节省 70%+ token
- Token 限制强制工具(mdtoken)防止文件膨胀
- 5 个专用智能体(架构师、测试工程师、代码审查员等)
- 自动 git 安全防护
- 系统化任务执行:从需求分析到代码交付的完整工作流
- 会话管理:跨上下文窗口边界的无缝工作继续
- 代码质量保证:深度代码分析、审查、测试创建
- 项目持久化记忆:跨会话的项目知识积累
- MCP 集成工作流
- 领域特定开发:量化金融、专业写作等
- 依赖 Claude Code 2.0+
- MCP 服务器为可选项,某些高级功能受限
- 需要 jq、Node.js 等外部依赖
- 不能克服 Claude 的基础行为限制
- 上下文窗口大小仍是核心制约因素
- 复杂工作流的学习曲线相对陡峭
Claude Code Toolkit
Production-tested commands, skills, and patterns for Claude Code

Overview
A collection of plugins, skills, and patterns developed through 6+ months of daily Claude Code use. Copy what works, adapt to your needs.
What's included: 28 commands, 5 agents, and 6 domain skills across 6 core plugins.
Design Goals
- Stateless architecture - Commands execute independently, state persisted in files
- File-based persistence - JSON and Markdown for all state management
- MCP integration - Optional Model Context Protocol tools with graceful degradation
- Progressive disclosure - Load context incrementally to optimize token usage
- Self-containment - All logic inline, no external script dependencies
Key Capabilities
- Workflow management:
explore→plan→next→shippattern - Memory persistence: Cross-session context with automatic reflection
- Quality automation: Git safety, pre/post hooks, compliance auditing
- MCP integration: Optional tools (Serena, Sequential Thinking) with graceful degradation
- Domain adaptation: Examples showing how to extend Claude Code beyond software (quant, writing)
Reality Check: The Limits of Customization
Before diving into the toolkit, understand what you're working with.
The Instruction Training Boundary
Claude's behavior comes from two layers:
| Layer | What It Contains | Can You Change It? |
|---|---|---|
| Instruction training | Core personality, instincts, behavioral drivers | No—immutable |
| Customization | System prompts, CLAUDE.md, agent definitions | Yes—but limited |
No matter how clever your prompting, you're working with Claude's pre-trained personality. You can nudge it, structure it, guide it—but the underlying instincts remain.
Behaviors You'll Encounter
Sycophancy
The tendency to agree, validate, and praise. We developed elaborate anti-sycophancy protocols—they failed. Give contradictory prompts and you'll still get "You're absolutely right!"
Completion Bias
The urge to deliver results regardless of completeness. Claude will proceed without full specifications, filling gaps with assumptions rather than asking. It's trained to deliver, not to pause and question.
Action Over Reflection
Bias toward doing, not critically examining. Tunnel vision on declared goals. Big picture thinking and honest pushback require deliberate prompting.
Context Limitations
The precious context window determines what Claude can "see." Important details get forgotten, overlooked, or deprioritized. Specify too much and things get lost; too little and Claude fills gaps incorrectly.
What This Means for You
Thorough inspection is non-negotiable. Claude can make mistakes in every respect. Never assume output matches expectations—everything requires review, especially things that look correct.
Proper testing is essential. Validate behavior, not just structure. Test edge cases Claude may not have considered. Don't trust "it should work"—verify it does.
The toolkit helps, but doesn't eliminate these behaviors. We provide structure, workflows, and guardrails. Claude's base personality operates within that structure—sometimes it will frustrate you because it just wants to deliver.
Why We're Honest About This
We removed dubious metrics from this toolkit ("80% token reduction," "zero hallucinations") because they weren't validated—and overclaiming doesn't help anyone.
Claude Code is genuinely powerful:
- Amazing at navigating complex codebases
- Excellent at using tools to solve problems
- Incredibly convenient in the terminal environment
But it has real limitations you'll encounter repeatedly. This toolkit provides patterns for working with those constraints rather than pretending they don't exist.
Anthropic Best Practices Alignment
This toolkit implements patterns and recommendations from Anthropic's official Claude Code documentation. It represents "what Anthropic says you should be doing, here implemented."

Multi-Context Window Workflows
Anthropic Recommendation (Claude 4 Best Practices):
"For complex, multi-step projects, we recommend starting with a first context window devoted to designing a structured representation of the project—tests, code stubs, documentation—that can be passed into subsequent context windows... Claude is adept at working across context window resets as long as clear instructions and todo lists are provided."
Toolkit Implementation:
/workflow:explore→/workflow:plan→/workflow:next→/workflow:shipworkflowstate.jsontracks task status across sessions (equivalent to Anthropic'stests.json)/transition:handoff+/transition:continuefor explicit context window transitions- TodoWrite integration for progress tracking across resets
State Tracking with JSON + Unstructured Notes + Git
Anthropic Recommendation:
"JSON provides structured data about what still needs to be done... Unstructured progress notes can be easier for Claude to incrementally update... Git checkpoints let Claude 'save progress' so it has a safe state to roll back to."
Toolkit Implementation:
- JSON:
state.json(task tracking),metadata.json(work unit metadata) - Unstructured:
exploration.md(analysis notes),implementation-plan.md(task breakdown) - Git checkpoints: Automatic commits after each
/workflow:nexttask completion
Progressive Disclosure Pattern
Anthropic Recommendation (Agent Skills):
"Metadata Layer: Skill name and description are pre-loaded... Core Documentation: The full SKILL.md content loads when Claude determines relevance... Reference Materials: Additional linked files load only as needed."
Toolkit Implementation:
- Skills (
skills/) use Anthropic's exact 3-layer disclosure pattern - Plugins load command metadata first, full instructions only when invoked
- Memory (
/memory:*) stores context for retrieval, not constant presence - Result: 70%+ token savings vs. loading everything upfront
Model-Invoked vs User-Invoked Actions
Anthropic Recommendation:
"Skills are model-invoked—Claude autonomously decides when to use them based on your request and the Skill's description. This is different from slash commands, which are user-invoked."
Toolkit Implementation:
- Skills (
skills/): Model-invoked, Claude loads when domain-relevant (RAG, Docker, SQL, etc.) - Commands (
commands/): User-invoked via/commandsyntax - Agents (
agents/): Claude-selected via Task tool based on task complexity - Clear separation between automatic (skills) and explicit (commands) invocation
Memory Tool Patterns
Anthropic Recommendation (Memory Tool Beta):
"ALWAYS VIEW YOUR MEMORY DIRECTORY BEFORE DOING ANYTHING ELSE... Record status/progress/thoughts in your memory. ASSUME INTERRUPTION: Your context window might be reset at any moment."
Toolkit Implementation:
.claude/memory/directory for persistent project knowledge/memory:indexcreates persistent project understanding/memory:memory-reviewdisplays current memory state/transition:handoffsaves critical context before session boundaries- Memory philosophy: Write down key decisions, discard outdated information
Quality Gates and Hooks
Anthropic Recommendation:
"Use pre-commit hooks to enforce code quality standards."
Toolkit Implementation:
git-safe-commitwrapper blocks--no-verify(no bypassing quality checks)hooks/directory for event handlers (pre/post tool use)- Example
ruff-check-hook.shdemonstrates actionable feedback patterns /system:auditvalidates framework compliance
Subagent Architecture
Anthropic Recommendation (Subagents):
"Use subagents for complex tasks that benefit from specialized focus... Each agent has its own context window, custom prompt, and can have restricted tool access."
Toolkit Implementation:
- 5 specialized agents:
architect,test-engineer,code-reviewer,auditor,reasoning-specialist - Agents invoked via Claude Code's native Task tool
- Each agent has focused responsibility and appropriate tool permissions
- Pattern: Match agent specialization to task complexity
Graceful Degradation
Anthropic Recommendation (implicit in MCP documentation):
"All features should work without MCP tools, enhanced when available."
Toolkit Implementation:
- Every command works without MCP servers
- MCP enhances but never required for core functionality
- Commands auto-detect MCP availability and adapt
- Clear documentation of MCP benefits vs. baseline behavior
Why This Alignment Matters
This toolkit evolved through 6+ months of daily Claude Code use, facing the same pain points that Anthropic has been addressing in their platform improvements. Context limits, session boundaries, state persistence, quality degradation—these challenges are inherent to working with LLMs in development workflows.
It's no surprise that our solutions align with Anthropic's recommendations: **we're solving the same
Related Skills
wshobson/agents
wshobsonIntelligent automation and multi-agent orchestration for Claude Code
The most comprehensive Claude Code plugin ecosystem, covering full-stack development scenarios with a three-tier model strategy balancing performance and cost.
ComposioHQ/awesome-claude-skills
ComposioHQA curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
The most comprehensive Claude Skills resource list; connect-apps is a killer feature.
code-yeongyu/oh-my-opencode
code-yeongyuThe Best Agent Harness. Meet Sisyphus: The Batteries-Included Agent that codes like you.
Powerful multi-agent coding tool, but note OAuth limitations.
nextlevelbuilder/ui-ux-pro-max-skill
nextlevelbuilderAn AI SKILL that provide design intelligence for building professional UI/UX multiple platforms
Essential for designers; comprehensive UI/UX knowledge base.
thedotmack/claude-mem
thedotmackA Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
A practical solution for Claude's memory issues.
OthmanAdi/planning-with-files
OthmanAdiClaude Code skill implementing Manus-style persistent markdown planning — the workflow pattern behind the $2B acquisition.
Context engineering best practices; an open-source implementation of Manus mode.

