ahmedasmar/devops-claude-skills

A Claude Code Skills Marketplace for DevOps workflows

License: Unknown | Language: Python | 163

Deep Analysis

A collection of DevOps workflow skills covering six areas: Terraform, Kubernetes, AWS cost optimization, CI/CD, GitOps, and monitoring/observability.

Highlights
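  • One-stop coverage of six DevOps domains
  • AWS cost optimization includes 6 automated analysis scripts that can surface real savings on the first run
  • The GitOps module supports the latest ArgoCD 3.x and Flux 2.7 releases
  • The monitoring module supports the Four Golden Signals and the RED/USE methods
  • Provides production-ready templates (Prometheus alerts, OTel Collector, runbooks, and more)
  • Modern secrets management (SOPS+age, External Secrets Operator)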
Use Cases
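  • Day-to-day operations automation for DevOps/SRE teams
  • AWS bill auditing and cost governance
  • Kubernetes cluster health checks and fault diagnosis
  • GitOps deployment architecture design for multi-environment and multi-cluster setups
  • SLO/SLA calculation and alert rule quality assessment
  • CI/CD pipeline performance optimization and security auditing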
Limitations
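  • Heavily dependent on the AWS ecosystem; support for other cloud providers is limited
  • Each module is installed independently, with no unified cross-domain integration
  • Scripts mainly target CLI operation; GUI/dashboard integration is weak
  • Users need some baseline DevOps knowledge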
Tech Stack
Python, Terraform, Terragrunt, Kubernetes, ArgoCD, Flux, Prometheus, OpenTelemetry, AWS CLI, GitHub Actions, GitLab CI, SOPS, Grafana

DevOps Skills

Community repository of DevOps-focused skills for Claude Code.

Available Skills

iac-terraform

Infrastructure as Code with Terraform and Terragrunt

Use for creating, validating, troubleshooting, and managing Terraform configurations, modules, and state. Includes state inspection tools, module validators, and comprehensive troubleshooting guides.
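
As a rough illustration of the kind of plan inspection this skill is meant to help with, the sketch below groups planned changes by action. It assumes the input is the JSON produced by `terraform show -json <planfile>`; the function name `summarize_plan` and the sample data are hypothetical and are not taken from this repository's scripts.

```python
import json


def summarize_plan(plan_json: str) -> dict:
    """Group resource addresses by planned action, skipping no-ops."""
    plan = json.loads(plan_json)
    grouped = {}
    for change in plan.get("resource_changes", []):
        actions = change["change"]["actions"]
        if actions == ["no-op"]:
            continue
        # A replacement is reported as two actions, e.g. ["delete", "create"].
        key = "+".join(actions)
        grouped.setdefault(key, []).append(change["address"])
    return grouped


# Hypothetical sample shaped like the `resource_changes` section of a JSON plan.
SAMPLE_PLAN = json.dumps({
    "resource_changes": [
        {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
        {"address": "aws_instance.web", "change": {"actions": ["update"]}},
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["delete", "create"]}},
    ]
})

if __name__ == "__main__":
    for action, addresses in summarize_plan(SAMPLE_PLAN).items():
        print(f"{action}: {', '.join(addresses)}")
```

An empty result means the plan proposes no changes; anything else is worth reviewing before applying, whether it comes from drift or from pending configuration edits.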

k8s-troubleshooter

Systematic Kubernetes troubleshooting and incident response

Diagnose pod failures, cluster issues, performance problems, and production incidents. Features cluster health checks, pod diagnostics, and structured incident response playbooks.
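
To make the pod-diagnostics idea concrete, here is a minimal sketch that scans a pod list for containers stuck in CrashLoopBackOff. It assumes the input is the parsed output of `kubectl get pods -o json`; the function name `find_crashlooping` and the sample pods are hypothetical, and the skill's own scripts may work differently.

```python
def find_crashlooping(pod_list: dict) -> list:
    """Return (pod, container, restart_count) for containers in CrashLoopBackOff."""
    findings = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            # `state` holds exactly one of: waiting, running, terminated.
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                findings.append((pod_name, cs["name"], cs.get("restartCount", 0)))
    return findings


# Hypothetical sample shaped like `kubectl get pods -o json` output.
SAMPLE_PODS = {
    "items": [
        {
            "metadata": {"name": "api-7d9f"},
            "status": {"containerStatuses": [
                {"name": "api", "restartCount": 12,
                 "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
            ]},
        },
        {
            "metadata": {"name": "web-5c2a"},
            "status": {"containerStatuses": [
                {"name": "web", "restartCount": 0, "state": {"running": {}}},
            ]},
        },
    ]
}

if __name__ == "__main__":
    for pod, container, restarts in find_crashlooping(SAMPLE_PODS):
        print(f"{pod}/{container}: CrashLoopBackOff after {restarts} restarts")
```

A natural next step for each hit is `kubectl logs <pod> -c <container> --previous`, which shows the output of the last crashed container instance.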

aws-cost-optimization

AWS cost optimization and FinOps workflows

Find unused resources, analyze Reserved Instance opportunities, detect cost anomalies, rightsize instances, evaluate Spot instances, and implement FinOps best practices.

Features:

  • 🔍 6 automated analysis scripts (find waste, analyze RIs, detect old generations, evaluate Spot, rightsize resources, detect anomalies)
  • 📊 Comprehensive reference guides (best practices, service alternatives, FinOps governance)
  • 📝 Monthly cost report template
  • 💰 Proven to find real cost savings on first run
  • ⚡ Full integration with AWS APIs (EC2, RDS, EBS, S3, CloudWatch, Cost Explorer)
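
The "find waste" idea in the list above can be sketched for one common case, unattached EBS volumes. This is an illustration only, not the repository's script: it assumes volume records shaped like the `Volumes` entries returned by the EC2 `describe_volumes` API, and the helper names, the sample data, and the per-GiB price are hypothetical placeholders to replace with real values.

```python
def unattached_volumes(volumes: list) -> list:
    """Keep volumes that are in the 'available' state and have no attachments."""
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def estimated_monthly_cost(volumes: list, price_per_gib_month: float) -> float:
    """Rough monthly cost: total size in GiB times a caller-supplied price."""
    return sum(v["Size"] for v in volumes) * price_per_gib_month


# Hypothetical sample shaped like describe_volumes()["Volumes"].
SAMPLE_VOLUMES = [
    {"VolumeId": "vol-0aaa", "State": "in-use", "Size": 50,
     "Attachments": [{"InstanceId": "i-0123"}]},
    {"VolumeId": "vol-0bbb", "State": "available", "Size": 100, "Attachments": []},
    {"VolumeId": "vol-0ccc", "State": "available", "Size": 20, "Attachments": []},
]

if __name__ == "__main__":
    idle = unattached_volumes(SAMPLE_VOLUMES)
    # 0.08 is a placeholder price, not a quoted AWS rate.
    cost = estimated_monthly_cost(idle, price_per_gib_month=0.08)
    print([v["VolumeId"] for v in idle], f"~${cost:.2f}/month")
```

Keeping the filter separate from the API call makes it easy to test offline; in real use the list would come from a paginated `describe_volumes` call via boto3.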

ci-cd

CI/CD pipeline design, optimization, security, and troubleshooting

Create workflows, optimize build performance, implement caching, secure pipelines, and debug issues across GitHub Actions, GitLab CI, and other platforms.
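
One caching technique that applies across CI platforms is keying the dependency cache on a hash of the lockfile, so the cache is reused until dependencies actually change. The sketch below shows the idea in Python under that assumption; `cache_key` and the prefix format are hypothetical and do not reproduce any platform's built-in function.

```python
import hashlib


def cache_key(prefix: str, lockfile_bytes: bytes) -> str:
    """Build a cache key that changes only when the lockfile content changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    # A truncated digest keeps keys readable while staying collision-resistant
    # enough for cache lookup purposes.
    return f"{prefix}-{digest[:16]}"


if __name__ == "__main__":
    before = cache_key("linux-pip", b"requests==2.31.0\n")
    same = cache_key("linux-pip", b"requests==2.31.0\n")
    after = cache_key("linux-pip", b"requests==2.32.0\n")
    print(before == same)   # identical lockfile, identical key
    print(before == after)  # changed lockfile, different key
```

Including the OS or runtime version in the prefix avoids restoring a cache built for an incompatible environment.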

gitops-workflows

GitOps workflows with ArgoCD and Flux CD

Implement GitOps practices, deploy to multi-cluster environments, manage secrets securely, implement progressive delivery, and troubleshoot sync issues.

Features:

  • 🚀 8 automated Python scripts (health checks for ArgoCD/Flux, repository validation, drift detection, secret auditing, ApplicationSet generation)
  • 📚 8 comprehensive reference guides (ArgoCD vs Flux comparison, repo patterns, secrets management, multi-cluster, progressive delivery, OCI artifacts, best practices, troubleshooting)
  • 📋 Production-ready templates (ArgoCD 3.x install, Flux bootstrap, ApplicationSets, SOPS+age config, Argo Rollouts canary, OCI artifacts)
  • ✨ Updated for ArgoCD 3.x and Flux 2.7 (2024-2025)
  • 🔐 Modern secrets management (SOPS+age, External Secrets Operator, Sealed Secrets)
  • 🌐 Multi-cluster deployment patterns with ApplicationSets
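
A health check of the kind listed above can be approximated by reading the sync and health fields of ArgoCD Application resources. The sketch below assumes input shaped like `kubectl get applications -n argocd -o json`; the function name `unhealthy_apps` and the sample data are hypothetical, and the repository's own scripts may check more than this.

```python
def unhealthy_apps(app_list: dict) -> list:
    """Return (name, sync_status, health_status) for apps needing attention."""
    problems = []
    for app in app_list.get("items", []):
        status = app.get("status", {})
        sync = status.get("sync", {}).get("status", "Unknown")
        health = status.get("health", {}).get("status", "Unknown")
        if sync != "Synced" or health != "Healthy":
            problems.append((app["metadata"]["name"], sync, health))
    return problems


# Hypothetical sample shaped like a list of ArgoCD Application resources.
SAMPLE_APPS = {
    "items": [
        {"metadata": {"name": "payments"},
         "status": {"sync": {"status": "Synced"}, "health": {"status": "Healthy"}}},
        {"metadata": {"name": "checkout"},
         "status": {"sync": {"status": "OutOfSync"}, "health": {"status": "Healthy"}}},
        {"metadata": {"name": "search"},
         "status": {"sync": {"status": "Synced"}, "health": {"status": "Degraded"}}},
    ]
}

if __name__ == "__main__":
    for name, sync, health in unhealthy_apps(SAMPLE_APPS):
        print(f"{name}: sync={sync} health={health}")
```

Treating a missing status as "Unknown" means freshly created applications are reported too, which is usually the safer default for a health check.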

monitoring-observability

Monitoring and observability strategy and implementation

Design metrics systems, implement distributed tracing, create alerts and dashboards, calculate SLOs and error budgets, and choose the right monitoring tools for your needs.

Features:

  • 📊 6 automated analysis scripts (analyze metrics, check alert quality, calculate SLOs, analyze logs, generate dashboards, validate health checks)
  • 📚 Comprehensive reference guides (metrics design, alerting best practices, logging, tracing, SLO/SLA, tool comparison)
  • 📋 Production-ready templates (Prometheus alerts for web apps and Kubernetes, OpenTelemetry collector config, incident runbooks)
  • 🎯 Four Golden Signals, RED/USE methods, OpenTelemetry integration
  • 🔍 Compare monitoring tools (Prometheus, Datadog, ELK, Loki, CloudWatch)
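
The SLO and error-budget calculations mentioned above reduce to simple arithmetic. The sketch below shows one common formulation, assuming an availability SLO expressed as a fraction and a request-based budget; the function names are hypothetical and this is not the repository's SLO script.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo)


def budget_consumed(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the request-based error budget already used (1.0 = exhausted)."""
    allowed_failures = total_requests * (1 - slo)
    if allowed_failures <= 0:
        raise ValueError("an SLO of 100% (or zero traffic) leaves no error budget")
    return failed_requests / allowed_failures


if __name__ == "__main__":
    print(f"{error_budget_minutes(0.999):.1f} minutes of downtime per 30 days")
    used = budget_consumed(0.999, total_requests=1_000_000, failed_requests=250)
    print(f"{used:.0%} of the error budget consumed")
```

For a 99.9% target, a 30-day window has 43,200 minutes, so the budget is about 43.2 minutes of downtime; with one million requests, 1,000 failures are allowed, so 250 failures use roughly a quarter of the budget.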

Installation

Add the marketplace:

/plugin marketplace add https://github.com/ahmedasmar/devops-claude-skills

Install skills:

/plugin install iac-terraform@devops-skills
/plugin install k8s-troubleshooter@devops-skills
/plugin install aws-cost-optimization@devops-skills
/plugin install ci-cd@devops-skills
/plugin install gitops-workflows@devops-skills
/plugin install monitoring-observability@devops-skills

Usage

Once installed, use these skills through Claude Code by describing what you need:

Monitoring & Observability:

  • "Help me set up Prometheus monitoring for my web application"
  • "Create alerts for my service based on SLO best practices"
  • "Calculate my error budget consumption for 99.9% availability"
  • "Design a Grafana dashboard for my Kubernetes cluster"
  • "Should I use Prometheus or Datadog for my startup?"
  • "Implement OpenTelemetry distributed tracing in my Node.js app"
  • "Check the quality of my Prometheus alert rules"

AWS Cost Optimization:

  • "Find unused AWS resources that are costing me money"
  • "Analyze my EC2 instances for Reserved Instance opportunities"
  • "What's the cheapest way to store infrequently accessed data in S3?"
  • "Help me set up a monthly AWS cost review process"
  • "Detect cost anomalies in my AWS spending"

Terraform:

  • "Help me create a reusable Terraform module for VPC"
  • "Review my Terraform state for drift"
  • "Troubleshoot this Terraform error"

Kubernetes:

  • "This pod is in CrashLoopBackOff, help me diagnose it"
  • "Check the health of my Kubernetes cluster"
  • "Help me troubleshoot this deployment"

GitOps:

  • "Set up ArgoCD for my Kubernetes cluster"
  • "Help me design a GitOps repository structure for multi-environment deployments"
  • "My ArgoCD application is OutOfSync, help me troubleshoot"
  • "Implement progressive delivery with canary deployments"
  • "How should I manage secrets in GitOps?"
  • "Set up multi-cluster deployment with Flux"
  • "Should I use ArgoCD or Flux for my platform?"

Contributing

To contribute a new DevOps skill:

  1. Fork this repository
  2. Create a new directory with your skill name (lowercase, hyphenated)
  3. Add a .claude-plugin/plugin.json manifest
  4. Add skills/SKILL.md with proper frontmatter
  5. Update .claude-plugin/marketplace.json to include your skill
  6. Submit a pull request
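
Before opening a pull request, the layout from steps 2 to 4 can be sanity-checked with a small script. The sketch below is an assumption-laden illustration: it treats both paths as relative to the new skill directory, reads "proper frontmatter" as a file starting with a `---` block, and does not validate the manifest's fields. The function name `check_skill_layout` is hypothetical.

```python
import json
import re
from pathlib import Path


def check_skill_layout(skill_dir: Path) -> list:
    """Return a list of layout problems; an empty list means the checks passed."""
    problems = []

    # Step 2: directory name should be lowercase and hyphenated.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", skill_dir.name):
        problems.append(f"directory name '{skill_dir.name}' is not lowercase-hyphenated")

    # Step 3: the manifest must exist and parse as JSON.
    manifest = skill_dir / ".claude-plugin" / "plugin.json"
    if not manifest.is_file():
        problems.append("missing .claude-plugin/plugin.json")
    else:
        try:
            json.loads(manifest.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"plugin.json is not valid JSON: {exc}")

    # Step 4: SKILL.md must exist and open with a frontmatter block.
    skill_md = skill_dir / "skills" / "SKILL.md"
    if not skill_md.is_file():
        problems.append("missing skills/SKILL.md")
    elif not skill_md.read_text(encoding="utf-8").lstrip().startswith("---"):
        problems.append("skills/SKILL.md does not start with a frontmatter block")

    return problems


if __name__ == "__main__":
    import sys

    target = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    issues = check_skill_layout(target.resolve())
    print("\n".join(issues) if issues else "layout looks OK")
```

Run it with the skill directory as the only argument; it does not check that .claude-plugin/marketplace.json was updated, which still needs a manual look.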

License

MIT