Deep Analysis

Automatically converts documentation websites, GitHub repositories, and PDFs into Claude AI skills, completing in minutes what would take hours.

Highly Recommended

Core Features

Supports documentation websites, GitHub repositories, and PDF files

Auto-detects the LLM-friendly llms.txt documentation format, which the README describes as roughly 10x faster than regular scraping

Deep code parsing that detects conflicts between documentation and code

Extracts best examples and key concepts

Technical Implementation

Architecture: Crawler + AST Analysis + AI Enhancement + Packaging

Key Components:

Web Scraper

AST Parser

MCP

Highlights

Complete skill creation in 20-40 minutes instead of hours
Automatically detect conflicts between documentation and code implementation
llms.txt format support for 10x acceleration
Validated with 700+ test cases

Use Cases

Create Claude skills for any framework/API
Convert game engine docs to skills (Godot, Unity)
Merge internal docs + code repos into skills
Discover inconsistencies between documentation and code

Limitations

Requires Python 3.10+
Complex websites may need scraping configuration adjustments

Tech Stack

Python 3.10+, MCP, AST Parser

README

Skill Seeker

Automatically convert documentation websites, GitHub repositories, and PDFs into Claude AI skills in minutes.

📋 View Development Roadmap & Tasks - 134 tasks across 10 categories, pick any to contribute!

What is Skill Seeker?

Skill Seeker is an automated tool that transforms documentation websites, GitHub repositories, and PDF files into production-ready Claude AI skills. Instead of manually reading and summarizing documentation, Skill Seeker:

Scrapes multiple sources (docs, GitHub repos, PDFs) automatically
Analyzes code repositories with deep AST parsing
Detects conflicts between documentation and code implementation
Organizes content into categorized reference files
Enhances with AI to extract best examples and key concepts
Packages everything into an uploadable .zip file for Claude

Result: Get comprehensive Claude skills for any framework, API, or tool in 20-40 minutes instead of hours of manual work.
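
The final packaging step listed above can be pictured with a minimal sketch. This is not Skill Seeker's actual implementation: the function name `package_skill` and the file names in the usage note are hypothetical, and only the standard-library `zipfile` module is used.

```python
import zipfile
from pathlib import Path


def package_skill(skill_dir: str, archive_path: str) -> list:
    """Zip every file under skill_dir, storing paths relative to it.

    Hypothetical helper for illustration; the real tool's packaging
    logic may differ.
    """
    root = Path(skill_dir)
    archive = Path(archive_path).resolve()
    names = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in root.rglob("*"):
            # Skip directories and, if it lives inside skill_dir, the archive itself.
            if not path.is_file() or path.resolve() == archive:
                continue
            arcname = path.relative_to(root).as_posix()
            zf.write(path, arcname)
            names.append(arcname)
    return sorted(names)
```

Calling `package_skill("output/react", "react.zip")` would return the sorted list of archived relative paths, which makes it easy to check what ended up in the upload.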

Why Use This?

🎯 For Developers: Create skills from documentation + GitHub repos with conflict detection
🎮 For Game Devs: Generate skills for game engines (Godot docs + GitHub, Unity, etc.)
🔧 For Teams: Combine internal docs + code repositories into a single source of truth
📚 For Learners: Build comprehensive skills from docs, code examples, and PDFs
🔍 For Open Source: Analyze repos to find documentation gaps and outdated examples

Key Features

🌐 Documentation Scraping

✅ llms.txt Support - Automatically detects and uses LLM-ready documentation files (10x faster)
✅ Universal Scraper - Works with ANY documentation website
✅ Smart Categorization - Automatically organizes content by topic
✅ Code Language Detection - Recognizes Python, JavaScript, C++, GDScript, etc.
✅ 8 Ready-to-Use Presets - Godot, React, Vue, Django, FastAPI, and more
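
As a rough illustration of the llms.txt detection idea, the sketch below builds the URLs a scraper might probe before falling back to page-by-page crawling. The file-name variants and their order are assumptions, not confirmed Skill Seeker behavior, and `llms_txt_candidates` is a hypothetical name.

```python
from urllib.parse import urljoin, urlsplit

# Assumed probe order: the fuller variant first, then the plain index.
LLMS_VARIANTS = ("llms-full.txt", "llms.txt")


def llms_txt_candidates(docs_url: str) -> list:
    """Return candidate llms.txt URLs at the root of the docs site."""
    parts = urlsplit(docs_url)
    site_root = f"{parts.scheme}://{parts.netloc}/"
    return [urljoin(site_root, name) for name in LLMS_VARIANTS]
```

A scraper could request each candidate in turn and use the first one that responds successfully, which avoids crawling individual pages when a single LLM-ready file exists.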

📄 PDF Support (v1.2.0)

✅ Basic PDF Extraction - Extract text, code, and images from PDF files
✅ OCR for Scanned PDFs - Extract text from scanned documents
✅ Password-Protected PDFs - Handle encrypted PDFs
✅ Table Extraction - Extract complex tables from PDFs
✅ Parallel Processing - 3x faster for large PDFs
✅ Intelligent Caching - 50% faster on re-runs
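
One plausible way to get faster re-runs is to key cached extraction results on a hash of the PDF's bytes. The sketch below assumes that approach; the names `file_digest` and `extract_with_cache` are hypothetical, and the `extract` callable stands in for whatever PDF extraction the tool actually performs.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_with_cache(pdf_path, cache_dir, extract):
    """Run extract(pdf_path) once per unique file content.

    The result must be JSON-serializable because it is stored as JSON.
    """
    pdf_path, cache_dir = Path(pdf_path), Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{file_digest(pdf_path)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = extract(pdf_path)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Hashing the contents rather than the file name means a renamed PDF still hits the cache, while an edited PDF is re-extracted.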

🐙 GitHub Repository Scraping (v2.0.0)

✅ Deep Code Analysis - AST parsing for Python, JavaScript, TypeScript, Java, C++, Go
✅ API Extraction - Functions, classes, methods with parameters and types
✅ Repository Metadata - README, file tree, language breakdown, stars/forks
✅ GitHub Issues & PRs - Fetch open/closed issues with labels and milestones
✅ CHANGELOG & Releases - Automatically extract version history
✅ Conflict Detection - Compare documented APIs vs actual code implementation
✅ MCP Integration - Natural language: "Scrape GitHub repo facebook/react"
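
To make the AST-based API extraction concrete, here is a minimal sketch for Python sources using the standard-library `ast` module. It only illustrates the technique; `extract_functions` is a hypothetical name, and the real analyzer covers more languages and more detail.

```python
import ast


def extract_functions(source: str) -> dict:
    """Map each function name to a list of (parameter, annotation) pairs.

    Annotations are returned as source text, or None when absent.
    Only positional parameters are collected in this sketch.
    """
    functions = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions[node.name] = [
                (arg.arg, ast.unparse(arg.annotation) if arg.annotation else None)
                for arg in node.args.args
            ]
    return functions
```

Because the parser never executes the code, this kind of extraction is safe to run on untrusted repositories.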

🔄 Unified Multi-Source Scraping (NEW - v2.0.0)

✅ Combine Multiple Sources - Mix documentation + GitHub + PDF in one skill
✅ Conflict Detection - Automatically finds discrepancies between docs and code
✅ Intelligent Merging - Rule-based or AI-powered conflict resolution
✅ Transparent Reporting - Side-by-side comparison with ⚠️ warnings
✅ Documentation Gap Analysis - Identifies outdated docs and undocumented features
✅ Single Source of Truth - One skill showing both intent (docs) and reality (code)
✅ Backward Compatible - Legacy single-source configs still work
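
Conflict detection between documentation and code can be sketched as a comparison of two signature maps. The structure below is an assumption for illustration: `detect_conflicts` and the conflict kind labels are hypothetical, and each map is taken to go from API name to a list of parameter names.

```python
def detect_conflicts(documented: dict, implemented: dict) -> list:
    """Compare documented signatures against those found in code."""
    conflicts = []
    for name in sorted(documented.keys() | implemented.keys()):
        doc_params = documented.get(name)
        code_params = implemented.get(name)
        if code_params is None:
            kind = "documented_but_missing_in_code"
        elif doc_params is None:
            kind = "implemented_but_undocumented"
        elif doc_params != code_params:
            kind = "signature_mismatch"
        else:
            continue  # docs and code agree
        conflicts.append(
            {"api": name, "kind": kind, "docs": doc_params, "code": code_params}
        )
    return conflicts
```

Keeping both the documented and the implemented parameter lists in each record is what allows a side-by-side report of intent versus reality.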

🤖 Multi-LLM Platform Support (NEW - v2.5.0)

✅ 4 LLM Platforms - Claude AI, Google Gemini, OpenAI ChatGPT, Generic Markdown
✅ Universal Scraping - Same documentation works for all platforms
✅ Platform-Specific Packaging - Optimized formats for each LLM
✅ One-Command Export - --target flag selects platform
✅ Optional Dependencies - Install only what you need
✅ 100% Backward Compatible - Existing Claude workflows unchanged

Platform	Format	Upload	Enhancement	API Key
Claude AI	ZIP + YAML	✅ Auto	✅ Yes	ANTHROPIC_API_KEY
Google Gemini	tar.gz	✅ Auto	✅ Yes	GOOGLE_API_KEY
OpenAI ChatGPT	ZIP + Vector Store	✅ Auto	✅ Yes	OPENAI_API_KEY
Generic Markdown	ZIP	❌ Manual	❌ No	None

# Claude (default - no changes needed!)
skill-seekers package output/react/
skill-seekers upload react.zip

# Google Gemini
pip install skill-seekers[gemini]
skill-seekers package output/react/ --target gemini
skill-seekers upload react-gemini.tar.gz --target gemini

# OpenAI ChatGPT
pip install skill-seekers[openai]
skill-seekers package output/react/ --target openai
skill-seekers upload react-openai.zip --target openai

# Generic Markdown (universal export)
skill-seekers package output/react/ --target markdown
# Use the markdown files directly in any LLM
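
The mapping between `--target` values, archive names, and API keys shown above can be summarized in a small lookup. This is a sketch derived only from the table and commands in this section; `expected_archive` and `missing_api_key` are hypothetical helpers, not part of the skill-seekers API.

```python
from pathlib import Path

# Suffixes copied from the commands above. The markdown archive name is not
# shown in the source, so that target is deliberately left out.
ARCHIVE_SUFFIX = {
    "claude": ".zip",
    "gemini": "-gemini.tar.gz",
    "openai": "-openai.zip",
}

# Environment variable each platform expects, per the table above.
API_KEY_ENV = {
    "claude": "ANTHROPIC_API_KEY",
    "gemini": "GOOGLE_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def expected_archive(skill_dir: str, target: str = "claude") -> str:
    """Archive file name the package step appears to produce for a target."""
    if target not in ARCHIVE_SUFFIX:
        raise ValueError(f"no known archive name for target {target!r}")
    return Path(skill_dir).name + ARCHIVE_SUFFIX[target]


def missing_api_key(target: str, env: dict) -> bool:
    """True when the target needs an API key that is absent from env."""
    key = API_KEY_ENV.get(target)
    return key is not None and not env.get(key)
```

A wrapper script could call `missing_api_key(target, os.environ)` before uploading to fail early with a clear message.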

Installation:

# Install with Gemini support
pip install skill-seekers[gemini]

# Install with OpenAI support
pip install skill-seekers[openai]

# Install with all LLM platforms
pip install skill-seekers[all-llms]

🌊 Three-Stream GitHub Architecture (NEW - v2.6.0)

✅ Triple-Stream Analysis - Split GitHub repos into Code, Docs, and Insights streams
✅ Unified Codebase Analyzer - Works with GitHub URLs AND local paths
✅ C3.x as Analysis Depth - Choose 'basic' (1-2 min) or 'c3x' (20-60 min) analysis
✅ Enhanced Router Generation - GitHub metadata, README quick start, common issues
✅ Issue Integration - Top problems and solutions from GitHub issues
✅ Smart Routing Keywords - GitHub labels weighted 2x for better topic detection
✅ 81 Tests Passing - Comprehensive E2E validation (0.44 seconds)
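
The "labels weighted 2x" idea can be sketched as a simple scoring pass. Everything beyond the 2x weight is an assumption here: the tokenization, the tie-breaking rule, and the name `routing_keywords` are illustrative only.

```python
import re
from collections import Counter

LABEL_WEIGHT = 2  # GitHub labels count double, per the feature list above


def routing_keywords(issue_titles, labels, top_n=5):
    """Rank keywords from issue titles and labels, labels weighted 2x."""
    scores = Counter()
    for title in issue_titles:
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            scores[word] += 1
    for label in labels:
        scores[label.lower()] += LABEL_WEIGHT
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]
```

A real implementation would likely also filter stop words such as "on" or "the" before ranking.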

Three Streams Explained:

Stream 1: Code - Deep C3.x analysis (patterns, examples, guides, configs, architecture)
Stream 2: Docs - Repository documentation (README, CONTRIBUTING, docs/*.md)
Stream 3: Insights - Community knowledge (issues, labels, stars, forks)

from skill_seekers.cli.unified_codebase_analyzer import UnifiedCodebaseAnalyzer

# Analyze GitHub repo with all three streams
analyzer = UnifiedCodebaseAnalyzer()
result = analyzer.analyze(
    source="https://github.com/facebook/react",
    depth="c3x",  # or "basic" for fast analysis
    fetch_github_metadata=True
)

# Access code stream (C3.x analysis)
print(f"Design patterns: {len(result.code_analysis['c3_1_patterns'])}")
print(f"Test examples: {result.code_analysis['c3_2_examples_count']}")

# Access docs stream (repository docs)
print(f"README: {result.github_docs['readme'][:100]}")

# Access insights stream (GitHub metadata)
print(f"Stars: {result.github_insights['metadata']['stars']}")
print(f"Common issues: {len(result.github_insights['common_problems'])}")

See complete documentation: Three-Stream Implementation Summary

🔐 Private Config Repositories (NEW - v2.2.0)

✅ Git-Based Config Sources - Fetch configs from private/team git repositories
✅ Multi-Source Management - Register unlimited GitHub, GitLab, Bitbucket repos
✅ Team Collaboration - Share custom configs across 3-5 person teams
✅ Enterprise Support - Scale to 500+ developers with priority-based resolution
✅ Secure Authentication - Environment variable tokens (GITHUB_TOKEN, GITLAB_TOKEN)
✅ Intelligent Caching - Clone once, pull updates automatically
✅ Offline Mode - Work with cached configs when offline
✅ Backward Compatible - Existing API-based configs still work
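
Priority-based resolution across several registered config sources might look like the sketch below. The data model is hypothetical: `ConfigSource`, the "lower number wins" convention, and both function names are assumptions made for illustration. Only the environment variable names come from the list above.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConfigSource:
    name: str
    priority: int  # assumption: lower number means higher priority
    configs: dict = field(default_factory=dict)
    token_env: Optional[str] = None  # e.g. "GITHUB_TOKEN" or "GITLAB_TOKEN"


def resolve_config(config_name: str, sources: list):
    """Return (source name, config) from the highest-priority source that has it."""
    for source in sorted(sources, key=lambda s: s.priority):
        if config_name in source.configs:
            return source.name, source.configs[config_name]
    raise KeyError(config_name)


def auth_token(source: ConfigSource, env=None) -> Optional[str]:
    """Read the source's token from the environment, never from the config itself."""
    env = os.environ if env is None else env
    return env.get(source.token_env) if source.token_env else None
```

With a team source at priority 1 and a public source at priority 10, a config defined in both resolves to the team version, while configs only the public source has still resolve.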

🤖 Codebase Analysis & AI Enhancement (C3.x - NEW!)

C3.4: Configuration Pattern Extraction with AI Enhancement

✅ 9 Config Formats - JSON, YAML, TOML, ENV, INI, Python, JavaScript, Dockerfile, Docker Compose
✅ 7 Pattern Types - Database, API, logging, cache, email, auth, server configurations
✅ AI Enhancement (NEW!) - Optional dual-mode AI analysis (API + LOCAL, like C3.3)
- Explains what each config does
- Suggests best practices and improvements
- Security analysis - Finds hardcoded secrets, exposed credentials
- Migration suggestions - Consolidation opportunities
- Context-aware documentation
✅ Auto-Documentation - Generates JSON + Markdown documentation of all configs
✅ Type Inference - Automatically detects setting types and environment variables
✅ MCP Integration - extract_config_patterns tool with enhancement support
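
Type inference and hardcoded-secret detection for config values can be approximated with a few regular expressions. This is a heuristic sketch, not the C3.4 implementation: the type labels, the key-name pattern, and the function names are all assumptions.

```python
import re

# Assumed heuristic: key names that usually hold credentials.
SECRET_KEY_PATTERN = re.compile(r"(password|secret|token|api_?key)", re.IGNORECASE)


def infer_type(raw: str) -> str:
    """Guess a setting's type from its raw string value."""
    value = raw.strip()
    if value.lower() in {"true", "false"}:
        return "boolean"
    if re.fullmatch(r"[+-]?\d+", value):
        return "integer"
    if re.fullmatch(r"[+-]?\d+\.\d+", value):
        return "number"
    if re.fullmatch(r"\$\{?[A-Z_][A-Z0-9_]*\}?", value):
        return "env_reference"  # e.g. ${DB_PASSWORD} or $DB_PASSWORD
    return "string"


def find_hardcoded_secrets(settings: dict) -> list:
    """Keys that look like credentials and hold a literal, non-empty value."""
    return sorted(
        key
        for key, value in settings.items()
        if SECRET_KEY_PATTERN.search(key)
        and value.strip()
        and infer_type(value) != "env_reference"
    )
```

Treating environment-variable references as safe keeps the check focused on literal credentials committed to config files.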

C3.3: AI-Enhanced How-To Guides

✅ Comprehensive AI Enhancement - Transforms basic guides (⭐⭐) into professional tutorials (⭐⭐
... (content truncated)

yusufkaraaslan/Skill_Seekers
