Auto-documentation engine using LLMs to generate insightful documentation with architecture diagrams
SourceScribe
An intelligent auto-documentation engine that generates feature-based, process-oriented documentation with extensive visual diagrams.
Powered by LLMs (Claude, OpenAI, Ollama) and designed for developers who want documentation that explains how to USE the system, not just browse source files.
Different from other doc tools: SourceScribe organizes docs by features & workflows with 10+ diagrams, not by individual source files.
Table of Contents
- Key Features
- Installation
- Quick Start
- Configuration
- Usage Examples
- Documentation Philosophy
- Architecture
- How It Works
- Example Output
- Docusaurus Integration
- Development
- Why SourceScribe?
- Roadmap
Key Features
- Feature-Based Documentation: Organizes by capabilities and workflows, not file structure
- Diagram-Rich: Generates 10+ Mermaid diagrams (sequence, flowchart, architecture, class)
- Process-Oriented: Explains "How it Works" with visual workflows
- User-Centric: Written for developers who want to USE the system
- GitHub Permalinks: Automatically links to actual code with line-level precision
- Auto-Sidebar Generation: Automatically generates Docusaurus sidebars.ts, no manual config!
- Multi-LLM Support: Claude (Anthropic), OpenAI (GPT-4), and Ollama
- Real-time Watching: Monitors code changes and auto-regenerates docs
- Multi-language: Supports Python, TypeScript, Java, Go, Rust, and more
- Configurable: Flexible YAML-based configuration with Pydantic models
- Cross-platform: Works on macOS, Linux, and Windows
- GitHub Actions Ready: Works seamlessly in any project's CI/CD pipeline
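The GitHub Permalinks feature links to code with line-level precision. As a rough illustration of what such a link looks like, the sketch below builds a commit-pinned URL in GitHub's standard blob format. The function name, its arguments, and the sample path are hypothetical and are not taken from SourceScribe's code.

```python
from typing import Optional


def github_permalink(org: str, repo: str, commit_sha: str, path: str,
                     start_line: int, end_line: Optional[int] = None) -> str:
    """Build a commit-pinned GitHub URL with a line anchor (hypothetical helper)."""
    url = f"https://github.com/{org}/{repo}/blob/{commit_sha}/{path.lstrip('/')}"
    # A single line uses #L10; a range uses #L10-L20.
    if end_line is None or end_line == start_line:
        return f"{url}#L{start_line}"
    return f"{url}#L{start_line}-L{end_line}"


# Placeholder commit and path, for illustration only.
print(github_permalink("source-scribe", "sourcescribe-core", "abc123",
                       "src/example.py", 10, 20))
```

Pinning to a commit SHA rather than a branch name keeps the link pointing at the same lines after later commits.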
Installation
# Clone the repository
git clone https://github.com/source-scribe/sourcescribe-core.git
cd sourcescribe-core
# Install dependencies
pip install -r requirements.txt
# Or install in development mode
pip install -e .
Quick Start
1. Configure API Keys
Set up your LLM API keys as environment variables:
export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENAI_API_KEY="your-openai-key"
# Ollama runs locally, no key needed
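Before generating, it can help to confirm which providers are actually configured. The following is a minimal sketch that only inspects the two environment variables named above; the helper name is an assumption, and it does not check that a local Ollama server is running.

```python
import os
from typing import List, Mapping, Optional

# Environment variable that each hosted provider's key is exported as (see above).
PROVIDER_KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def usable_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return providers whose API key is set, plus ollama (hypothetical helper)."""
    env = os.environ if env is None else env
    ready = [name for name, var in PROVIDER_KEY_VARS.items()
             if env.get(var, "").strip()]
    ready.append("ollama")  # runs locally, so no key is required
    return ready


print(usable_providers())
```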
2. Initialize a Project
sourcescribe init /path/to/your/project
This creates a .sourcescribe.yaml configuration file.
3. Generate Documentation
# Generate feature-based documentation
sourcescribe generate .
# Specify output directory
sourcescribe generate . --output ./docs/api-reference
# Use specific LLM provider
sourcescribe generate . --provider anthropic --model claude-3-haiku-20240307
# Watch mode (auto-regenerate on changes)
sourcescribe watch .
4. View Your Documentation
SourceScribe generates a feature-based documentation structure:
docs/
├── README.md                    # Navigation hub
├── overview/
│   ├── index.md                 # Project overview
│   ├── architecture.md          # System design + diagrams
│   └── technology-stack.md      # Tech stack
├── getting-started/
│   ├── installation.md          # Setup guide + flowchart
│   ├── quick-start.md           # Tutorial + sequence diagram
│   └── configuration.md         # Config options
├── features/
│   └── index.md                 # Feature documentation + diagrams
└── architecture/
    └── components.md            # Deep dive + multiple diagrams
Configuration
Example .sourcescribe.yaml:
# LLM Provider Configuration
llm:
  provider: "anthropic"  # anthropic, openai, or ollama
  model: "claude-3-5-sonnet-20241022"
  temperature: 0.3
  max_tokens: 4000

# Repository Settings
repository:
  path: "."
  exclude_patterns:
    - "*.pyc"
    - "__pycache__"
    - "node_modules"
    - ".git"
  include_patterns:
    - "*.py"
    - "*.js"
    - "*.ts"
    - "*.java"
    - "*.go"

# Documentation Output
output:
  path: "./docs/generated"
  format: "markdown"
  include_diagrams: true
  diagram_format: "mermaid"

# Watch Mode Settings
watch:
  enabled: true
  debounce_seconds: 2.0
  batch_changes: true

# Documentation Style
style:
  include_examples: true
  include_architecture: true
  include_api_docs: true
  verbosity: "detailed"  # minimal, normal, detailed
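To make the include_patterns and exclude_patterns settings more concrete, here is a small sketch of one plausible way to apply them. It assumes glob-style matching where an exclude pattern may match any path component and takes precedence over includes. SourceScribe's actual matching rules may differ, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_documented(path, include_patterns, exclude_patterns):
    """Decide whether a file would be analyzed (assumed semantics, not the real API)."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # Excludes win: a pattern may match the file name or any directory on the path.
    for pattern in exclude_patterns:
        if any(fnmatchcase(part, pattern) for part in parts):
            return False
    # Otherwise the file name must match at least one include pattern.
    return any(fnmatchcase(parts[-1], pattern) for pattern in include_patterns)


include = ["*.py", "*.js", "*.ts", "*.java", "*.go"]
exclude = ["*.pyc", "__pycache__", "node_modules", ".git"]
print(is_documented("src/app.py", include, exclude))
print(is_documented("node_modules/lib/index.js", include, exclude))
```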
Usage Examples
Generate Documentation with Anthropic Claude
# Using Claude 3 Haiku (fast and economical)
export ANTHROPIC_API_KEY="your-key-here"
sourcescribe generate . --provider anthropic --model claude-3-haiku-20240307
Generate for Docusaurus Site
# Output directly to Docusaurus docs folder
sourcescribe generate . --output ./website/docs/api-reference
Watch Mode with Custom Config
sourcescribe watch --config .sourcescribe.yaml
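The watch section of the configuration exposes debounce_seconds and batch_changes. The sketch below shows the general debouncing idea under the assumption that changes are collected until the file system has been quiet for the debounce window, then handed over as one batch. The class and method names are illustrative and not SourceScribe's implementation; timestamps are passed in explicitly so the behavior is easy to follow.

```python
class ChangeBatcher:
    """Collect changed paths and release them after a quiet period (illustrative)."""

    def __init__(self, debounce_seconds: float = 2.0):
        self.debounce_seconds = debounce_seconds
        self._pending = set()
        self._last_event = None

    def record(self, path: str, now: float) -> None:
        """Register a change; every new event restarts the quiet period."""
        self._pending.add(path)
        self._last_event = now

    def poll(self, now: float) -> list:
        """Return the sorted batch once the quiet period has elapsed, else []."""
        if self._last_event is None or now - self._last_event < self.debounce_seconds:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        self._last_event = None
        return batch


batcher = ChangeBatcher(debounce_seconds=2.0)
batcher.record("a.py", now=0.0)
batcher.record("b.py", now=1.0)
batcher.record("a.py", now=1.5)   # duplicate path, restarts the timer
print(batcher.poll(now=3.0))      # still inside the quiet period
print(batcher.poll(now=3.5))      # quiet for 2.0 s, batch is released
```

Batching this way means a burst of saves triggers one regeneration instead of one per file.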
Use Local Ollama (No API Key Required)
# Install Ollama from https://ollama.ai
ollama serve
ollama pull llama2
sourcescribe generate . --provider ollama --model llama2
Documentation Philosophy
Feature-Based, Not File-Based
SourceScribe generates documentation organized by features and workflows, not individual source files.
Before (File-Based):
- docs/files/sourcescribe_cli.md
- docs/files/sourcescribe_engine_generator.md
- docs/files/sourcescribe_api_anthropic_provider.md
- ... (100+ files)
After (Feature-Based):
- Overview → Architecture Overview → Technology Stack
- Getting Started → Installation → Quick Start → Configuration
- Features → Documentation Generation → LLM Integration
- Architecture → Component Architecture (deep dive)
Diagram-First Approach
Every major section includes visual diagrams:
- Sequence Diagrams: Show workflows and interactions
- Flowcharts: Explain decision trees and processes
- Architecture Diagrams: Visualize system structure
- Class Diagrams: Document data models
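Since the diagrams are emitted as Mermaid text, a flowchart is ultimately just a few lines of markup. The sketch below is a minimal, assumed example of turning nodes and edges into Mermaid flowchart syntax. The helper is hypothetical and does not reflect how SourceScribe's Diagram Generator is written.

```python
def to_mermaid_flowchart(labels, edges, direction="TD"):
    """Render edges as Mermaid flowchart text (hypothetical helper)."""

    def node(node_id):
        # Fall back to the bare id when no label is registered.
        label = labels.get(node_id)
        return f"{node_id}[{label}]" if label else node_id

    lines = [f"flowchart {direction}"]
    lines += [f"    {node(src)} --> {node(dst)}" for src, dst in edges]
    return "\n".join(lines)


print(to_mermaid_flowchart(
    {"A": "Analyze Codebase", "B": "Identify Features", "C": "Build Context"},
    [("A", "B"), ("B", "C")],
))
```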
Architecture
SourceScribe consists of several key components:
- Engine: Core documentation generation with feature-based orchestration
- Feature Generator: Creates process-oriented docs with extensive diagrams
- Watch: File system monitoring and change detection
- API: LLM provider integrations (Anthropic, OpenAI, Ollama)
- Config: Pydantic-based configuration management
- Diagram Generator: Creates Mermaid.js visualizations
- Utils: Code analysis, parsing, and file handling
How It Works
flowchart TD
A[Analyze Codebase] --> B[Identify Features]
B --> C[Build Context]
C --> D{Generate Sections}
D --> E[Overview]
D --> F[Getting Started]
D --> G[Features]
D --> H[Architecture]
E --> I[Create Diagrams]
F --> I
G --> I
H --> I
I --> J[Output Markdown]
- Code Analysis: Analyzes source files to extract structure, dependencies, and patterns
- Feature Identification: Groups functionality by features/capabilities
- Context Building: Builds rich context for each documentation section
- LLM Processing: Generates process-oriented documentation with diagram prompts
- Diagram Generation: Creates 10+ Mermaid.js diagrams throughout docs
- Section Organization: Structures docs by user journey (Overview → Install → Features)
- Output: Writes feature-based markdown with embedded diagrams
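The steps above can be pictured as a small pipeline. The sketch below is a heavily simplified stand-in: it groups files into "features" by top-level directory (the real feature identification is presumably more sophisticated), builds one shared context string, and calls a caller-supplied llm function once per section. All names here are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

SECTIONS = ["Overview", "Getting Started", "Features", "Architecture"]


def identify_features(paths):
    """Group source files by top-level directory (a stand-in heuristic)."""
    features = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        key = parts[0] if len(parts) > 1 else "root"
        features[key].append(path)
    return dict(features)


def generate_docs(paths, llm):
    """Run analyze -> group -> build context -> generate one text per section."""
    features = identify_features(paths)
    context = "\n".join(
        f"{name}: {', '.join(sorted(files))}"
        for name, files in sorted(features.items())
    )
    return {
        section: llm(f"Write the '{section}' section.\n{context}")
        for section in SECTIONS
    }


# A fake LLM that echoes the first prompt line, so the sketch runs offline.
docs = generate_docs(["api/a.py", "engine/g.py", "cli.py"],
                     llm=lambda prompt: prompt.splitlines()[0])
print(docs["Overview"])
```

Passing the LLM in as a plain callable keeps the pipeline independent of any one provider.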
Example Output
Generated Documentation Structure
When you run sourcescribe generate, you get a complete documentation site:
api-reference/
├── README.md                    # Navigation hub with quick links
│
├── overview/
│   ├── index.md                 # Project purpose, users, value props
│   ├── architecture.md          # System design + arch diagram + sequence diagram
│   └── technology-stack.md      # Languages, frameworks, tools
│
├── getting-started/
│   ├── installation.md          # Prerequisites + installation flowchart
│   ├── quick-start.md           # Tutorial + "what happened" sequence diagram
│   └── configuration.md         # All config options in tables
│
├── features/
│   └── index.md                 # Feature docs with process diagrams
│
└── architecture/
    └── components.md            # Deep dive + multiple diagrams
Diagram Examples
Every section includes rich visual diagrams:
Quick Start (Sequence Diagram):
sequenceDiagram
User->>SourceScribe: generate_documentation()
SourceScribe->>Analyzer: Analyze codebase
Analyzer->>LLM: Generate docs
LLM->>SourceScribe: Return documentation
SourceScribe->>User: Display results
Installation (Flowchart):
flowchart TD
Start([Start]) --> Check{Python 3.7+?}
Check -->|No| Install[Install Python]
Check -->|Yes| Clone[Clone Repository]
Install --> Clone
Clone --> Deps[Install Dependencies]
Deps --> Keys[Set API Keys]
Keys --> Config[Create Config]
Config --> Test[Test Installation]
Test --> End([Ready!])
Architecture (Component Diagram): Shows the full system architecture with module dependencies and data flow.
Integration with Docusaurus
SourceScribe works seamlessly with Docusaurus and automatically generates the sidebar configuration!
# Generate docs for Docusaurus
sourcescribe generate . --output ./website/docs/api-reference
# Sidebar is auto-generated! Just build and start
cd website && npm start
Auto-Generated Configuration
SourceScribe automatically configures Docusaurus based on your GitHub repository:
1. Sidebar Generation - Creates sidebars.ts matching your docs structure
2. Config Update - Updates docusaurus.config.ts with your GitHub org/repo
What Gets Updated:
// docusaurus.config.ts
organizationName: 'Source-Scribe', // Auto-detected from GitHub URL
projectName: 'sourcescribe-core', // Auto-detected from GitHub URL
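Auto-detecting these two values amounts to parsing the repository's GitHub remote URL. A minimal sketch, assuming only the common HTTPS and SSH remote forms need to be handled; the function name is hypothetical and the real detection logic may cover more cases.

```python
import re
from typing import Optional, Tuple

# Matches https://github.com/<org>/<repo>[.git] and git@github.com:<org>/<repo>[.git]
_GITHUB_REMOTE = re.compile(
    r"^(?:https://github\.com/|git@github\.com:)"
    r"(?P<org>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def parse_github_remote(url: str) -> Optional[Tuple[str, str]]:
    """Return (organization, project) from a GitHub remote URL, or None."""
    match = _GITHUB_REMOTE.match(url.strip())
    return (match["org"], match["repo"]) if match else None


print(parse_github_remote("https://github.com/source-scribe/sourcescribe-core.git"))
print(parse_github_remote("git@github.com:source-scribe/sourcescribe-core.git"))
```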
Your Docusaurus sidebar will show:
Documentation Home
├─ Overview
│  ├─ Project Overview
│  ├─ Architecture Overview
│  └─ Technology Stack
├─ Getting Started
│  ├─ Installation
│  ├─ Quick Start
│  └─ Configuration
├─ Features
│  └─ All Features
└─ Architecture
   └─ Component Architecture
All Mermaid diagrams render beautifully with zoom support!
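To give a feel for what sidebar generation involves, the sketch below maps a list of generated Markdown paths to a Docusaurus-style list of sidebar items, with one category per folder. It assumes doc IDs are the paths without the .md extension and derives labels from folder names. The function is hypothetical, and the sidebars.ts that SourceScribe writes may be structured differently.

```python
def build_sidebar(doc_paths):
    """Group docs into Docusaurus-style sidebar items (illustrative only)."""
    top_level, categories = [], {}
    for path in doc_paths:
        # Assumed doc ID: the path relative to the docs folder, minus ".md".
        doc_id = path[:-3] if path.endswith(".md") else path
        folder = doc_id.rpartition("/")[0]
        if folder:
            categories.setdefault(folder, []).append(doc_id)
        else:
            top_level.append(doc_id)

    items = list(top_level)
    for folder, doc_ids in categories.items():
        label = folder.replace("-", " ").title()  # "getting-started" -> "Getting Started"
        items.append({"type": "category", "label": label, "items": doc_ids})
    return items


print(build_sidebar([
    "README.md",
    "overview/index.md",
    "overview/architecture.md",
    "getting-started/installation.md",
]))
```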
Use in Other Projects
SourceScribe works in any project's GitHub Actions! See GITHUB_ACTIONS_SETUP.md for complete setup guide.
Quick Example:
# .github/workflows/docs.yml
- name: Install SourceScribe
  run: pip install sourcescribe
- name: Generate Documentation
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    sourcescribe generate . \
      --output ./website/docs/api-reference \
      --provider anthropic \
      --model claude-3-haiku-20240307
What Gets Auto-Generated:
- Feature-based documentation structure
- 10+ Mermaid diagrams
- Docusaurus sidebars.ts (automatic!)
- Docusaurus config updated (organizationName, projectName)
- GitHub permalinks to actual code
- Navigation README
GitHub Pages Deployment
To deploy your documentation to GitHub Pages:
1. Enable GitHub Pages in your repository settings:
   - Go to Settings → Pages
   - Source: Deploy from a branch
   - Branch: gh-pages / (root)
2. Repository Requirements:
   - Public repositories: GitHub Pages is available by default
   - Private repositories: Requires GitHub Pro, Team, or Enterprise plan
   Note: If your repository is private and you're on the Free plan, you'll need to either:
   - Make your repository public, OR
   - Upgrade to GitHub Pro/Team/Enterprise to enable Pages for private repos
3. Automatic Deployment:
   - Once enabled, the .github/workflows/build-docs.yml workflow will automatically deploy on every push to main
   - Your site will be available at: https://[username].github.io/[repo-name]/
Development
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Format code
black sourcescribe/
# Type checking
mypy sourcescribe/
# Linting
ruff check sourcescribe/
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details.
Why SourceScribe?
vs Manual Documentation
- Always up-to-date: Regenerate docs with one command
- Consistent: LLM ensures uniform style and structure
- Comprehensive: Never miss documenting a feature
- Visual: Auto-generates diagrams you'd never draw manually
vs File-Based Tools (JSDoc, Sphinx, etc.)
- Feature-focused: Organized by what users want to do
- Process-oriented: Explains workflows, not just APIs
- User-centric: Written for developers using the system
- Rich diagrams: 10+ visual explanations per project
vs README-only Projects
- Structured: Clear sections with progressive disclosure
- Complete: Installation, features, architecture all covered
- Navigable: Easy to find specific information
- Scalable: Works for projects of any size
Supported LLM Providers
| Provider | Models | API Key Required | Cost |
|---|---|---|---|
| Anthropic | Claude 3 Haiku, Sonnet, Opus | Yes | $$ |
| OpenAI | GPT-4, GPT-4 Turbo | Yes | $$$ |
| Ollama | Llama 2, Mistral, CodeLlama | No (local) | Free |
Recommended: Use Claude 3 Haiku for best balance of speed, quality, and cost.
Roadmap
- Support for more diagram types (state, entity-relationship)
- Custom feature templates
- Multi-language documentation output
- GitHub Actions integration
- VS Code extension
- API documentation from OpenAPI specs
- Incremental regeneration (only changed features)
Acknowledgments
- Inspired by CodeWiki
- Research paper: arXiv:2510.24428v2
- Documentation philosophy inspired by Devin.ai and Stripe Docs