Auto-documentation engine using LLMs to generate insightful documentation with architecture diagrams

SourceScribe

Python 3.7+ | License: MIT

An intelligent auto-documentation engine that generates feature-based, process-oriented documentation with extensive visual diagrams.

Powered by LLMs (Claude, OpenAI, Ollama) and designed for developers who want documentation that explains how to USE the system, not just browse source files.

🎯 Different from other doc tools: SourceScribe organizes docs by features and workflows with 10+ diagrams, not by individual source files.


✨ Key Features

  • 🎯 Feature-Based Documentation: Organizes by capabilities and workflows, not file structure
  • 📊 Diagram-Rich: Generates 10+ Mermaid diagrams (sequence, flowchart, architecture, class)
  • 🔄 Process-Oriented: Explains "How it Works" with visual workflows
  • 🚀 User-Centric: Written for developers who want to USE the system
  • 🔗 GitHub Permalinks: Automatically links to actual code with line-level precision
  • ✨ Auto-Sidebar Generation: Automatically generates Docusaurus sidebars.ts, with no manual config
  • 🤖 Multi-LLM Support: Claude (Anthropic), OpenAI (GPT-4), and Ollama
  • 👁️ Real-time Watching: Monitors code changes and auto-regenerates docs
  • 🌐 Multi-language: Supports Python, TypeScript, Java, Go, Rust, and more
  • ⚙️ Configurable: Flexible YAML-based configuration with Pydantic models
  • 🔄 Cross-platform: Works on macOS, Linux, and Windows
  • 🚢 GitHub Actions Ready: Works seamlessly in any project's CI/CD pipeline
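
As a rough illustration of what a line-level permalink looks like, the sketch below builds one by hand. The function name `github_permalink`, its parameters, the commit SHA, and the file path are all hypothetical and are not SourceScribe's API; only the URL shape (a blob URL pinned to a commit, with an `#L` line anchor) is GitHub's standard format.

```python
from typing import Optional


def github_permalink(org: str, repo: str, commit: str, path: str,
                     start: int, end: Optional[int] = None) -> str:
    """Build a GitHub blob URL pinned to a commit, with a line anchor."""
    # A single line uses #L10; a range uses #L10-L42.
    anchor = f"#L{start}" if end is None or end == start else f"#L{start}-L{end}"
    return f"https://github.com/{org}/{repo}/blob/{commit}/{path.lstrip('/')}{anchor}"


# Hypothetical commit SHA and file path, for illustration only.
print(github_permalink("source-scribe", "sourcescribe-core",
                       "0123abc", "sourcescribe/cli.py", 10, 42))
```

Pinning the URL to a commit rather than a branch keeps the link pointing at the same lines even after the file changes.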

Installation

# Clone the repository
git clone https://github.com/source-scribe/sourcescribe-core.git
cd sourcescribe-core

# Install dependencies
pip install -r requirements.txt

# Or install in development mode
pip install -e .

Quick Start

1. Configure API Keys

Set up your LLM API keys as environment variables:

export ANTHROPIC_API_KEY="your-anthropic-key"
export OPENAI_API_KEY="your-openai-key"
# Ollama runs locally, no key needed
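
Before running a generation, it can help to confirm that the key for the chosen provider is actually set. The following is a minimal sketch, not part of SourceScribe: the helper `missing_key` and the `REQUIRED_KEY` table are hypothetical names, and only the environment variable names come from the exports above.

```python
import os
from typing import Mapping, Optional

# Environment variable per provider, as listed above; Ollama needs none.
REQUIRED_KEY = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "ollama": None,
}


def missing_key(provider: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset variable for `provider`, or None if ready."""
    name = REQUIRED_KEY[provider]  # raises KeyError for unknown providers
    if name is None or env.get(name):
        return None
    return name


for provider in REQUIRED_KEY:
    problem = missing_key(provider)
    print(provider, "ok" if problem is None else f"missing {problem}")
```

An empty string counts as missing here, which catches the common case of an exported but blank variable.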

2. Initialize a Project

sourcescribe init /path/to/your/project

This creates a .sourcescribe.yaml configuration file.

3. Generate Documentation

# Generate feature-based documentation
sourcescribe generate .

# Specify output directory
sourcescribe generate . --output ./docs/api-reference

# Use specific LLM provider
sourcescribe generate . --provider anthropic --model claude-3-haiku-20240307

# Watch mode (auto-regenerate on changes)
sourcescribe watch .

4. View Your Documentation

SourceScribe generates a feature-based documentation structure:

docs/
├── README.md                    # Navigation hub
├── overview/
│   ├── index.md                 # Project overview
│   ├── architecture.md          # System design + diagrams
│   └── technology-stack.md      # Tech stack
├── getting-started/
│   ├── installation.md          # Setup guide + flowchart
│   ├── quick-start.md           # Tutorial + sequence diagram
│   └── configuration.md         # Config options
├── features/
│   └── index.md                 # Feature documentation + diagrams
└── architecture/
    └── components.md            # Deep dive + multiple diagrams

Configuration

Example .sourcescribe.yaml:

# LLM Provider Configuration
llm:
  provider: "anthropic"  # anthropic, openai, or ollama
  model: "claude-3-5-sonnet-20241022"
  temperature: 0.3
  max_tokens: 4000

# Repository Settings
repository:
  path: "."
  exclude_patterns:
    - "*.pyc"
    - "__pycache__"
    - "node_modules"
    - ".git"
  include_patterns:
    - "*.py"
    - "*.js"
    - "*.ts"
    - "*.java"
    - "*.go"

# Documentation Output
output:
  path: "./docs/generated"
  format: "markdown"
  include_diagrams: true
  diagram_format: "mermaid"

# Watch Mode Settings
watch:
  enabled: true
  debounce_seconds: 2.0
  batch_changes: true

# Documentation Style
style:
  include_examples: true
  include_architecture: true
  include_api_docs: true
  verbosity: "detailed"  # minimal, normal, detailed
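
To make the effect of `include_patterns` and `exclude_patterns` concrete, here is a small sketch of one plausible way such patterns could select files. The matching rules (shell-style globs, excludes tested against every path component, includes tested against the file name) are an assumption for illustration; SourceScribe's actual matching may differ, and `is_selected` is a hypothetical helper.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns copied from the example configuration above.
INCLUDE = ["*.py", "*.js", "*.ts", "*.java", "*.go"]
EXCLUDE = ["*.pyc", "__pycache__", "node_modules", ".git"]


def is_selected(path: str, include=INCLUDE, exclude=EXCLUDE) -> bool:
    """True if no path component matches an exclude pattern and the
    file name matches at least one include pattern (assumed semantics)."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    if any(fnmatch(part, pat) for part in parts for pat in exclude):
        return False
    return any(fnmatch(parts[-1], pat) for pat in include)


for candidate in ["src/app.py", "node_modules/lib/index.js", "README.md"]:
    print(candidate, is_selected(candidate))
```

Under these assumed rules, excludes win over includes, so a `.js` file inside `node_modules` is skipped even though `*.js` is included.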

📖 Usage Examples

Generate Documentation with Anthropic Claude

# Using Claude 3 Haiku (fast and economical)
export ANTHROPIC_API_KEY="your-key-here"
sourcescribe generate . --provider anthropic --model claude-3-haiku-20240307

Generate for Docusaurus Site

# Output directly to Docusaurus docs folder
sourcescribe generate . --output ./website/docs/api-reference

Watch Mode with Custom Config

sourcescribe watch --config .sourcescribe.yaml
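
The `debounce_seconds` and `batch_changes` settings suggest that watch mode waits for a quiet period and then handles several changes together. The sketch below shows that general debouncing idea under stated assumptions; `ChangeBatcher` is a hypothetical class, not SourceScribe's watcher, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from typing import List, Optional, Set


class ChangeBatcher:
    """Collect changed paths and release them once no new change has
    arrived for `debounce_seconds`."""

    def __init__(self, debounce_seconds: float = 2.0) -> None:
        self.debounce_seconds = debounce_seconds
        self._pending: Set[str] = set()
        self._last_event: Optional[float] = None

    def record(self, path: str, now: float) -> None:
        """Register a file change observed at time `now` (in seconds)."""
        self._pending.add(path)
        self._last_event = now

    def flush_if_quiet(self, now: float) -> List[str]:
        """Return the sorted batch if the quiet period has elapsed, else []."""
        if self._last_event is None or now - self._last_event < self.debounce_seconds:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        self._last_event = None
        return batch


batcher = ChangeBatcher(debounce_seconds=2.0)
batcher.record("src/a.py", now=0.0)
batcher.record("src/b.py", now=1.5)     # a new change restarts the quiet period
print(batcher.flush_if_quiet(now=3.0))  # only 1.5 s of quiet: nothing released
print(batcher.flush_if_quiet(now=3.5))  # 2.0 s of quiet: both paths released
```

In a real watcher the `now` values would typically come from `time.monotonic()`, and the released batch would trigger a single regeneration instead of one per saved file.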

Use Local Ollama (No API Key Required)

# Install Ollama from https://ollama.ai
ollama serve
ollama pull llama2

sourcescribe generate . --provider ollama --model llama2

📚 Documentation Philosophy

Feature-Based, Not File-Based

SourceScribe generates documentation organized by features and workflows, not individual source files.

Before (File-Based):

❌ docs/files/sourcescribe_cli.md
❌ docs/files/sourcescribe_engine_generator.md
❌ docs/files/sourcescribe_api_anthropic_provider.md
... (100+ files)

After (Feature-Based):

✅ Overview → Architecture Overview → Technology Stack
✅ Getting Started → Installation → Quick Start → Configuration
✅ Features → Documentation Generation → LLM Integration
✅ Architecture → Component Architecture (deep dive)

Diagram-First Approach

Every major section includes visual diagrams:

  • Sequence Diagrams: Show workflows and interactions
  • Flowcharts: Explain decision trees and processes
  • Architecture Diagrams: Visualize system structure
  • Class Diagrams: Document data models

🏗️ Architecture

SourceScribe consists of several key components:

  • Engine: Core documentation generation with feature-based orchestration
  • Feature Generator: Creates process-oriented docs with extensive diagrams
  • Watch: File system monitoring and change detection
  • API: LLM provider integrations (Anthropic, OpenAI, Ollama)
  • Config: Pydantic-based configuration management
  • Diagram Generator: Creates Mermaid.js visualizations
  • Utils: Code analysis, parsing, and file handling

🔄 How It Works

flowchart TD
    A[Analyze Codebase] --> B[Identify Features]
    B --> C[Build Context]
    C --> D{Generate Sections}
    D --> E[Overview]
    D --> F[Getting Started]
    D --> G[Features]
    D --> H[Architecture]
    E --> I[Create Diagrams]
    F --> I
    G --> I
    H --> I
    I --> J[Output Markdown]

  1. Code Analysis: Analyzes source files to extract structure, dependencies, and patterns
  2. Feature Identification: Groups functionality by features/capabilities
  3. Context Building: Builds rich context for each documentation section
  4. LLM Processing: Generates process-oriented documentation with diagram prompts
  5. Diagram Generation: Creates 10+ Mermaid.js diagrams throughout docs
  6. Section Organization: Structures docs by user journey (Overview → Install → Features)
  7. Output: Writes feature-based markdown with embedded diagrams
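
The diagram-generation step can be pictured as turning a small graph into Mermaid source text. The sketch below is only an illustration of that idea and is not SourceScribe's Diagram Generator: `mermaid_flowchart` is a hypothetical helper, and the node ids and labels are borrowed from the flowchart above.

```python
from typing import Dict, Iterable, Tuple


def mermaid_flowchart(nodes: Dict[str, str],
                      edges: Iterable[Tuple[str, str]],
                      direction: str = "TD") -> str:
    """Render node labels and directed edges as Mermaid flowchart source."""
    lines = [f"flowchart {direction}"]
    # Declare each node once with its label, then list the edges by id.
    lines += [f"    {node_id}[{label}]" for node_id, label in nodes.items()]
    lines += [f"    {src} --> {dst}" for src, dst in edges]
    return "\n".join(lines)


nodes = {"A": "Analyze Codebase", "B": "Identify Features", "C": "Build Context"}
print(mermaid_flowchart(nodes, [("A", "B"), ("B", "C")]))
```

Labels containing brackets or quotes would need escaping before being embedded this way; the sketch assumes plain-text labels.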

📸 Example Output

Generated Documentation Structure

When you run sourcescribe generate, you get a complete documentation site:

api-reference/
├── README.md                           # 🏠 Navigation hub with quick links
│
├── overview/
│   ├── index.md                        # Project purpose, users, value props
│   ├── architecture.md                 # 📊 System design + arch diagram + sequence diagram
│   └── technology-stack.md             # Languages, frameworks, tools
│
├── getting-started/
│   ├── installation.md                 # 📋 Prerequisites + installation flowchart
│   ├── quick-start.md                  # 🚀 Tutorial + "what happened" sequence diagram
│   └── configuration.md                # ⚙️ All config options in tables
│
├── features/
│   └── index.md                        # 🎯 Feature docs with process diagrams
│
└── architecture/
    └── components.md                   # 🏗️ Deep dive + multiple diagrams

Diagram Examples

Every section includes rich visual diagrams:

Quick Start (Sequence Diagram):

sequenceDiagram
    User->>SourceScribe: generate_documentation()
    SourceScribe->>Analyzer: Analyze codebase
    Analyzer->>LLM: Generate docs
    LLM->>SourceScribe: Return documentation
    SourceScribe->>User: Display results

Installation (Flowchart):

flowchart TD
    Start([Start]) --> Check{Python 3.7+?}
    Check -->|No| Install[Install Python]
    Check -->|Yes| Clone[Clone Repository]
    Install --> Clone
    Clone --> Deps[Install Dependencies]
    Deps --> Keys[Set API Keys]
    Keys --> Config[Create Config]
    Config --> Test[Test Installation]
    Test --> End([Ready!])

Architecture (Component Diagram): Shows the full system architecture with module dependencies and data flow.

🎨 Integration with Docusaurus

SourceScribe works seamlessly with Docusaurus and automatically generates the sidebar configuration!

# Generate docs for Docusaurus
sourcescribe generate . --output ./website/docs/api-reference

# Sidebar is auto-generated! Just build and start
cd website && npm start

✨ Auto-Generated Configuration

SourceScribe automatically configures Docusaurus based on your GitHub repository:

  1. Sidebar Generation - Creates sidebars.ts matching your docs structure
  2. Config Update - Updates docusaurus.config.ts with your GitHub org/repo

What Gets Updated:

// docusaurus.config.ts
organizationName: 'Source-Scribe',  // Auto-detected from GitHub URL
projectName: 'sourcescribe-core',    // Auto-detected from GitHub URL
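
Auto-detecting these two values amounts to splitting a GitHub remote URL into its organization and repository parts. The following is a minimal sketch of that step under stated assumptions: `parse_github_remote` is a hypothetical helper (not SourceScribe's code), and it only handles the common HTTPS and SSH forms of `github.com` remotes.

```python
import re
from typing import Optional, Tuple

# Matches https://github.com/org/repo(.git) and git@github.com:org/repo(.git)
_GITHUB_REMOTE = re.compile(
    r"^(?:https://github\.com/|git@github\.com:)"
    r"(?P<org>[^/]+)/(?P<repo>[^/]+?)(?:\.git)?/?$"
)


def parse_github_remote(url: str) -> Optional[Tuple[str, str]]:
    """Return (organization, project) for a GitHub remote URL, else None."""
    match = _GITHUB_REMOTE.match(url.strip())
    if match is None:
        return None
    return match.group("org"), match.group("repo")


print(parse_github_remote("https://github.com/source-scribe/sourcescribe-core.git"))
```

The sketch returns the organization exactly as it is written in the URL, so any capitalization in the final config would have to come from the remote itself.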

Your Docusaurus sidebar will show:

Documentation Home
├─ Overview
│  ├─ Project Overview
│  ├─ Architecture Overview
│  └─ Technology Stack
├─ Getting Started
│  ├─ Installation
│  ├─ Quick Start
│  └─ Configuration
├─ Features
│  └─ All Features
└─ Architecture
   └─ Component Architecture
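
A sidebar like this can be derived mechanically from the generated folder layout. The sketch below groups Markdown paths by top-level folder into category items shaped like Docusaurus sidebar entries (`type`, `label`, `items`). It is an illustration only: `build_sidebar` is a hypothetical helper, the doc ids are simplified to "path without `.md`", and the real sidebars.ts written by SourceScribe may use different labels and ids.

```python
import json
from typing import Dict, List, Union

SidebarItem = Union[str, dict]


def build_sidebar(doc_paths: List[str]) -> List[SidebarItem]:
    """Group Markdown paths by top-level folder into category items;
    files at the docs root become plain doc ids."""
    categories: Dict[str, List[str]] = {}
    items: List[SidebarItem] = []
    for path in doc_paths:
        doc_id = path[:-3] if path.endswith(".md") else path
        folder, _, rest = doc_id.partition("/")
        if not rest:  # file at the docs root
            items.append(doc_id)
            continue
        if folder not in categories:
            categories[folder] = []
            items.append({
                "type": "category",
                "label": folder.replace("-", " ").title(),
                "items": categories[folder],
            })
        categories[folder].append(doc_id)
    return items


paths = ["README.md", "overview/index.md", "overview/architecture.md",
         "getting-started/installation.md"]
print(json.dumps(build_sidebar(paths), indent=2))
```

Categories appear in the order their first file is seen, so passing paths in the intended reading order preserves the Overview, Getting Started, Features, Architecture sequence.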

All Mermaid diagrams render beautifully with zoom support!

🚀 Use in Other Projects

SourceScribe works in any project's GitHub Actions! See GITHUB_ACTIONS_SETUP.md for a complete setup guide.

Quick Example:

# .github/workflows/docs.yml
- name: Install SourceScribe
  run: pip install sourcescribe

- name: Generate Documentation
  env:
    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
  run: |
    sourcescribe generate . \
      --output ./website/docs/api-reference \
      --provider anthropic \
      --model claude-3-haiku-20240307

What Gets Auto-Generated:

  • ✅ Feature-based documentation structure
  • ✅ 10+ Mermaid diagrams
  • ✅ Docusaurus sidebars.ts (automatic!)
  • ✅ Docusaurus config updated (organizationName, projectName)
  • ✅ GitHub permalinks to actual code
  • ✅ Navigation README

GitHub Pages Deployment

To deploy your documentation to GitHub Pages:

  1. Enable GitHub Pages in your repository settings:

    • Go to Settings → Pages
    • Source: Deploy from a branch
    • Branch: gh-pages / (root)

  2. Repository Requirements:

    • ✅ Public repositories: GitHub Pages is available by default
    • ⚠️ Private repositories: Requires GitHub Pro, Team, or Enterprise plan

    Note: If your repository is private and you're on the Free plan, you'll need to either:

    • Make your repository public, OR
    • Upgrade to GitHub Pro/Team/Enterprise to enable Pages for private repos

  3. Automatic Deployment:

    • Once enabled, the .github/workflows/build-docs.yml workflow will automatically deploy on every push to main
    • Your site will be available at: https://[username].github.io/[repo-name]/

Development

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Format code
black sourcescribe/

# Type checking
mypy sourcescribe/

# Linting
ruff check sourcescribe/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

🌟 Why SourceScribe?

vs Manual Documentation

  • ✅ Always up-to-date: Regenerate docs with one command
  • ✅ Consistent: LLM ensures uniform style and structure
  • ✅ Comprehensive: Never miss documenting a feature
  • ✅ Visual: Auto-generates diagrams you'd never draw manually

vs File-Based Tools (JSDoc, Sphinx, etc.)

  • ✅ Feature-focused: Organized by what users want to do
  • ✅ Process-oriented: Explains workflows, not just APIs
  • ✅ User-centric: Written for developers using the system
  • ✅ Rich diagrams: 10+ visual explanations per project

vs README-only Projects

  • ✅ Structured: Clear sections with progressive disclosure
  • ✅ Complete: Installation, features, architecture all covered
  • ✅ Navigable: Easy to find specific information
  • ✅ Scalable: Works for projects of any size

🚦 Supported LLM Providers

| Provider | Models | API Key Required | Cost |
|---|---|---|---|
| Anthropic | Claude 3 Haiku, Sonnet, Opus | ✅ Yes | $$ |
| OpenAI | GPT-4, GPT-4 Turbo | ✅ Yes | $$$ |
| Ollama | Llama 2, Mistral, CodeLlama | ❌ No (local) | Free |

Recommended: Use Claude 3 Haiku for the best balance of speed, quality, and cost.

🗺️ Roadmap

  • Support for more diagram types (state, entity-relationship)
  • Custom feature templates
  • Multi-language documentation output
  • GitHub Actions integration
  • VS Code extension
  • API documentation from OpenAPI specs
  • Incremental regeneration (only changed features)
