Convert GitHub README files into structured multi-format documentation artifacts.

📚 OpenDocs

Convert GitHub READMEs, Markdown files, and Jupyter Notebooks into structured, multi-format documentation — instantly.

PyPI  |  License: MIT  |  Python 3.10+  |  Open Source  |  Contributions Welcome  |  PRs Welcome  |  VS Code Extension


#opensource  |  #documentation  |  #markdown  |  #ai  |  #devtools  |  #python  |  #jupyter  |  #llm

What is OpenDocs?

OpenDocs (by ioteverythin) takes a GitHub repository README, local Markdown file, or Jupyter Notebook (.ipynb) and automatically generates beautiful, professional documentation in multiple formats:

| Output | Format | Status |
|---|---|---|
| Technical Report | .docx (Word) | Available |
| Executive Deck | .pptx (PowerPoint) | Available |
| PDF Documentation | .pdf | Available |
| Blog Post | .md (SEO-ready) | Available |
| Jira Tickets | .json (Epic + Stories) | Available |
| Changelog / Release Notes | .md | Available |
| Academic Paper | .tex (LaTeX / IEEE) | Available |
| One-Pager / Datasheet | .pdf (executive) | Available |
| Social Cards | .json (OG + posts) | Available |
| FAQ Document | .md | Available |
| Analysis Report | .md (Markdown) | Available |
| Architecture Diagrams | .mmd + .png (5 views) | Available |
| Mermaid Diagrams | PNG rendering | Available |
| Knowledge Graph | Entity extraction | Available |
| LLM Summaries | Stakeholder views | Available |

What's New in v0.5.0

  • Jupyter Notebook Ingestion -- Parse .ipynb files and convert markdown cells, code cells, and outputs into polished reports
  • Parameterized Report Templates -- Inject project name, author, version, date, and organisation into document headers, footers, and title pages via --config YAML/JSON or CLI flags
  • File Watcher + Auto-PR -- opendocs watch daemon monitors repos for changes and auto-regenerates docs; supports cron mode (--once) and automatic pull requests (--auto-pr)
  • 5 LLM Providers -- OpenAI, Anthropic (Claude), Google (Gemini), Ollama (local), Azure OpenAI
  • 25 Built-in Themes -- 15 original + 10 new modern themes (Aurora, Carbon, Lavender, Graphite, Obsidian, Coral, Zen, Nebula, Sand, Glacier)

Two Engines

1. Pipeline (Deterministic + LLM)

The core pipeline parses Markdown/Notebooks and generates all 11 output formats:

  • Basic mode -- Pure Markdown AST parsing, no LLM required. Fast, free, predictable.
  • LLM mode -- Uses any supported LLM provider to extract entities, build knowledge graphs, and generate executive summaries + stakeholder views (CTO, Investor, Developer).

2. DocAgent (Agentic)

DocAgent is a LangGraph-powered agent that generates 8 enterprise document types (PRD, Proposal, SOP, Report, Slides, Changelog, Onboarding, Tech Debt) from any GitHub repo.

Quick Start

Install from PyPI

pip install opendocs

For LLM features:

pip install "opendocs[llm]"

For all LLM providers:

pip install "opendocs[all-providers]"

For YAML config file support:

pip install "opendocs[templates]"

Install from source

git clone https://github.com/ioteverythin/OpenDocs.git
cd OpenDocs
pip install -e ".[dev,llm]"

Basic Usage

# Generate all formats from a GitHub README
opendocs generate https://github.com/owner/repo

# Generate specific format with a theme
opendocs generate https://github.com/owner/repo --format word --theme aurora

# From a local Markdown file
opendocs generate ./README.md --local

# LLM mode with knowledge graph + stakeholder summaries
opendocs generate ./README.md --local --mode llm --api-key sk-...

# Use Claude instead of OpenAI
opendocs generate ./README.md --local --mode llm --provider anthropic

# List available themes (25 themes)
opendocs themes

Jupyter Notebook Ingestion

Generate polished reports from research notebooks and data-science projects:

# Generate docs from a Jupyter Notebook
opendocs generate ./analysis.ipynb --local

# Generate only Word report from notebook
opendocs generate ./research.ipynb --local --format word --theme carbon

# Exclude cell outputs
opendocs generate ./notebook.ipynb --local --no-outputs

The notebook parser extracts:

  • Markdown cells -- parsed into headings, paragraphs, lists, tables, etc.
  • Code cells -- preserved with language detection and execution count
  • Cell outputs -- text output, images (PNG/SVG/JPEG as data URIs), HTML previews, error tracebacks
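
The extraction described above can be pictured with a stdlib-only sketch. This is not the OpenDocs parser: the `Cell` type and `extract_cells` function are hypothetical names, and only stream and error outputs are handled (images and HTML are left out). The keys it reads (`cells`, `cell_type`, `source`, `execution_count`, `outputs`) come from the standard notebook JSON layout.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cell:
    """One notebook cell, reduced to a few fields (hypothetical type)."""
    kind: str                               # "markdown" or "code"
    source: str                             # cell text joined into one string
    execution_count: Optional[int] = None   # only set for code cells
    outputs: List[str] = field(default_factory=list)


def _join(value) -> str:
    # Notebook JSON stores text either as one string or as a list of lines.
    return value if isinstance(value, str) else "".join(value)


def extract_cells(notebook_json: str, include_outputs: bool = True) -> List[Cell]:
    """Parse .ipynb JSON text into a flat list of Cell records."""
    nb = json.loads(notebook_json)
    cells = []
    for raw in nb.get("cells", []):
        kind = raw.get("cell_type")
        if kind not in ("markdown", "code"):
            continue  # this sketch skips "raw" cells
        cell = Cell(kind=kind, source=_join(raw.get("source", "")))
        if kind == "code":
            cell.execution_count = raw.get("execution_count")
            if include_outputs:
                for out in raw.get("outputs", []):
                    if out.get("output_type") == "stream":
                        cell.outputs.append(_join(out.get("text", "")))
                    elif out.get("output_type") == "error":
                        cell.outputs.append(f"{out.get('ename')}: {out.get('evalue')}")
        cells.append(cell)
    return cells
```

Passing `include_outputs=False` plays the role that `--no-outputs` plays on the command line: code cells are kept, their outputs are dropped.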

Parameterized Report Templates

Inject variables into document headers, footers, and title pages:

# Via CLI flags
opendocs generate ./README.md --local \
  --project-name "My Project" \
  --author "Jane Doe" \
  --doc-version "2.1.0" \
  --org "Acme Corp" \
  --department "Engineering" \
  --confidentiality "Internal"

# Via YAML/JSON config file
opendocs generate ./README.md --local --config opendocs.yaml

Example opendocs.yaml:

project_name: "My Project"
author: "Jane Doe"
version: "2.1.0"
date: "2026-02-28"
organisation: "Acme Corp"
department: "Engineering"
confidentiality: "Internal"
custom:
  reviewer: "John Smith"
  status: "Draft"

These values automatically appear in:

  • Word (.docx) -- document header, footer, and expanded metadata table on title page
  • PowerPoint (.pptx) -- title slide footer with org, author, version, and date
  • PDF -- inherits the same values from the Word generator
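
As a rough illustration of how config values and CLI flags might be combined, the sketch below loads a JSON config and lets explicitly set overrides win. `ReportVars` and `load_vars` are hypothetical names, not the OpenDocs `TemplateVars` API, and the precedence rule (CLI over config) is an assumption. JSON is used so the example needs only the standard library.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ReportVars:
    """Hypothetical stand-in that mirrors the keys of the example config."""
    project_name: str = ""
    author: str = ""
    version: str = ""
    date: str = ""
    organisation: str = ""
    department: str = ""
    confidentiality: str = ""
    custom: dict = field(default_factory=dict)


def load_vars(config_text: str, overrides: Optional[dict] = None) -> ReportVars:
    """Build ReportVars from JSON config text, then apply CLI-style overrides."""
    merged = json.loads(config_text) if config_text.strip() else {}
    for key, value in (overrides or {}).items():
        if value is not None:  # an unset flag must not erase a config value
            merged[key] = value
    known = {f.name for f in fields(ReportVars)}
    unknown = set(merged) - known
    if unknown:
        raise ValueError(f"unknown template keys: {sorted(unknown)}")
    return ReportVars(**merged)
```

Rejecting unknown keys early makes a misspelled key (for example `organization` instead of `organisation`) fail loudly rather than silently produce an empty header field.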

File Watcher + Auto-PR

Monitor a repository for changes and auto-regenerate documentation:

# Continuous watch (checks every 30 seconds)
opendocs watch ./my-repo

# One-shot mode for cron jobs
opendocs watch ./my-repo --once

# Watch + auto-create pull requests
opendocs watch ./my-repo --auto-pr --branch docs-update

# Custom interval and file patterns
opendocs watch ./my-repo --interval 60 --patterns "README.md,docs/*.md,*.ipynb"

Cron integration -- add to crontab for hourly checks:

0 * * * * cd /path/to/repo && opendocs watch . --once --auto-pr

How it works:

  1. Discovers files matching watch patterns (README.md, CHANGELOG.md, docs/**/*.md, *.ipynb)
  2. Computes SHA-256 hashes and compares against saved state (.opendocs-watch-state.json)
  3. If changes are detected, runs the full pipeline for each changed file
  4. If --auto-pr is set, creates a timestamped git branch, commits the outputs, pushes, and opens a PR via GitHub CLI (gh)
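
Steps 1 and 2 can be sketched with the standard library alone. The state file name is taken from the list above; its layout here (a flat mapping of relative path to hex digest) and the function names are assumptions, not the actual OpenDocs format. Unlike the real flow, this sketch saves the new state immediately instead of after the pipeline has run.

```python
import hashlib
import json
from pathlib import Path
from typing import List

STATE_FILE = ".opendocs-watch-state.json"  # name taken from step 2 above


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def detect_changes(repo: Path, patterns: List[str]) -> List[Path]:
    """Return files whose hash differs from the saved state, then save the new state."""
    state_path = repo / STATE_FILE
    old = json.loads(state_path.read_text()) if state_path.exists() else {}
    new = {}
    for pattern in patterns:
        for path in sorted(repo.glob(pattern)):
            if path.is_file():
                new[path.relative_to(repo).as_posix()] = file_digest(path)
    changed = [repo / name for name, digest in new.items() if old.get(name) != digest]
    state_path.write_text(json.dumps(new, indent=2))
    return changed
```

A first call reports every matching file as changed because no state exists yet; a second call with no edits in between returns an empty list.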

Multi-LLM Provider Support

Use any of the 5 supported LLM providers:

# OpenAI (default)
opendocs generate ./README.md --local --mode llm --provider openai --api-key sk-...

# Anthropic Claude
opendocs generate ./README.md --local --mode llm --provider anthropic

# Google Gemini
opendocs generate ./README.md --local --mode llm --provider google

# Ollama (local, no API key needed)
opendocs generate ./README.md --local --mode llm --provider ollama

# Azure OpenAI
opendocs generate ./README.md --local --mode llm --provider azure --base-url https://your-resource.openai.azure.com/

| Provider | Models | Env Variable |
|---|---|---|
| openai | gpt-4o-mini (default), gpt-4o, etc. | OPENAI_API_KEY |
| anthropic | claude-sonnet-4-20250514, claude-3-haiku, etc. | ANTHROPIC_API_KEY |
| google | gemini-1.5-flash (default), gemini-pro, etc. | GOOGLE_API_KEY |
| ollama | llama3.1 (default), any local model | None (local) |
| azure | Any Azure-deployed model | AZURE_OPENAI_API_KEY |
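
The table suggests a simple lookup from provider name to environment variable. The sketch below shows one way such a lookup could work. `resolve_api_key` is a hypothetical helper, and the rule that an explicit key (as with `--api-key`) takes precedence over the environment is an assumption.

```python
import os
from typing import Mapping, Optional

# Environment variable per provider, copied from the table above.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "ollama": None,  # local model, no key needed
    "azure": "AZURE_OPENAI_API_KEY",
}


def resolve_api_key(provider: str,
                    cli_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the API key for a provider, or None when the provider needs none."""
    env = os.environ if env is None else env
    if provider not in ENV_VARS:
        raise ValueError(f"unsupported provider: {provider!r}")
    var = ENV_VARS[provider]
    if var is None:
        return None
    key = cli_key or env.get(var)
    if not key:
        raise RuntimeError(f"set {var} or pass an explicit key for {provider!r}")
    return key
```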

Format Flags Reference

Use -f / --format to generate only what you need:

| Flag | Output | File |
|---|---|---|
| word | Word document | .docx |
| pdf | PDF document | .pdf |
| pptx | PowerPoint deck | .pptx |
| blog | SEO blog post | .md (with front-matter) |
| jira | Jira tickets (Epic + Stories) | .json |
| changelog | Release notes | .md |
| latex | IEEE-style academic paper | .tex |
| onepager | Executive one-pager | .pdf |
| social | Social cards + post text | .json (OG, Twitter, LinkedIn, Reddit) |
| faq | FAQ document | .md |
| architecture | Architecture diagrams (5 views) | .mmd + .png + .md report |
| all | Everything above (default) | all formats |

25 Built-in Themes

| Category | Themes |
|---|---|
| Classic | corporate, ocean, sunset, dark, minimal, emerald, royal |
| Professional | slate, rose, nordic, cyber, terracotta, sapphire, mint, monochrome |
| Modern | aurora, carbon, lavender, graphite, obsidian, coral, zen, nebula, sand, glacier |

# List all themes with color previews
opendocs themes

Python API

from opendocs.pipeline import Pipeline
from opendocs.core.models import OutputFormat
from opendocs.core.template_vars import TemplateVars

# Basic usage
pipeline = Pipeline()
pipeline.run("https://github.com/owner/repo", theme_name="aurora")

# From a Jupyter Notebook with template variables
tvars = TemplateVars(
    project_name="Q4 Analysis",
    author="Data Team",
    version="1.0",
    organisation="Acme Corp",
)
pipeline.run(
    "./notebook.ipynb",
    local=True,
    formats=[OutputFormat.WORD, OutputFormat.PDF],
    template_vars=tvars,
)

# LLM mode with Claude
pipeline.run(
    "./README.md",
    local=True,
    mode="llm",
    api_key="sk-ant-...",
    provider="anthropic",
)

Features

  • 11 Output Formats -- Word, PDF, PPTX, Blog Post, Jira Tickets, Changelog, LaTeX Paper, One-Pager PDF, Social Cards, FAQ, Architecture Diagrams
  • Jupyter Notebook Support -- Parse .ipynb files including markdown cells, code cells, and outputs (images, tables, text)
  • Parameterized Templates -- Inject project name, author, version, org, date into headers/footers via config file or CLI
  • File Watcher + Auto-PR -- Monitor repos for changes, auto-regenerate docs, and create pull requests
  • 5 LLM Providers -- OpenAI, Anthropic (Claude), Google (Gemini), Ollama (local), Azure OpenAI
  • 25 Built-in Themes -- Classic, Professional, and Modern theme categories
  • Smart Table Sorting -- 6 strategies (smart, alpha, numeric, column:N, column:N:desc, none)
  • Knowledge Graph -- Extracts 10+ entity types (projects, technologies, APIs, metrics, etc.)
  • Architecture Diagrams -- 5 auto-generated views: System Architecture (C4-style), Tech Stack Layers, Data Flow, Dependency Tree, Deployment View
  • Mermaid -> PNG -- Renders mermaid diagrams to images via mermaid.ink API
  • LLM Summaries -- Executive summary + CTO / Investor / Developer stakeholder views

Architecture

GitHub URL / Local .md / .ipynb
        |
        v
+-------------------+
|  README Fetch /   |  <-- httpx + GitHub API
|  Notebook Parser  |  <-- .ipynb cell extraction
+--------+----------+
         v
+-------------------+
|  Markdown Parser  |  <-- mistune 3.x AST
+--------+----------+
         v
+-------------------+
|  Template Vars    |  <-- YAML/JSON config or CLI flags
+--------+----------+
         v
+-------------------+
|  Table Sorting    |  <-- 6 strategies
+--------+----------+
         v
+-------------------+
|  KG Extraction    |  <-- Semantic + optional LLM (5 providers)
+--------+----------+
         v
+-------------------+
|  Diagram Renderer |  <-- mermaid.ink API
+--------+----------+
         |
    +----+----+----+----+------+------+-------+------+-----+------+------+
    v    v    v    v    v      v      v       v      v     v      v
  Word  PDF  PPTX  Blog  Jira  Change  LaTeX  1-Pgr  Social  FAQ  Arch
                                 log                              Diag
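
The Diagram Renderer stage relies on the mermaid.ink service. As far as I understand that service, it accepts Mermaid source encoded as URL-safe base64 in the request path. The sketch below only builds such a URL and makes no network call. `mermaid_ink_url` is a hypothetical helper, and the URL layout should be checked against the mermaid.ink documentation before relying on it.

```python
import base64


def mermaid_ink_url(diagram: str, image: bool = True) -> str:
    """Build a mermaid.ink URL for Mermaid source text (no request is sent)."""
    encoded = base64.urlsafe_b64encode(diagram.encode("utf-8")).decode("ascii")
    kind = "img" if image else "svg"  # raster image vs. SVG endpoint
    return f"https://mermaid.ink/{kind}/{encoded}"


if __name__ == "__main__":
    print(mermaid_ink_url("graph LR; A-->B"))
```

Fetching the resulting URL with an HTTP client such as httpx (which the fetch stage already uses) would return the rendered image. The image type returned depends on the service's defaults and options.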

File Watcher Flow

opendocs watch ./repo
        |
        v
  Discover watched files (README.md, *.ipynb, docs/)
        |
        v
  SHA-256 hash each file
        |
        v
  Compare against .opendocs-watch-state.json
        |
        v
  If changed --> Pipeline.run() for each file
        |
        v
  Update state file
        |
        v
  If --auto-pr --> git branch + commit + push + gh pr create
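
The last step of the flow (git branch + commit + push + gh pr create) can be sketched as a list of commands that is built but not executed. The branch naming scheme, commit message, PR body, and the remote name `origin` are all assumptions for illustration, and `auto_pr_commands` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import List, Optional


def auto_pr_commands(prefix: str = "docs-update",
                     now: Optional[datetime] = None) -> List[List[str]]:
    """Return the git / gh commands for one auto-PR run, without running them."""
    now = now or datetime.now(timezone.utc)
    branch = f"{prefix}-{now:%Y%m%d-%H%M%S}"       # timestamped branch name
    message = "docs: regenerate documentation"     # assumed commit message
    return [
        ["git", "checkout", "-b", branch],
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
        ["git", "push", "--set-upstream", "origin", branch],
        ["gh", "pr", "create", "--title", message,
         "--body", "Automated documentation update.", "--head", branch],
    ]
```

Each entry can be passed to `subprocess.run(cmd, check=True, cwd=repo_path)`. Keeping command construction separate from execution makes a dry-run mode and unit tests straightforward.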

Optional Dependencies

pip install "opendocs[llm]"             # OpenAI LLM features
pip install "opendocs[anthropic]"       # Claude support
pip install "opendocs[google]"          # Gemini support
pip install "opendocs[all-providers]"   # All LLM providers
pip install "opendocs[templates]"       # YAML config file support
pip install "opendocs[agents]"          # DocAgent (agentic system)

Development

# Install dev dependencies
pip install -e ".[dev,llm,templates]"

# Run tests
pytest

# Lint
ruff check src/

Contributing

Contributions are welcome! Please open issues and PRs on GitHub.

License

MIT License -- see LICENSE for details.

