
Whole-repo documentation generator (README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt) with deterministic verifier and citation grounding. Not affiliated with facebookresearch/DocAgent (Meta AI, arXiv 2504.08725).


docagent

Repository documentation agent for humans and coding agents.

Not affiliated with facebookresearch/DocAgent (Meta AI, arXiv 2504.08725, ACL 2025). That project is a multi-agent pipeline for generating Python docstrings. This project is a single-agent CLI that generates whole-repository documentation (README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt) with a deterministic verifier and citation-grounded output. See How we differ below.

Why

Repositories accumulate stale READMEs, missing how-to guides, and undocumented APIs faster than humans can maintain them. AI coding assistants also increasingly need their own orientation files (AGENTS.md, CLAUDE.md, llms.txt) alongside the human-facing docs. DocAgent treats documentation as a set of verifiable artifacts scheduled through a DAG: it generates each artifact with an LLM backend, then checks every claim against the actual source in a deterministic-first pipeline. The verifier is built around ground-citation comments (<!-- ground: path:start-end -->), which tie a claim to a file and line range so generated docs stay anchored to real code.
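To make the ground-citation idea concrete, here is a minimal sketch of a deterministic check that every citation comment points at a real file and an in-range line span. It is not DocAgent's actual verifier: the function name check_ground_citations, the regular expression, and the error strings are assumptions made for illustration, and only the comment shape comes from the text above.

```python
import re
import tempfile
from pathlib import Path

# Comment shape quoted above: <!-- ground: path:start-end -->
# The exact grammar DocAgent accepts is an assumption here.
GROUND_RE = re.compile(
    r"<!--\s*ground:\s*(?P<path>[^\s:]+):(?P<start>\d+)-(?P<end>\d+)\s*-->"
)


def check_ground_citations(markdown: str, repo_root: Path) -> list[str]:
    """Return a list of problems; an empty list means every citation resolves."""
    problems = []
    for match in GROUND_RE.finditer(markdown):
        rel_path = match["path"]
        start, end = int(match["start"]), int(match["end"])
        target = repo_root / rel_path
        if not target.is_file():
            problems.append(f"{rel_path}: file not found")
            continue
        line_count = len(target.read_text(encoding="utf-8").splitlines())
        # Line numbers are treated as 1-based and inclusive (an assumption).
        if start < 1 or end < start or end > line_count:
            problems.append(f"{rel_path}:{start}-{end}: outside 1-{line_count}")
    return problems


# Small demonstration against a throwaway repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.py").write_text("x = 1\ny = 2\n", encoding="utf-8")
    doc = "Sets x and y. <!-- ground: app.py:1-2 -->"
    print(check_ground_citations(doc, root))
```

A check like this needs no model call, which is the appeal of putting deterministic gates first: a citation either resolves against the working tree or it does not.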

Install

DocAgent targets Python 3.11+ and is distributed on PyPI as the docagent-ai package. The docagent name itself is blocked by PyPI's similarity policy because of neighbours such as doc-agent and docs-agent; the imported module and the CLI binary are still named docagent.

pip install docagent-ai

For the multi-provider backend (Gemini, OpenRouter, Anthropic-direct via LiteLLM), install the optional multi extra:

pip install 'docagent-ai[multi]'

Provider setup

DocAgent ships two backends. The default is the Claude Agent SDK (agent_sdk), which delegates to your local claude CLI.

# Default backend — uses your existing `claude` CLI session.
docagent init

The opt-in LiteLLM backend routes to Gemini, OpenRouter, Anthropic-direct, or OpenAI based on the --model string and requires the corresponding provider API key:

| Provider | Model string example | Env var |
| --- | --- | --- |
| Anthropic (direct) | anthropic/claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| Google Gemini | gemini/gemini-2.5-flash, gemini/gemini-2.5-pro | GEMINI_API_KEY |
| OpenRouter | openrouter/anthropic/claude-sonnet-4-6 | OPENROUTER_API_KEY |

export GEMINI_API_KEY=...
docagent init --backend litellm --model gemini/gemini-2.5-pro

--backend litellm without --model exits with a multi-line hint listing the supported routing strings.
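The provider table can be read as a lookup from the model string's prefix to the API key it needs. The sketch below shows one way to run that check before starting a job. It is an illustration only: required_env_var and missing_key are hypothetical helpers, not part of DocAgent or LiteLLM, and the mapping contains only the three rows listed in the table.

```python
import os
from typing import Mapping, Optional

# Prefix -> environment variable, copied from the provider table.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def required_env_var(model: str) -> str:
    """Return the env var name for a 'provider/model' routing string."""
    provider, sep, rest = model.partition("/")
    if not sep or not rest:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    try:
        return PROVIDER_ENV_VARS[provider]
    except KeyError:
        raise ValueError(f"unknown provider prefix {provider!r}") from None


def missing_key(model: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset env var, or None if the key is present."""
    name = required_env_var(model)
    return None if environ.get(name) else name


# Passing an explicit mapping keeps the demonstration independent of the shell.
print(missing_key("gemini/gemini-2.5-pro", {}))
print(missing_key("gemini/gemini-2.5-pro", {"GEMINI_API_KEY": "set"}))
```

Only the first path segment selects the provider, so openrouter/anthropic/claude-sonnet-4-6 resolves to OPENROUTER_API_KEY rather than the Anthropic key.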

Quickstart

Installation registers a single docagent console entry point. The Typer app exposes three commands: init, update, and verify.

# Full pass: scan repo, build symbol index, generate all artifacts.
docagent init

# Incremental refresh of artifacts affected by recent changes.
docagent update

# Re-run the deterministic-first verifier against artifacts on disk.
docagent verify

Useful flags on init/update: --dry-run previews diffs without writing, --only <artifact_id> restricts the run to specific artifacts, and --backend litellm --model <provider/model> swaps the default Claude Agent SDK backend for LiteLLM.
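As a rough picture of what a diff preview involves, the sketch below renders a unified diff between an artifact's on-disk text and a regenerated version without writing anything. It uses Python's standard difflib; whether DocAgent's --dry-run produces this exact format is not stated here, and preview_diff is a hypothetical helper.

```python
import difflib


def preview_diff(old_text: str, new_text: str, path: str) -> str:
    """Render a unified diff between on-disk and regenerated content."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    # An empty string means the regenerated text matches what is on disk.
    return "".join(diff)


on_disk = "# Title\nold line\n"
regenerated = "# Title\nnew line\n"
print(preview_diff(on_disk, regenerated, "README.md"))
```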

Architecture

DocAgent is organized into orthogonal packages under docagent/:

  • docagent.cli — Typer-based entry point for init / update / verify; selects a backend and constructs the orchestrator.
  • docagent.artifacts — Each artifact (readme, api_reference, how_to_guides, agents_md, claude_md, llms_txt, python_docstrings) owns its own plan → generate → verify cycle and registers into the DAG via register_v1_builtins.
  • docagent.core — Orchestrator, scanner, diff/state tracking, and the BudgetTracker drive the DAG; CLI imports these directly.
  • docagent.adapters + docagent.parser — Language adapters using libcst/jedi for Python and tree-sitter for Rust/Go/TypeScript/Java/C++ extract symbols and signatures.
  • docagent.index — SQLite-backed symbol/mention store opened via open_store; populated by the scanner during init.
  • docagent.backends — Pluggable LLM backends: agent_sdk (default, Claude Agent SDK) and litellm (multi-provider; --model required).
  • docagent.verify — Deterministic-first verification pipeline backing the verify command and the per-artifact verify phase.
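The relationship between the DAG and the per-artifact plan → generate → verify cycle can be sketched with the standard library's graphlib. The artifact ids are taken from the list above, but the dependency edges, run_cycle, and run_all are hypothetical and chosen only to make the ordering visible; DocAgent's real orchestrator and register_v1_builtins may work differently.

```python
from graphlib import TopologicalSorter

# Each key maps an artifact id to the artifacts it depends on.
# These edges are illustrative assumptions, not DocAgent's real graph.
DEPENDS_ON = {
    "python_docstrings": set(),
    "api_reference": {"python_docstrings"},
    "how_to_guides": {"api_reference"},
    "readme": {"api_reference", "how_to_guides"},
    "agents_md": {"readme"},
    "claude_md": {"readme"},
    "llms_txt": {"readme", "api_reference"},
}


def run_cycle(artifact_id: str) -> list[str]:
    """Stand-in for one artifact's plan -> generate -> verify cycle."""
    return [f"{artifact_id}:{phase}" for phase in ("plan", "generate", "verify")]


def run_all(graph: dict[str, set[str]]) -> list[str]:
    """Run every artifact after its dependencies and return the phase log."""
    log: list[str] = []
    for artifact_id in TopologicalSorter(graph).static_order():
        log.extend(run_cycle(artifact_id))
    return log


for entry in run_all(DEPENDS_ON):
    print(entry)
```

Ordering the artifacts topologically means an artifact is verified before anything that builds on it is planned, so a failure can stop downstream work early.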

Status

Beta. The artifact pipeline runs end-to-end and self-verifies through the deterministic gates; the package version is 1.0.4 and APIs may still change between minor releases.

How we differ

This project shares the name "DocAgent" with Meta AI's facebookresearch/DocAgent (arXiv 2504.08725, ACL 2025 demo) but is a separate, unaffiliated project with a different scope:

| Axis | Meta's DocAgent | This project |
| --- | --- | --- |
| Scope | Python docstrings only, symbol-level | Whole-repo artifacts: README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt, plus docstrings |
| Topology | 5-agent pipeline (Navigator → Reader / Searcher / Writer / Verifier / Orchestrator) | Single-agent orchestrator over the Claude Agent SDK |
| Verification | LLM "Verifier" agent | Deterministic-first gate pipeline (ground-citation validation, markdownlint, structure, secrets) |
| Indexing | In-process dependency graph | Persisted SQLite symbol index (libcst + tree-sitter) |
| Grounding | Hierarchical context build | <!-- ground: path:line-start-line-end --> citation enforcement on every non-trivial claim |
| Distribution | Research repo, clone-and-run | pip install-able CLI: docagent init / update / verify |

If you arrived here looking for the Meta paper's reference implementation, you want facebookresearch/DocAgent.

License

MIT.
