Whole-repo documentation generator (README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt) with deterministic verifier and citation grounding. Not affiliated with facebookresearch/DocAgent (Meta AI, arXiv 2504.08725).
docagent
Repository documentation agent for humans and coding agents.
Not affiliated with facebookresearch/DocAgent (Meta AI, arXiv 2504.08725, ACL 2025). That project is a multi-agent pipeline for generating Python docstrings. This project is a single-agent CLI that generates whole-repository documentation (README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt) with a deterministic verifier and citation-grounded output. See How we differ below.
Why
Repositories accumulate stale READMEs, missing how-to guides, and undocumented APIs faster than humans can maintain them, and AI coding assistants increasingly need their own orientation files (AGENTS.md, CLAUDE.md, llms.txt) alongside the human-facing docs. DocAgent treats documentation as a set of verifiable artifacts driven by a DAG, generates them with an LLM backend, and verifies every claim against the actual source via a deterministic-first pipeline. The verifier is built around ground-citation comments (<!-- ground: path:start-end -->) so generated docs stay anchored to real code.
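As an illustration of what a ground-citation check could look like, the sketch below extracts `<!-- ground: path:start-end -->` comments from a generated document and confirms that each cited file exists and that the line range falls inside it. The regular expression, the `check_citations` name, and the error strings are assumptions made for this example; they are not taken from DocAgent's verifier.

```python
import re
from pathlib import Path

# Hypothetical pattern for the citation comment format quoted above.
GROUND_RE = re.compile(
    r"<!--\s*ground:\s*(?P<path>[^\s:]+):(?P<start>\d+)-(?P<end>\d+)\s*-->"
)


def check_citations(markdown: str, repo_root: Path) -> list[str]:
    """Return one error message per citation that does not resolve."""
    errors = []
    for match in GROUND_RE.finditer(markdown):
        path = match["path"]
        start, end = int(match["start"]), int(match["end"])
        target = repo_root / path
        if not target.is_file():
            errors.append(f"{path}: file not found")
            continue
        # Line numbers are treated as 1-indexed and inclusive (an assumption).
        line_count = len(target.read_text(encoding="utf-8").splitlines())
        if not (1 <= start <= end <= line_count):
            errors.append(f"{path}:{start}-{end}: outside 1-{line_count}")
    return errors
```

A check like this needs no LLM call, which is the property that makes a deterministic first pass cheap to re-run.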
Install
DocAgent targets Python 3.11+ and is distributed on PyPI as the docagent-ai package. The docagent name itself is blocked by PyPI's similarity policy because of neighbours such as doc-agent and docs-agent; the imported module and the CLI binary remain docagent.
pip install docagent-ai
For the multi-provider backend (Gemini, OpenRouter, Anthropic-direct via LiteLLM), install the optional multi extra:
pip install 'docagent-ai[multi]'
Provider setup
DocAgent ships two backends. The default is the Claude Agent SDK (agent_sdk), which delegates to your local claude CLI.
# Default backend — uses your existing `claude` CLI session.
docagent init
The opt-in LiteLLM backend routes to Gemini, OpenRouter, Anthropic-direct, or OpenAI based on the --model string and requires the corresponding provider API key:
| Provider | Model string example | Env var |
|---|---|---|
| Anthropic (direct) | anthropic/claude-sonnet-4-6 | ANTHROPIC_API_KEY |
| Google Gemini | gemini/gemini-2.5-flash, gemini/gemini-2.5-pro | GEMINI_API_KEY |
| OpenRouter | openrouter/anthropic/claude-sonnet-4-6 | OPENROUTER_API_KEY |
export GEMINI_API_KEY=...
docagent init --backend litellm --model gemini/gemini-2.5-pro
--backend litellm without --model exits with a multi-line hint listing the supported routing strings.
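The relationship between a model string and its required API key can be sketched as a prefix lookup. This is a minimal illustration built only from the provider table above; `PROVIDER_ENV`, `required_env_var`, and `missing_key` are hypothetical names, and DocAgent's actual routing and hint text may differ.

```python
import os
from typing import Mapping, Optional

# Provider prefixes and env vars copied from the provider table above.
PROVIDER_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def required_env_var(model: str) -> str:
    """Map a 'provider/model' string to the env var that must be set."""
    provider, sep, rest = model.partition("/")
    if not sep or not rest or provider not in PROVIDER_ENV:
        supported = ", ".join(f"{p}/<model>" for p in sorted(PROVIDER_ENV))
        raise ValueError(
            f"unsupported model string {model!r}; expected one of: {supported}"
        )
    return PROVIDER_ENV[provider]


def missing_key(model: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset env var, or None if the key is present."""
    env = os.environ if env is None else env
    var = required_env_var(model)
    return None if env.get(var) else var
```

Checking the key before the first request lets a CLI fail early with a readable message instead of surfacing a provider authentication error mid-run.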
Quickstart
Installation registers a single docagent console entry point. The Typer app exposes three commands: init, update, and verify.
# Full pass: scan repo, build symbol index, generate all artifacts.
docagent init
# Incremental refresh of artifacts affected by recent changes.
docagent update
# Re-run the deterministic-first verifier against artifacts on disk.
docagent verify
Useful flags on init/update: --dry-run previews diffs without writing, --only <artifact_id> restricts the run to specific artifacts, and --backend litellm --model <provider/model> swaps the default Claude Agent SDK backend for LiteLLM.
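To make the --dry-run behaviour concrete, here is a small sketch of a write step that always computes a unified diff but only touches the file when not in dry-run mode. It uses the standard-library `difflib`; the `write_artifact` function and its signature are assumptions for illustration, not DocAgent internals.

```python
import difflib
from pathlib import Path


def write_artifact(path: Path, new_text: str, dry_run: bool = False) -> str:
    """Return a unified diff; write the file only when dry_run is False."""
    old_text = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "".join(
        difflib.unified_diff(
            old_text.splitlines(keepends=True),
            new_text.splitlines(keepends=True),
            fromfile=f"a/{path.name}",
            tofile=f"b/{path.name}",
        )
    )
    # An empty diff means the content is unchanged, so the write is skipped.
    if diff and not dry_run:
        path.write_text(new_text, encoding="utf-8")
    return diff
```

Returning the diff in both modes keeps the preview and the real run on the same code path, so what a dry run shows is what a real run would write.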
Architecture
DocAgent is organized into orthogonal packages under docagent/:
- docagent.cli: Typer-based entry point for init/update/verify; selects a backend and constructs the orchestrator.
- docagent.artifacts: Each artifact (readme, api_reference, how_to_guides, agents_md, claude_md, llms_txt, python_docstrings) owns its own plan → generate → verify cycle and registers into the DAG via register_v1_builtins.
- docagent.core: Orchestrator, scanner, diff/state tracking, and the BudgetTracker drive the DAG; CLI imports these directly.
- docagent.adapters + docagent.parser: Language adapters using libcst/jedi for Python and tree-sitter for Rust/Go/TypeScript/Java/C++ extract symbols and signatures.
- docagent.index: SQLite-backed symbol/mention store opened via open_store; populated by the scanner during init.
- docagent.backends: Pluggable LLM backends: agent_sdk (default, Claude Agent SDK) and litellm (multi-provider; --model required).
- docagent.verify: Deterministic-first verification pipeline backing the verify command and the per-artifact verify phase.
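The plan → generate → verify cycle driven by a DAG can be pictured with a toy model. The sketch below uses the standard-library `graphlib.TopologicalSorter` to order artifacts by dependency and runs each one through the three phases. The `Artifact` class, the `run_dag` function, and the dependency edges in the usage example are all hypothetical; the real orchestrator, register_v1_builtins, and the artifact interfaces are not shown in this document.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass
class Artifact:
    """Toy artifact with the three phases named above (hypothetical API)."""

    artifact_id: str
    depends_on: tuple[str, ...] = ()

    def plan(self) -> str:
        return f"outline for {self.artifact_id}"

    def generate(self, plan: str) -> str:
        # A real backend would call an LLM here; this stub formats the plan.
        return f"# {self.artifact_id}\n\n{plan}\n"

    def verify(self, text: str) -> bool:
        # Stand-in for a deterministic gate: require a top-level heading.
        return text.startswith("# ")


def run_dag(artifacts: list[Artifact]) -> list[str]:
    """Run every artifact after its dependencies; return the run order."""
    by_id = {a.artifact_id: a for a in artifacts}
    graph = {a.artifact_id: set(a.depends_on) for a in artifacts}
    order = []
    for artifact_id in TopologicalSorter(graph).static_order():
        artifact = by_id[artifact_id]
        text = artifact.generate(artifact.plan())
        if not artifact.verify(text):
            raise RuntimeError(f"{artifact_id} failed verification")
        order.append(artifact_id)
    return order
```

For example, with illustrative edges where the README depends on the API reference and llms.txt depends on the README:

```python
order = run_dag([
    Artifact("readme", depends_on=("api_reference",)),
    Artifact("api_reference"),
    Artifact("llms_txt", depends_on=("readme",)),
])
```

`api_reference` runs first, then `readme`, then `llms_txt`.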
Status
Beta. The artifact pipeline runs end-to-end and self-verifies through the deterministic gates; the package version is 1.0.4 and APIs may still change between minor releases.
How we differ
This project shares the name "DocAgent" with Meta AI's facebookresearch/DocAgent (arXiv 2504.08725, ACL 2025 demo) but is a separate, unaffiliated project with a different scope:
| Axis | Meta's DocAgent | This project |
|---|---|---|
| Scope | Python docstrings only, symbol-level | Whole-repo artifacts: README, AGENTS.md, CLAUDE.md, how-to guides, API reference, llms.txt, plus docstrings |
| Topology | 5-agent pipeline (Navigator → Reader / Searcher / Writer / Verifier / Orchestrator) | Single-agent orchestrator over the Claude Agent SDK |
| Verification | LLM "Verifier" agent | Deterministic-first gate pipeline (ground-citation validation, markdownlint, structure, secrets) |
| Indexing | In-process dependency graph | Persisted SQLite symbol index (libcst + tree-sitter) |
| Grounding | Hierarchical context build | <!-- ground: path:line-start-line-end --> citation enforcement on every non-trivial claim |
| Distribution | Research repo, clone-and-run | pip install-able CLI: docagent init / update / verify |
If you arrived here looking for the Meta paper's reference implementation, you want facebookresearch/DocAgent.
License
MIT.