Compress local documentation context for coding agents.

docmancer

Compress documentation context so coding agents spend tokens on code, not docs.

Docmancer fetches documentation, normalizes it into inspectable sections, indexes those sections with SQLite FTS5, and returns compact context packs with source attribution. The goal is agentic runway: your agent should burn tokens on implementation, tests, and debugging, not on rereading entire documentation sites.

Product shape: an MIT-licensed CLI on PyPI. You point it at a docs URL or local path with add, it indexes sections into a local SQLite database, and your coding agent calls docmancer query through an installed skill. There is no hosted query API, no servers, and no API keys on the core path. An optional benchmarking harness (docmancer bench) compares retrieval backends (SQLite FTS, Qdrant vector, RLM) on your own corpus.

In a typical agentic coding session, raw docs pages can consume 30 to 40 percent of the context window. Docmancer compresses that overhead by 60 to 90 percent, so the agent stays sharp longer, runs more iterations before context degradation, and produces more output per session.

Quickstart

pipx install docmancer --python python3.13

docmancer setup
docmancer add https://docs.pytest.org
docmancer query "How do I use fixtures?"

setup creates ~/.docmancer/docmancer.yaml, initializes ~/.docmancer/docmancer.db, and installs detected agent skills. Use setup --all for non-interactive installation across all supported agents.

What It Does

- Fetches docs from URLs, GitHub repos, or local paths and indexes them locally with SQLite FTS5.
- Needs no vector database, no embedding model downloads, and no external API calls on the core path.
- Stores normalized sections in SQLite and writes extracted markdown/json files under .docmancer/extracted/ for inspection.
- Supports GitBook, Mintlify, generic web crawl, GitHub markdown, local directories, and plain text/markdown files.
- Returns compact context packs with estimated token savings and source attribution.
- Optionally benchmarks retrieval: docmancer bench compares FTS, Qdrant vector, and RLM backends on the same dataset with reproducible artifacts.
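
The index-then-query flow listed above can be illustrated with Python's built-in sqlite3 module. This is a minimal sketch, not docmancer's actual schema: the table name, the column names, and the build_index and search helpers are assumptions made for illustration, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3


def build_index(sections):
    """Build an in-memory FTS5 index from (source, heading, body) tuples."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: `source` is kept for attribution only and is
    # excluded from full-text matching via UNINDEXED.
    conn.execute(
        "CREATE VIRTUAL TABLE sections USING fts5(source UNINDEXED, heading, body)"
    )
    conn.executemany(
        "INSERT INTO sections (source, heading, body) VALUES (?, ?, ?)", sections
    )
    conn.commit()
    return conn


def search(conn, query, limit=3):
    """Return (source, heading) pairs, best BM25 match first."""
    rows = conn.execute(
        "SELECT source, heading FROM sections "
        "WHERE sections MATCH ? ORDER BY bm25(sections) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    index = build_index(
        [
            ("docs/fixtures.md", "Fixtures", "Fixtures provide a fixed baseline for tests."),
            ("docs/cli.md", "Command line", "Run pytest from the command line."),
        ]
    )
    print(search(index, "fixtures"))
```

Returning the source column with each hit is what makes source attribution possible in the resulting context pack.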

Benchmark retrieval backends

docmancer bench is a local harness for comparing retrieval backends on your own docs. FTS ships in the core install; Qdrant and RLM are experimental and behind optional extras.

Zero-config benchmark (recommended for first run)

The fastest way to see docmancer bench work end to end is the built-in Lenny dataset: 30 hand-authored questions grounded in Lenny Rachitsky's public newsletter and podcast starter pack. The corpus (about 24 MB) is fetched on first use from LennysNewsletter/lennys-newsletterpodcastdata and cached under ~/.docmancer/bench/corpora/lenny/. Subsequent invocations reuse the cache and make zero network calls; pass --refresh if you ever want to pull an updated copy. The corpus is licensed by Lenny Rachitsky for personal, non-commercial use; you accept that license interactively on first fetch.

docmancer bench init
docmancer bench dataset use lenny
docmancer bench run --backend fts --dataset lenny --run-id lenny_fts
docmancer bench report lenny_fts

Benchmarking your own docs with LLM-generated questions

Point bench dataset create at any folder of markdown and docmancer asks an LLM to produce grounded questions with expected answers, source files, and a mix of easy, medium, and hard difficulties.

docmancer bench dataset create \
  --from-corpus ./my-docs --size 30 --name mydocs --provider auto
docmancer bench run --backend fts --dataset mydocs --run-id mydocs_fts

--provider auto picks the first configured provider in the order Anthropic, OpenAI, Gemini, Ollama. Supported providers and the env vars they use:

Provider	Env var	Install
Anthropic	`ANTHROPIC_API_KEY`	`pipx inject docmancer 'docmancer[llm]'`
OpenAI	`OPENAI_API_KEY`	`pipx inject docmancer 'docmancer[llm]'`
Gemini	`GEMINI_API_KEY`	`pipx inject docmancer 'docmancer[llm]'`
Ollama	(none; `OLLAMA_HOST` optional)	`ollama serve` locally

Pass --provider heuristic for a no-key shallow fallback that derives one question per markdown heading. Running with --provider auto and no key set exits with an actionable setup message rather than silently producing shallow questions.
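
The selection order described above can be sketched as a small function. This is an illustration under stated assumptions rather than docmancer's implementation: pick_provider and the ollama_available flag are hypothetical, since how docmancer decides that Ollama is configured is not documented here.

```python
import os

# Order and env var names come from the provider table above.
PROVIDER_ENV_VARS = [
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
]


def pick_provider(env=None, ollama_available=False):
    """Return the first configured provider, mirroring `--provider auto`.

    `ollama_available` is a hypothetical flag supplied by the caller;
    Ollama needs no API key, so it cannot be detected from env vars alone.
    """
    env = os.environ if env is None else env
    for name, var in PROVIDER_ENV_VARS:
        if env.get(var):
            return name
    if ollama_available:
        return "ollama"
    # Fail loudly instead of silently falling back to shallow questions.
    raise RuntimeError(
        "No LLM provider configured. Set ANTHROPIC_API_KEY, OPENAI_API_KEY, "
        "or GEMINI_API_KEY, start Ollama, or pass --provider heuristic."
    )
```

Raising an error when nothing is configured matches the documented behavior of exiting with a setup message instead of degrading quietly.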

Running and comparing

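# Optional experimental backends. Install the extras up front so pipx
# records them for the docmancer app. The RLM extra depends on `rlms`.
# If docmancer is already installed via pipx, add --force or use
# pipx inject (see Optional Extras); otherwise pipx skips the extras.
pipx install 'docmancer[vector,rlm,judge]' --python python3.13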

docmancer bench run --backend qdrant --dataset lenny --run-id lenny_qdrant
docmancer bench run --backend rlm    --dataset lenny --run-id lenny_rlm
docmancer bench compare lenny_fts lenny_qdrant lenny_rlm
docmancer bench list

Every run writes config.snapshot.yaml, retrievals.jsonl, answers.jsonl, metrics.json, and report.md under .docmancer/bench/runs/<run_id>/. A content-hashed ingest_hash guards against comparing runs across drifted corpora. All backends see the same canonical section chunks so metrics are apples-to-apples. See wiki/Commands.md for the full command list and wiki/Configuration.md for tunables.
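
One way a content-hashed guard like ingest_hash could work is sketched below. The hashing scheme, the run dictionary shape, and the assert_comparable helper are assumptions for illustration; the actual format docmancer writes is not specified in this document.

```python
import hashlib


def ingest_hash(sections):
    """Hash (path, text) pairs in a stable order so equal corpora hash equally."""
    digest = hashlib.sha256()
    for path, text in sorted(sections):
        for part in (path, text):
            data = part.encode("utf-8")
            # A length prefix keeps ("ab", "c") distinct from ("a", "bc").
            digest.update(len(data).to_bytes(8, "big"))
            digest.update(data)
    return digest.hexdigest()


def assert_comparable(run_a, run_b):
    """Refuse to compare runs whose corpora have drifted apart."""
    if run_a["ingest_hash"] != run_b["ingest_hash"]:
        raise ValueError(
            f"Runs {run_a['run_id']} and {run_b['run_id']} were built from "
            "different corpora; re-run one of them before comparing."
        )
```

Sorting before hashing makes the result independent of file discovery order, so only a real content change produces a different hash.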

Legacy .docmancer/eval_dataset.json files are accepted read-only; convert them with docmancer bench dataset create --from-legacy <path>.

Commands

Command	What it does
`docmancer setup`	Create config/database and install detected agent skills
`docmancer setup --all`	Non-interactively install all supported agent integrations
`docmancer add <url-or-path>`	Fetch or read documentation and index normalized sections
`docmancer update`	Re-fetch and re-index all existing docs sources
`docmancer query <text>`	Return a compact markdown context pack
`docmancer query <text> --format json`	Return the same context pack as JSON
`docmancer query <text> --expand`	Include adjacent sections around matches
`docmancer query <text> --expand page`	Include the full matching page, subject to the token budget
`docmancer list`	List indexed docsets or sources
`docmancer inspect`	Show SQLite index stats and extract locations
`docmancer remove <source>`	Remove a source or docset root
`docmancer remove --all`	Remove everything indexed (keeps the config)
`docmancer doctor`	Check config, SQLite FTS5, index stats, and agent skill installs
`docmancer fetch <url> --output <dir>`	Download docs to markdown files without indexing
`docmancer init`	Create a project-local `docmancer.yaml`
`docmancer install <agent>`	Manual skill installation for a single agent
`docmancer bench ...`	Benchmarking harness (see the section above)
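
For scripts that call the CLI, the query variants in the table above can be assembled programmatically. The sketch below uses only flags listed in the table; the build_query_command and run_query helpers are hypothetical, and the JSON output schema is not documented here, so the output is returned as raw text.

```python
import subprocess


def build_query_command(text, fmt=None, expand=None):
    """Build the argv list for `docmancer query` from the documented flags."""
    cmd = ["docmancer", "query", text]
    if fmt is not None:
        cmd.extend(["--format", fmt])  # e.g. "json"
    if expand is True:
        cmd.append("--expand")  # adjacent sections around matches
    elif expand:
        cmd.extend(["--expand", expand])  # e.g. "page"
    return cmd


def run_query(text, **options):
    """Run the query and return stdout; assumes docmancer is on PATH."""
    result = subprocess.run(
        build_query_command(text, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the question contains quotes or punctuation.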

Retrieval Shape

By default, query uses a 2400-token budget and returns markdown with a summary like:

Context pack: ~900 tokens vs ~4800 raw docs tokens (81.2% less docs overhead, 5.33x agentic runway)

The savings are estimates, but the direction is explicit: compress docs overhead so the remaining token budget goes into useful agent work.
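
The arithmetic behind that summary line, together with a simple way to fill a token budget, can be sketched as follows. The four-characters-per-token estimate, the greedy packing strategy, and the function names are assumptions for illustration; docmancer's own estimator and packing logic may differ.

```python
def estimate_tokens(text):
    """Crude estimate: roughly four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def pack_sections(ranked_sections, budget=2400):
    """Greedily keep sections, best first, while they fit in the budget."""
    packed, used = [], 0
    for section in ranked_sections:
        cost = estimate_tokens(section)
        if used + cost > budget:
            continue  # too large for the remaining budget; try the next one
        packed.append(section)
        used += cost
    return packed, used


def savings_summary(pack_tokens, raw_tokens):
    """Return (percent less docs overhead, runway multiplier)."""
    percent_saved = (1 - pack_tokens / raw_tokens) * 100
    runway = raw_tokens / pack_tokens
    return percent_saved, runway


if __name__ == "__main__":
    saved, runway = savings_summary(900, 4800)
    print(f"{saved:.2f}% less docs overhead, {runway:.2f}x agentic runway")
```

With the figures from the sample summary, 900 pack tokens against 4800 raw tokens works out to 81.25 percent less overhead and a runway multiplier of about 5.33.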

Workflow

# 1. Add the docs your agent should see
docmancer add https://docs.pytest.org
docmancer add ./docs

# 2. Install a skill into your agent
docmancer install claude-code

# 3. Query from the CLI or from the agent
docmancer query "How do I use fixtures?"

All agents you install share the same local SQLite index.

Keeping Docs Up To Date

Run docmancer update to refresh all locally added sources. Docmancer re-fetches each URL or re-reads each local path and updates the index in place.

Project-Local Config

Global config is stored under ~/.docmancer/ by default. To use a project-local index:

docmancer init
docmancer add ./docs

The generated docmancer.yaml points to .docmancer/docmancer.db and .docmancer/extracted inside the project. If no project config exists, docmancer falls back to the global config.
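
The fallback rule can be expressed in a few lines. This is a sketch under assumptions: resolve_config is a hypothetical helper, and it checks only the given directory, whereas docmancer may also search parent directories.

```python
from pathlib import Path


def resolve_config(project_dir, home=None):
    """Prefer a project-local docmancer.yaml, else the global config path."""
    home = Path.home() if home is None else Path(home)
    local = Path(project_dir) / "docmancer.yaml"
    if local.is_file():
        return local
    # Global default location described above.
    return home / ".docmancer" / "docmancer.yaml"
```

Calling resolve_config(".") from a project root returns the local file when docmancer init has been run there and the global path otherwise.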

A bench: block can override bench paths and defaults:

index:
  db_path: .docmancer/docmancer.db
  extracted_dir: .docmancer/extracted/

bench:
  datasets_dir: .docmancer/bench/datasets
  runs_dir: .docmancer/bench/runs
  backends:
    k_retrieve: 10
    k_answer: 5

Legacy eval: blocks are translated automatically with a deprecation warning.

Supported Agents

setup detects common agent installations. Manual installation remains available:

docmancer install claude-code
docmancer install claude-desktop
docmancer install codex
docmancer install cursor
docmancer install cline
docmancer install gemini
docmancer install github-copilot
docmancer install opencode

Claude Desktop receives a zip package that can be uploaded through Claude Desktop's Skills UI.

Optional Extras

Extra	Enables
`docmancer[browser]`	Playwright-backed fetcher for JS-heavy sites
`docmancer[crawl4ai]`	Alternative fetcher for hard-to-scrape sites
`docmancer[vector]`	Qdrant vector backend for `docmancer bench`
`docmancer[rlm]`	RLM backend for `docmancer bench` (`rlms`)
`docmancer[judge]`	LLM-as-judge answer scoring via ragas
`docmancer[llm]`	LLM-powered question generation for `bench dataset create` (Anthropic, OpenAI, Gemini)
`docmancer[ragas]`	Deprecated alias for `[judge]`; will be removed in the next minor release

Fresh install with extras (recommended):

pipx install 'docmancer[vector,rlm,judge]' --python python3.13

The rlm extra resolves to the PyPI distribution rlms, which imports as rlm at runtime.

Note: if docmancer is already installed via pipx, the command above is a no-op: pipx prints "already seems to be installed" and does not re-evaluate extras. In that case, follow "Adding extras to an existing pipx install" below.

Adding extras to an existing pipx install (pipx won't re-read extras on a second pipx install; inject the deps into the existing venv instead):

pipx inject docmancer 'qdrant-client>=1.7.0' 'fastembed>=0.2.0'   # [vector]
pipx inject docmancer 'rlms>=0.1.0'                               # [rlm]
pipx inject docmancer 'ragas>=0.2.0'                              # [judge]

Or reinstall with pipx install 'docmancer[...]' --force --python python3.13. Plain pip users can install any combination directly: pip install 'docmancer[vector,rlm,judge]'.

Release history

Version	Released
0.4.9	Apr 28, 2026
0.4.8	Apr 28, 2026
0.4.7	Apr 28, 2026
0.4.6	Apr 27, 2026
0.4.5	Apr 21, 2026
0.4.4	Apr 21, 2026
0.4.3	Apr 21, 2026
0.4.2 (this version)	Apr 21, 2026
0.4.1	Apr 21, 2026
0.4.0	Apr 21, 2026
0.3.4	Apr 15, 2026
0.3.3	Apr 15, 2026
0.3.2	Apr 14, 2026
0.3.1	Apr 14, 2026
0.3.0	Apr 12, 2026
0.2.3	Apr 10, 2026
0.2.2	Apr 8, 2026
0.2.1	Apr 7, 2026
0.2.0	Apr 7, 2026
0.1.11	Apr 3, 2026
0.1.9	Apr 1, 2026
0.1.8	Apr 1, 2026
0.1.7	Apr 1, 2026
0.1.6	Mar 31, 2026
0.1.5	Mar 30, 2026
0.1.4	Mar 30, 2026
0.1.3	Mar 30, 2026
0.1.2	Mar 30, 2026
0.1.1	Mar 29, 2026

Download files

Source Distribution

docmancer-0.4.2.tar.gz (2.5 MB), uploaded Apr 21, 2026, Source

Built Distribution

docmancer-0.4.2-py3-none-any.whl (131.6 kB), uploaded Apr 21, 2026, Python 3

File details

Details for the file docmancer-0.4.2.tar.gz.

File metadata

Download URL: docmancer-0.4.2.tar.gz
Upload date: Apr 21, 2026
Size: 2.5 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docmancer-0.4.2.tar.gz
Algorithm	Hash digest
SHA256	`1554cb4e60e58ab9162a7af4623e972512f8bb4038d7be67a2ae16c77a19c792`
MD5	`56b037c34d9b82279616a4fe6d18690e`
BLAKE2b-256	`ade6960035ba4f4ed1335902a000007c415a9bd3af68cf16c7de1df81304d2da`
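
A downloaded archive can be checked against the published SHA256 digest with Python's standard hashlib module. This is a minimal sketch: sha256_of_file and verify_download are hypothetical helper names, and the local file path in the usage example is an assumption.

```python
import hashlib

# SHA256 digest published above for docmancer-0.4.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "1554cb4e60e58ab9162a7af4623e972512f8bb4038d7be67a2ae16c77a19c792"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large archives never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True when the file's SHA256 matches the published digest."""
    return sha256_of_file(path) == expected_sha256.lower()


if __name__ == "__main__":
    # Assumes the sdist was downloaded into the current directory.
    print(verify_download("docmancer-0.4.2.tar.gz", EXPECTED_SDIST_SHA256))
```

Prefer SHA256 over MD5 for this check, since MD5 is no longer collision resistant.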

Provenance

The following attestation bundles were made for docmancer-0.4.2.tar.gz:

Publisher: publish.yml on docmancer/docmancer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: docmancer-0.4.2.tar.gz
- Subject digest: 1554cb4e60e58ab9162a7af4623e972512f8bb4038d7be67a2ae16c77a19c792
- Sigstore transparency entry: 1348318680
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: docmancer/docmancer@aa4336caed64f138e2831b47babc9af7c79ab862
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/docmancer
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@aa4336caed64f138e2831b47babc9af7c79ab862
- Trigger Event: push

File details

Details for the file docmancer-0.4.2-py3-none-any.whl.

File metadata

Download URL: docmancer-0.4.2-py3-none-any.whl
Upload date: Apr 21, 2026
Size: 131.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docmancer-0.4.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cda72b9b9e5f49e57a1e7cb158bb14e13175342bea39dbecf4d64d6c7b5f8a28`
MD5	`805eaed8b770b52fef28b71f43c51763`
BLAKE2b-256	`ac7bf6f4281b1cecc8ee34aaff561f8fb4edc289f52ddda7a45f991b891b5276`

Provenance

The following attestation bundles were made for docmancer-0.4.2-py3-none-any.whl:

Publisher: publish.yml on docmancer/docmancer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: docmancer-0.4.2-py3-none-any.whl
- Subject digest: cda72b9b9e5f49e57a1e7cb158bb14e13175342bea39dbecf4d64d6c7b5f8a28
- Sigstore transparency entry: 1348318761
- Sigstore integration time: Apr 21, 2026
Source repository:
- Permalink: docmancer/docmancer@aa4336caed64f138e2831b47babc9af7c79ab862
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/docmancer
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@aa4336caed64f138e2831b47babc9af7c79ab862
- Trigger Event: push
