
Fetch docs, embed locally, expose to AI agents via skills.


docmancer

A local knowledge base for AI agents. Ground your agents in version-specific docs and structured research vaults, locally, for free.



docmancer vault demo

✔ Up-to-date, version-specific documentation straight from the source
✔ Research vaults for mixed-source knowledge work with Obsidian compatibility
✔ Only the chunks your agent needs, not the whole doc site
✔ Built-in evals to measure and improve retrieval quality
✔ 100% local. Embeddings, storage, retrieval all on your machine.
✔ Completely free. No rate limits, no quotas, no API keys.
✔ Works offline once ingested. Private and internal docs supported.
✔ No MCP server. Installs as a skill, runs as a CLI.

pipx install docmancer --python python3.13

Quickstart · Two Workflows · The Problem · Agents · Why Local? · Commands · Install · Wiki


Quickstart

# 1. Install pipx
brew install pipx
pipx ensurepath

# 2. Open a new shell, then install docmancer
pipx install docmancer --python python3.13

# 3. Create a knowledge vault
docmancer init --template vault --name my-research

# 4. Add sources from the web or local files
docmancer vault add-url https://some-article.com/post
# or place markdown files directly in raw/

# 5. Sync filesystem, manifest, and vector index
docmancer vault scan

# 6. Install the skill into your agents
docmancer install claude-code
docmancer install cursor

# 7. Query, navigate, and maintain
docmancer query "How does authentication work?"
docmancer vault search "auth flow"
docmancer vault suggest

No server to start. Config and the default vector store are created under ~/.docmancer/ on first use. Vaults are plain markdown on the filesystem, so they work natively with Obsidian for graph view, canvas, backlinks, and the full plugin ecosystem.

You can also adopt an existing folder of Markdown (such as an Obsidian vault) without reorganizing anything:

docmancer vault open ./my-obsidian-vault --name research

Two Workflows

Docmancer supports two primary workflows built on the same local-first retrieval stack.

Research vaults

The recommended way to use docmancer. A vault is a structured local knowledge base with filesystem layout (raw/, wiki/, outputs/), a provenance manifest, maintenance intelligence, and retrieval evals. You add sources from the web, local files, or PDFs, and docmancer handles indexing, linting, and maintenance guidance so your agents can navigate and build on the knowledge over time.

docmancer vault scan                         # reconcile state
docmancer vault context "transformer arch"   # grouped research bundle
docmancer vault lint                         # check structural integrity
docmancer vault backlog                      # find coverage gaps
docmancer vault suggest                      # get next actions for agents
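The layout a vault scaffold uses can be pictured like this (an illustrative sketch based on the directories named above; the exact file names, such as the manifest's, may differ):

```text
my-research/
├── raw/        # ingested sources: web pages, local files, PDFs
├── wiki/       # curated notes built on top of raw sources
├── outputs/    # artifacts produced by agents
└── manifest    # provenance: where each entry came from, and when
```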

For full details, see the Vaults wiki page.

Quick docs retrieval

If you just need to ground your agents in a specific documentation site without setting up a full vault, the original ingest workflow still works. Point docmancer at a docs URL, ingest it, and query directly.

docmancer ingest https://docs.example.com
docmancer query "How do I authenticate?"

Both workflows coexist. They share the same embedding pipeline, vector store, and CLI skill system. Quick docs retrieval is a fast on-ramp, while vaults are the full experience for knowledge work that grows over time.


The Problem

AI agents hallucinate APIs. They invent CLI flags, fabricate method signatures, and confidently cite documentation from versions that no longer exist. The root cause is simple: their training data has a cutoff, and they fill gaps by guessing.

The obvious fix, dumping entire doc sites into context, makes it worse. You burn thousands of tokens on irrelevant text and bury the one paragraph that actually matters. The same problem applies to research and knowledge work: agents need structured, retrievable knowledge, not a raw pile of files.
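The retrieve-only-what-matches idea is easy to sketch. This is the general chunked-retrieval technique, not docmancer's actual implementation: a toy bag-of-words "embedding" stands in for a real model (docmancer uses FastEmbed), but the shape is the same — embed chunks once, score the query against them, return only the top hits.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(query, chunks, k=2):
    # Return only the k best-matching chunks, not the whole doc site.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Authentication uses bearer tokens in the Authorization header.",
    "Pagination is cursor-based; pass the next_cursor value.",
    "Webhooks retry with exponential backoff for 24 hours.",
]
print(top_chunks("how does authentication work", chunks, k=1))
```

The point is the token budget: the agent's context receives one matching paragraph instead of the whole corpus.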

Cloud-based documentation tools add rate limits and usage tiers, and they route your queries through third-party servers. Docmancer takes a different approach: you ingest docs once (or build a structured vault from mixed sources), they are chunked and indexed locally, and the agent retrieves only the matching sections when it needs them.


Works With Every Agent

Docmancer installs a skill file into each agent that teaches it to call the CLI directly. One local index, one ingest step, every agent covered.

  • Claude Code: docmancer install claude-code
  • Cline: docmancer install cline
  • Codex: docmancer install codex
  • Cursor: docmancer install cursor
  • Gemini CLI: docmancer install gemini
  • OpenCode: docmancer install opencode
  • Claude Desktop: docmancer install claude-desktop

Skills are plain markdown files. No background daemon, no MCP server, no ports. Use --project with claude-code, gemini, or cline to install into the current working directory instead of globally.
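Docmancer's actual skill files are its own; as a hypothetical sketch, a skill file for a CLI tool generally amounts to instructions the agent reads before answering (the frontmatter fields and wording here are illustrative, not docmancer's real contents):

```markdown
---
name: docmancer
description: Retrieve version-specific documentation from the local index.
---

When the user asks about a library, API, or internal docs, run:

    docmancer query "<question>"

Ground your answer in the returned chunks. Do not guess APIs from memory.
```

Because it is just a file the agent reads, there is nothing to keep running and nothing listening on a port.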


Why Local?

  • Cost: Free, always. No tiers, no quotas.
  • Rate limits: None. Query as much as you want.
  • Private docs: Supported free. No paid plan required.
  • Data privacy: Nothing leaves your machine.
  • Infrastructure: No server. CLI + local storage.
  • Offline use: Yes, after ingestion.
  • Embedding: Local FastEmbed. No API key needed.

Commands

Core

  • docmancer ingest <url-or-path>: Fetch, chunk, embed, and index docs locally
  • docmancer query <text>: Retrieve relevant chunks from the local index
  • docmancer install <agent>: Install skill file for a supported agent
  • docmancer list: List ingested sources with timestamps
  • docmancer fetch <url>: Download GitBook docs as markdown (no embedding)
  • docmancer remove <source>: Remove an ingested source from the index
  • docmancer inspect: Show collection stats and config
  • docmancer doctor: Health check: PATH, config, Qdrant, installed skills
  • docmancer init: Create a project-local docmancer.yaml
  • docmancer setup: Interactive wizard for API keys and integrations

Vault

  • docmancer init --template vault: Scaffold a structured knowledge base with raw/, wiki/, outputs/
  • docmancer vault open <path>: Adopt an existing folder of files as a vault
  • docmancer vault scan: Reconcile filesystem, manifest, and vector index
  • docmancer vault status: Show vault health summary with file counts and index states
  • docmancer vault add-url <url>: Fetch a web page into raw/ with provenance and index it
  • docmancer vault inspect <id-or-path>: Show manifest metadata for a specific vault entry
  • docmancer vault search <query>: Search vault metadata and content at file level
  • docmancer vault context <query>: Get grouped research context across raw, wiki, and output corpora
  • docmancer vault related <id-or-path>: Find entries related by tags, links, and semantic similarity
  • docmancer vault lint: Validate vault integrity; use --deep for LLM-assisted checks
  • docmancer vault backlog: Generate prioritized maintenance items from vault state
  • docmancer vault suggest: Produce a next-actions list for agents without writing content

Evals

  • docmancer query --trace: Print a structured execution trace for a single retrieval
  • docmancer dataset generate: Generate a golden eval dataset scaffold; use --llm for LLM-assisted Q&A
  • docmancer dataset generate-training: Generate fine-tuning training data in JSONL, Alpaca, or conversation format
  • docmancer eval: Run retrieval metrics (MRR, hit rate, chunk overlap, latency) against a dataset

Use --full with docmancer query to return the entire chunk body (default truncates at 1500 characters). Use --limit N to change how many chunks are returned.

For large ingests, tune ingestion.workers, ingestion.embed_queue_size, web_fetch.workers, embedding.batch_size, and embedding.parallel in docmancer.yaml.
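Using the keys named above, a tuned docmancer.yaml might look like the following. The values are illustrative placeholders, not the project's documented defaults; tune them for your machine:

```yaml
# Illustrative values only. Adjust per machine and corpus size.
ingestion:
  workers: 4            # parallel ingest workers
  embed_queue_size: 256 # chunks buffered ahead of the embedder
web_fetch:
  workers: 8            # concurrent page fetches
embedding:
  batch_size: 64        # chunks embedded per batch
  parallel: true        # embed batches concurrently
```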


Evals and Observability

Docmancer includes a local-first eval system so you can measure whether retrieval quality is actually improving as you add content and organize a vault.

  • Query tracing (--trace) shows a latency breakdown for each retrieval: embedding time, search time, and returned chunks with scores.
  • Dataset generation creates golden eval datasets from your content, either as a scaffold you fill in manually or with LLM-assisted Q&A generation (--llm).
  • Deterministic metrics (MRR, hit rate, chunk overlap, latency percentiles) run entirely locally with no API keys required.
  • LLM-as-judge (eval --judge) adds semantic relevance scoring on top of the deterministic metrics for deeper analysis.
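MRR and hit rate are standard retrieval metrics and simple to compute by hand. The following is a generic sketch of the two formulas, independent of docmancer's eval code:

```python
def mrr(results, relevant):
    # Mean reciprocal rank: average over queries of 1/rank of the
    # first relevant document in each ranked result list.
    total = 0.0
    for ranked, gold in zip(results, relevant):
        for rank, doc in enumerate(ranked, start=1):
            if doc in gold:
                total += 1.0 / rank
                break
    return total / len(results)

def hit_rate(results, relevant, k=5):
    # Fraction of queries with at least one relevant doc in the top k.
    hits = sum(
        1
        for ranked, gold in zip(results, relevant)
        if any(doc in gold for doc in ranked[:k])
    )
    return hits / len(results)

results = [["a", "b", "c"], ["x", "y", "z"]]   # ranked retrievals per query
relevant = [{"b"}, {"x"}]                      # golden answers per query
print(mrr(results, relevant))       # (1/2 + 1/1) / 2 = 0.75
print(hit_rate(results, relevant))  # 1.0
```

Because both metrics are deterministic functions of ranked lists, they run offline with no model calls, which is what makes a fully local eval loop possible.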

The eval system connects to the vault intelligence commands. For example, vault backlog can surface queries from the golden dataset that scored below threshold, pointing agents toward areas where the knowledge base needs better coverage.

For full details, see the Evals and Observability wiki page.


Cross-Vault Workflows

You can have separate vaults for different knowledge domains. Each vault has its own manifest and config, but they share the local Qdrant store by default. Tags let you organize vaults into logical groups and query across them.

# Create and tag vaults
docmancer init --template vault --name stripe-docs --dir ./vaults/stripe
docmancer vault tag stripe-docs work api

# List registered vaults, optionally filtered by tag
docmancer list --vaults --tag work

# Query across all vaults or a specific tag group
docmancer query --cross-vault "webhook retry behavior"
docmancer query --tag research "attention mechanisms"

Knowledge ingested in one agent context is queryable from any other agent on the same machine. Ingest in Claude Code, query from Cursor, and the results are the same because all agents hit the same local store.

For full details, see the Cross-Vault Workflows wiki page.


Install

brew install pipx
pipx ensurepath
# open a new shell, then:
pipx install docmancer --python python3.13

Supports Python 3.11-3.13. On Apple Silicon, prefer the native Homebrew Python:

pipx install docmancer --python /opt/homebrew/bin/python3.13

Upgrade with pipx upgrade docmancer.


Documentation

For configuration, troubleshooting, architecture details, and more, see the GitHub Wiki.


Contributing

See CONTRIBUTING.md.

License

MIT License. See LICENSE.


Your agents are guessing. Give them a knowledge base.



Download files

Download the file for your platform.

Source Distribution

docmancer-0.2.2.tar.gz (482.8 kB)


Built Distribution


docmancer-0.2.2-py3-none-any.whl (159.1 kB)


File details

Details for the file docmancer-0.2.2.tar.gz.

File metadata

  • Download URL: docmancer-0.2.2.tar.gz
  • Size: 482.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docmancer-0.2.2.tar.gz:

  • SHA256: 190bbfcd126c30763809d6e9ef09e8bf0e16951dda21fa0523cdcaf255968021
  • MD5: d3a23c658f08b21b86187b42c4ef05b7
  • BLAKE2b-256: 052a4c4d1380bb43297a8bbc39589b00b8b151e1d78e01a277b1cd25839e4106


Provenance

The following attestation bundles were made for docmancer-0.2.2.tar.gz:

Publisher: publish.yml on docmancer/docmancer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file docmancer-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: docmancer-0.2.2-py3-none-any.whl
  • Size: 159.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for docmancer-0.2.2-py3-none-any.whl:

  • SHA256: 45925430cdc19204b40b3b8303e395ca380509409b2a0ca2b3e2a3994fcc2368
  • MD5: 6cce98c173dbd5832191e3624251170b
  • BLAKE2b-256: 4482579674cd31ef1aef8b46a9de7224e6b7ebad7c1110bf39e837960263c0a0


Provenance

The following attestation bundles were made for docmancer-0.2.2-py3-none-any.whl:

Publisher: publish.yml on docmancer/docmancer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
