Fetch docs, embed locally, expose to AI agents via skills.
docmancer
A local knowledge base for AI agents. Ground your agents in version-specific docs and structured research vaults, locally, for free.
✔ Up-to-date, version-specific documentation straight from the source
pipx install docmancer --python python3.13
Quickstart · Two Workflows · The Problem · Agents · Why Local? · Commands · Install · Wiki
Quickstart
# 1. Install pipx
brew install pipx
pipx ensurepath
# 2. Open a new shell, then install docmancer
pipx install docmancer --python python3.13
# 3. Create a knowledge vault
docmancer init --template vault --name my-research
# 4. Add sources from the web or local files
docmancer vault add-url https://some-article.com/post
# or place markdown files directly in raw/
# 5. Sync filesystem, manifest, and vector index
docmancer vault scan
# 6. Install the skill into your agents
docmancer install claude-code
docmancer install cursor
# 7. Query, navigate, and maintain
docmancer query "How does authentication work?"
docmancer vault search "auth flow"
docmancer vault suggest
No server to start. Config and the default vector store are created under ~/.docmancer/ on first use. Vaults are plain markdown on the filesystem, so they work natively with Obsidian for graph view, canvas, backlinks, and the full plugin ecosystem.
You can also adopt an existing folder of Markdown (such as an Obsidian vault) without reorganizing anything:
docmancer vault open ./my-obsidian-vault --name research
Two Workflows
Docmancer supports two primary workflows built on the same local-first retrieval stack.
Research vaults
The recommended way to use docmancer. A vault is a structured local knowledge base with a filesystem layout (raw/, wiki/, outputs/), a provenance manifest, maintenance intelligence, and retrieval evals. You add sources from the web, local files, or PDFs, and docmancer handles indexing, linting, and maintenance guidance so your agents can navigate and build on the knowledge over time.
docmancer vault scan # reconcile state
docmancer vault context "transformer arch" # grouped research bundle
docmancer vault lint # check structural integrity
docmancer vault backlog # find coverage gaps
docmancer vault suggest # get next actions for agents
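To make "reconcile state" concrete, here is a minimal Python sketch of comparing files on disk against a manifest. It is an illustration only, not docmancer's implementation: the manifest shape (relative path mapped to a SHA-256 digest) and the `file_digest` and `reconcile` helpers are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def reconcile(vault_root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare markdown files under vault_root with a {relative_path: digest} manifest.

    The manifest shape is hypothetical; docmancer's real manifest may differ.
    """
    on_disk = {
        p.relative_to(vault_root).as_posix(): file_digest(p)
        for p in vault_root.rglob("*.md")
    }
    return {
        # Files present on disk but unknown to the manifest.
        "added": sorted(on_disk.keys() - manifest.keys()),
        # Manifest entries whose files no longer exist.
        "removed": sorted(manifest.keys() - on_disk.keys()),
        # Files whose content no longer matches the recorded digest.
        "changed": sorted(
            path
            for path in on_disk.keys() & manifest.keys()
            if on_disk[path] != manifest[path]
        ),
    }
```

A scan in this style would re-index the "added" and "changed" entries and drop the "removed" ones from the index.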
For full details, see the Vaults wiki page.
Quick docs retrieval
If you just need to ground your agents in a specific documentation site without setting up a full vault, the original ingest workflow still works. Point docmancer at a docs URL, ingest it, and query directly.
docmancer ingest https://docs.example.com
docmancer query "How do I authenticate?"
Both workflows coexist. They share the same embedding pipeline, vector store, and CLI skill system. Quick docs retrieval is a fast on-ramp, while vaults are the full experience for knowledge work that grows over time.
The Problem
AI agents hallucinate APIs. They invent CLI flags, fabricate method signatures, and confidently cite documentation from versions that no longer exist. The root cause is simple: their training data has a cutoff, and they fill gaps by guessing.
The obvious fix, dumping entire doc sites into context, makes it worse. You burn thousands of tokens on irrelevant text and bury the one paragraph that actually matters. The same problem applies to research and knowledge work: agents need structured, retrievable knowledge, not a raw pile of files.
Cloud-based documentation tools add rate limits and usage tiers, and they route your queries through third-party servers. Docmancer takes a different approach: you ingest docs once (or build a structured vault from mixed sources), they are chunked and indexed locally, and the agent retrieves only the matching sections when it needs them.
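The chunk-then-retrieve idea can be sketched in a few lines of Python. This is a toy, not docmancer's pipeline: it uses bag-of-words cosine similarity in place of real embeddings and a vector store, and the function names (`chunk_text`, `retrieve`) and the `max_chars` parameter are made up for the example.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_chars: int = 200) -> list[str]:
    """Split text on blank lines, merging paragraphs while they fit in max_chars."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def tokenize(text: str) -> Counter:
    """Lowercase word counts; a stand-in for an embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], limit: int = 1) -> list[str]:
    """Return the `limit` chunks most similar to the query."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, tokenize(c)), reverse=True)
    return ranked[:limit]
```

The point of the sketch is the shape of the workflow: the agent receives only the top-ranked chunks rather than the whole document.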
Works With Every Agent
Docmancer installs a skill file into each agent that teaches it to call the CLI directly. One local index, one ingest step, every agent covered.
| Agent | Install command |
|---|---|
| Claude Code | docmancer install claude-code |
| Cline | docmancer install cline |
| Codex | docmancer install codex |
| Cursor | docmancer install cursor |
| Gemini CLI | docmancer install gemini |
| OpenCode | docmancer install opencode |
| Claude Desktop | docmancer install claude-desktop |
Skills are plain markdown files. No background daemon, no MCP server, no ports. Use --project with claude-code, gemini, or cline to install into the current working directory instead of globally.
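If you script the installs, a small Python wrapper can assemble the commands from the table above. This is a sketch under assumptions: the helper names are hypothetical, and placing --project after the agent name is a guess about argument order rather than something documented here.

```python
import shutil
import subprocess

# Agent identifiers taken from the install table above.
SUPPORTED_AGENTS = {
    "claude-code", "cline", "codex", "cursor",
    "gemini", "opencode", "claude-desktop",
}
# Agents this README lists as accepting --project.
PROJECT_CAPABLE = {"claude-code", "gemini", "cline"}


def build_install_command(agent: str, project: bool = False) -> list[str]:
    """Build the argv list for installing a skill into one agent."""
    if agent not in SUPPORTED_AGENTS:
        raise ValueError(f"unknown agent: {agent}")
    command = ["docmancer", "install", agent]
    if project:
        if agent not in PROJECT_CAPABLE:
            raise ValueError(f"--project is not listed for agent: {agent}")
        # Assumption: the flag may follow the agent name.
        command.append("--project")
    return command


def install_all(agents: list[str], project: bool = False) -> None:
    """Run the install command for each agent; requires docmancer on PATH."""
    if shutil.which("docmancer") is None:
        raise RuntimeError("docmancer is not on PATH")
    for agent in agents:
        subprocess.run(build_install_command(agent, project), check=True)
```

For example, `install_all(["claude-code", "cline"], project=True)` would install both skills into the current working directory.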
Why Local?
| | Docmancer |
|---|---|
| Cost | Free, always. No tiers, no quotas. |
| Rate limits | None. Query as much as you want. |
| Private docs | Supported free. No paid plan required. |
| Data privacy | Nothing leaves your machine. |
| Infrastructure | No server. CLI + local storage. |
| Offline use | Yes, after ingestion. |
| Embedding | Local FastEmbed. No API key needed. |
Commands
Core
| Command | What it does |
|---|---|
| docmancer ingest <url-or-path> | Fetch, chunk, embed, and index docs locally |
| docmancer query <text> | Retrieve relevant chunks from the local index |
| docmancer install <agent> | Install skill file for a supported agent |
| docmancer list | List ingested sources with timestamps |
| docmancer fetch <url> | Download GitBook docs as markdown (no embedding) |
| docmancer remove <source> | Remove an ingested source from the index |
| docmancer inspect | Show collection stats and config |
| docmancer doctor | Health check: PATH, config, Qdrant, installed skills |
| docmancer init | Create a project-local docmancer.yaml |
| docmancer setup | Interactive wizard for API keys and integrations |
Vault
| Command | What it does |
|---|---|
| docmancer init --template vault | Scaffold a structured knowledge base with raw/, wiki/, outputs/ |
| docmancer vault open <path> | Adopt an existing folder of files as a vault |
| docmancer vault scan | Reconcile filesystem, manifest, and vector index |
| docmancer vault status | Show vault health summary with file counts and index states |
| docmancer vault add-url <url> | Fetch a web page into raw/ with provenance and index it |
| docmancer vault inspect <id-or-path> | Show manifest metadata for a specific vault entry |
| docmancer vault search <query> | Search vault metadata and content at file level |
| docmancer vault context <query> | Get grouped research context across raw, wiki, and output corpora |
| docmancer vault related <id-or-path> | Find entries related by tags, links, and semantic similarity |
| docmancer vault lint | Validate vault integrity; use --deep for LLM-assisted checks |
| docmancer vault backlog | Generate prioritized maintenance items from vault state |
| docmancer vault suggest | Produce a next-actions list for agents without writing content |
Evals
| Command | What it does |
|---|---|
| docmancer query --trace | Print a structured execution trace for a single retrieval |
| docmancer dataset generate | Generate a golden eval dataset scaffold; use --llm for LLM-assisted Q&A |
| docmancer dataset generate-training | Generate fine-tuning training data in JSONL, Alpaca, or conversation format |
| docmancer eval | Run retrieval metrics (MRR, hit rate, chunk overlap, latency) against a dataset |
Use --full with docmancer query to return the entire chunk body (by default, each chunk is truncated at 1500 characters). Use --limit N to change how many chunks are returned.
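As a rough mental model of how these two options interact, the following Python sketch applies a limit first and then truncates each chunk unless full output is requested. The `format_chunks` function is hypothetical and only mirrors the behavior described above; the CLI's actual output formatting may differ.

```python
def format_chunks(
    chunks: list[str],
    limit: int,
    full: bool = False,
    max_chars: int = 1500,
) -> list[str]:
    """Keep the first `limit` chunks and truncate each one unless `full` is set.

    max_chars defaults to the 1500-character cutoff mentioned above.
    """
    selected = chunks[:limit]
    if full:
        return selected
    return [chunk[:max_chars] for chunk in selected]
```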
For large ingests, tune ingestion.workers, ingestion.embed_queue_size, web_fetch.workers, embedding.batch_size, and embedding.parallel in docmancer.yaml.
Evals and Observability
Docmancer includes a local-first eval system so you can measure whether retrieval quality is actually improving as you add content and organize a vault.
- Query tracing (--trace) shows a latency breakdown for each retrieval: embedding time, search time, and returned chunks with scores.
- Dataset generation creates golden eval datasets from your content, either as a scaffold you fill in manually or with LLM-assisted Q&A generation (--llm).
- Deterministic metrics (MRR, hit rate, chunk overlap, latency percentiles) run entirely locally with no API keys required.
- LLM-as-judge (eval --judge) adds semantic relevance scoring on top of the deterministic metrics for deeper analysis.
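For readers unfamiliar with the deterministic metrics, here is a minimal Python sketch of MRR and hit rate using their standard definitions. It is not docmancer's eval code: the input shape (a ranked list of chunk IDs paired with a set of relevant IDs) and the cutoff `k` are assumptions for illustration.

```python
from statistics import mean


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(results: list[tuple[list[str], set[str]]], k: int = 5) -> dict[str, float]:
    """Compute MRR and hit rate over (ranked_ids, relevant_ids) pairs, cut off at k."""
    if not results:
        raise ValueError("results must not be empty")
    reciprocal_ranks = [reciprocal_rank(ranked[:k], rel) for ranked, rel in results]
    hits = [1.0 if set(ranked[:k]) & rel else 0.0 for ranked, rel in results]
    return {"mrr": mean(reciprocal_ranks), "hit_rate": mean(hits)}
```

MRR rewards placing a relevant chunk near the top of the ranking, while hit rate only asks whether any relevant chunk appears in the top k.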
The eval system connects to the vault intelligence commands. For example, vault backlog can surface queries from the golden dataset that scored below threshold, pointing agents toward areas where the knowledge base needs better coverage.
For full details, see the Evals and Observability wiki page.
Cross-Vault Workflows
You can have separate vaults for different knowledge domains. Each vault has its own manifest and config, but they share the local Qdrant store by default. Tags let you organize vaults into logical groups and query across them.
# Create and tag vaults
docmancer init --template vault --name stripe-docs --dir ./vaults/stripe
docmancer vault tag stripe-docs work api
# List registered vaults, optionally filtered by tag
docmancer list --vaults --tag work
# Query across all vaults or a specific tag group
docmancer query --cross-vault "webhook retry behavior"
docmancer query --tag research "attention mechanisms"
Knowledge ingested in one agent context is queryable from any other agent on the same machine. Ingest in Claude Code, query from Cursor, and the results are the same because all agents hit the same local store.
For full details, see the Cross-Vault Workflows wiki page.
Install
brew install pipx
pipx ensurepath
# open a new shell, then:
pipx install docmancer --python python3.13
Supports Python 3.11-3.13. On Apple Silicon, prefer the native Homebrew Python:
pipx install docmancer --python /opt/homebrew/bin/python3.13
Upgrade with pipx upgrade docmancer.
Documentation
For configuration, troubleshooting, architecture details, and more, see the GitHub Wiki.
Contributing
See CONTRIBUTING.md.
License
MIT License. See LICENSE.
Your agents are guessing. Give them a knowledge base.