Local-first retrieval engine for technical equipment documentation.
Cogwright
Cogwright answers questions about technical equipment documentation. Point it at a folder of manuals, service bulletins, and parts lists, ask in plain language, and get a step-by-step answer grounded only in those documents, with citations to the source page. A bare alarm code, stop code, or part number resolves to the exact passage that defines it.
It runs locally against a model endpoint you choose. Your documents and the index stay on your machine, and the only network calls go to that one endpoint.
uv sync
uv run cogwright ingest ./manuals \
  --base-url http://localhost:8000/v1 --embedding-model nomic-embed-text
uv run cogwright ask "How do I clear alarm 204?" \
  --base-url http://localhost:8000/v1 \
  --llm-model llama3.2:1b --embedding-model nomic-embed-text --min-score 0.55
To clear alarm 204, follow these steps:
1. Stop the unit and allow the gearbox to cool for ten minutes.
2. Check the coolant level and refill to the cold mark if it is low.
3. Clear alarm 204 from the panel and restart the unit.
Referenced identifiers: AL-204
Sources:
- manuals/series7_conveyor_manual.txt (page 3, section: ALARM AND STOP CODE REFERENCE)
When the corpus does not contain the answer, Cogwright says so instead of inventing one.
How it works
Ingest parses each document into structure-aware chunks, where a table stays whole and a numbered procedure stays together, embeds them, and writes a single JSON index to disk. A dedicated pass indexes every alarm, stop, fault, error, diagnostic, warning, and part identifier as an exact lookup key.
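As a rough illustration of what structure-aware chunking means here, the sketch below keeps consecutive numbered steps in one chunk and consecutive pipe-table rows in another. It is a simplified stand-in, not Cogwright's parser: the `chunk` and `kind` functions and the two regular expressions are assumptions made for the example.

```python
import re

# Hypothetical line classifiers; the real parser is not shown in this README.
STEP = re.compile(r"^\s*\d+[.)]\s+")   # "1. Stop the unit" or "2) Check ..."
ROW = re.compile(r"^\s*\|.*\|\s*$")    # a pipe-table row


def kind(line: str) -> str:
    if STEP.match(line):
        return "procedure"
    if ROW.match(line):
        return "table"
    return "text"


def chunk(lines: list[str]) -> list[tuple[str, list[str]]]:
    """Group lines so a numbered procedure or a table is never split."""
    chunks: list[tuple[str, list[str]]] = []
    for line in lines:
        if not line.strip():
            continue  # blank lines carry no content
        line_kind = kind(line)
        # Extend the previous chunk only for procedures and tables;
        # each plain text line starts its own chunk in this sketch.
        if chunks and chunks[-1][0] == line_kind and line_kind != "text":
            chunks[-1][1].append(line)
        else:
            chunks.append((line_kind, [line]))
    return chunks
```

A production chunker would also track headings and page numbers so that citations can name the section and page, as the sample output above does.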
Ask embeds the question and retrieves by two signals at once: semantic similarity and exact identifier lookup, with exact matches ranked above fuzzy ones. Those passages, and only those, go to the model with instructions to answer from them, number any steps, surface the identifiers, and cite each passage. If nothing clears the relevance bar, the model is never called.
ingest: documents -> parse -> chunk -> embed ------+
                         \--> identifier index ----+--> index.json
ask:    question -> embed + detect identifiers -> hybrid retrieval
                 -> prompt (retrieved context only) -> model -> cited answer
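The ask path can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: `Chunk`, `cosine`, and `retrieve` are hypothetical names, and "exact matches ranked above fuzzy ones" is modeled by simply listing identifier hits first.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Chunk:
    text: str
    vector: list[float]
    identifiers: set[str] = field(default_factory=set)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, query_ids, chunks, top_k=3, min_score=0.45):
    """Hybrid retrieval: identifier hits first, then semantic hits above the bar."""
    exact, fuzzy = [], []
    for chunk in chunks:
        score = cosine(query_vec, chunk.vector)
        if chunk.identifiers & query_ids:
            exact.append((score, chunk))      # exact lookup bypasses the cutoff
        elif score >= min_score:
            fuzzy.append((score, chunk))      # semantic match must clear the bar
    exact.sort(key=lambda pair: pair[0], reverse=True)
    fuzzy.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in (exact + fuzzy)[:top_k]]
```

An empty result is the signal to report not-found without calling the model at all.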
Install
Cogwright targets Python 3.12. It is published on PyPI as cogwright-rag (the
import package and the cogwright command keep the shorter name):
pip install cogwright-rag # core install
pip install "cogwright-rag[ocr]" # add scanned-page recognition
For development, use uv:
uv sync # core install
uv sync --extra ocr # add scanned-page recognition
The only third-party runtime dependency is a PDF toolkit. The model client, vector math, and CLI are all standard library.
Configure the endpoint
Cogwright talks to any OpenAI-compatible endpoint, using the routes
/v1/chat/completions and /v1/embeddings. That can be a model server on the
same machine or a hosted API; the implementation is not tied to one provider, and
it has been validated against a local server running a small chat model and an
embedding model. Configure it with flags or environment variables:
| Setting | Flag | Environment variable | Default |
|---|---|---|---|
| Base URL | --base-url | COGWRIGHT_BASE_URL | http://localhost:8000/v1 |
| API key | --api-key | COGWRIGHT_API_KEY | none |
| Chat model | --llm-model | COGWRIGHT_LLM_MODEL | local-chat-model |
| Embedding model | --embedding-model | COGWRIGHT_EMBEDDING_MODEL | local-embedding-model |
| Vision model | --vision-model | COGWRIGHT_VISION_MODEL | local-vision-model |
| Index path | --index | COGWRIGHT_INDEX | .cogwright/index.json |
The model names are placeholders; set them to whatever your endpoint serves. An unreachable endpoint is reported with a non-zero exit, never a crash.
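To make the configuration concrete, here is a standard-library sketch of resolving a setting and building a request for the embeddings route. Two things are assumptions rather than documented behavior: the precedence (flag, then environment variable, then default) and the helper names `resolve` and `build_embeddings_request`. The request body follows the usual OpenAI-compatible shape, a JSON object with `model` and `input`.

```python
import json
import os
import urllib.request

# Defaults and variable names copied from the settings table.
DEFAULTS = {
    "base_url": "http://localhost:8000/v1",
    "embedding_model": "local-embedding-model",
}
ENV_VARS = {
    "base_url": "COGWRIGHT_BASE_URL",
    "embedding_model": "COGWRIGHT_EMBEDDING_MODEL",
}


def resolve(name, flag_value=None, env=None):
    """Assumed precedence: explicit flag, then environment variable, then default."""
    env = os.environ if env is None else env
    if flag_value is not None:
        return flag_value
    return env.get(ENV_VARS[name], DEFAULTS[name])


def build_embeddings_request(texts, base_url, model, api_key=None):
    """Build (but do not send) a POST to <base_url>/embeddings."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    url = base_url.rstrip("/") + "/embeddings"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the request to `urllib.request.urlopen` and catching `urllib.error.URLError` is one way to turn an unreachable endpoint into a clean non-zero exit.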
Commands
ingest
cogwright ingest <paths...> [--ocr] [--diagrams]
Paths are files or folders. --ocr recognizes scanned PDF pages and needs the
ocr extra. --diagrams transcribes figure callouts with a multimodal model set
by --vision-model. ingest builds a fresh index, recording which documents it
holds, when it was built, and the embedding model that produced the vectors.
update, remove, info
Maintain an index without rebuilding it from scratch:
cogwright update <paths...> # add new documents, refresh changed ones, skip unchanged
cogwright remove <paths...> # drop documents, matched by path or file name
cogwright info # show the documents, counts, model, and timestamps
update compares a content hash per document, so re-running it only re-embeds
what actually changed. Updating with a different embedding model than the index
was built with is refused, and a query run with a mismatched model warns, because
vectors from different models are not comparable.
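The update logic described above can be pictured as a small planning step. The sketch below is illustrative only: SHA-256 is an assumed choice of content hash, and `plan_update` and `check_model` are hypothetical names.

```python
import hashlib


def content_hash(data: bytes) -> str:
    # Assumption: SHA-256; the README only says "a content hash per document".
    return hashlib.sha256(data).hexdigest()


def plan_update(indexed: dict[str, str], current: dict[str, bytes]):
    """Split documents into added, changed, and unchanged.

    indexed maps path -> hash stored in the index;
    current maps path -> bytes read from disk.
    """
    added, changed, unchanged = [], [], []
    for path, data in sorted(current.items()):
        digest = content_hash(data)
        if path not in indexed:
            added.append(path)
        elif indexed[path] != digest:
            changed.append(path)
        else:
            unchanged.append(path)  # skipped: no re-embedding needed
    return added, changed, unchanged


def check_model(index_model: str, requested_model: str) -> None:
    """Refuse to mix vectors produced by different embedding models."""
    if index_model != requested_model:
        raise ValueError(
            f"index was built with {index_model!r}, not {requested_model!r}"
        )
```

Only the `added` and `changed` lists would need to go back through parsing and embedding.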
ask
cogwright ask "<question>" [--top-k N] [--min-score S] [--no-stream] \
  [--show-retrieved] [--json]
The answer streams as it is generated. --show-retrieved prints the ranked
passages and their scores first, which is how you see what retrieval is doing.
--json asks the model for a structured reply (steps as a list, the passages it
used named explicitly), which gives reliable numbered steps and precise
citations with a capable model and falls back to the prose path otherwise.
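The fallback behavior of --json can be sketched as "try to parse, otherwise treat the reply as prose". The field names `steps` and `sources` below are assumptions for illustration; the actual reply schema is not given in this README.

```python
import json


def parse_reply(raw: str) -> dict:
    """Return a structured answer, or fall back to the raw prose reply."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"mode": "prose", "text": raw}
    steps = data.get("steps") if isinstance(data, dict) else None
    sources = data.get("sources") if isinstance(data, dict) else None
    well_formed = (
        isinstance(steps, list)
        and isinstance(sources, list)
        and all(isinstance(step, str) for step in steps)
    )
    if not well_formed:
        # A weaker model may return valid JSON in the wrong shape.
        return {"mode": "prose", "text": raw}
    return {"mode": "structured", "steps": steps, "sources": sources}


def render(answer: dict) -> str:
    """Number the steps ourselves when the reply was structured."""
    if answer["mode"] == "prose":
        return answer["text"]
    return "\n".join(
        f"{n}. {step}" for n, step in enumerate(answer["steps"], start=1)
    )
```

Numbering the steps in code rather than trusting the model's own numbering is what makes the structured path more reliable.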
eval
cogwright eval <dataset.json> [--min-score S]
Scores retrieval against a graded dataset without calling the chat model. See Calibrating relevance below.
Documents it understands
- Text and Markdown (.txt, .text, .md, .markdown). Headings, numbered steps, and pipe tables are recovered.
- Born-digital PDFs with a real text layer, including tables, which are lifted out as structured blocks. Real page numbers are kept for citations.
- Scanned PDF pages, with the ocr extra. A page with little text and a dominant image is rendered and recognized, then structured like any other page.
- Diagram callouts, with --diagrams. A figure is sent to a vision model and the printed labels and part numbers become searchable.
Identifier lookup
A query that is a bare code resolves to the exact passage that documents it. The built-in patterns detect and normalize:
| You type | Resolves to |
|---|---|
| alarm 204, AL-204, AL204 | AL-204 |
| STOP CODE 12, SC-12 | SC-12 |
| fault 09 | F-09 |
| error 30 | E-30 |
| DTC P0420 | DTC-P0420 |
| warning 18 | W-18 |
| PN 44-19A, P/N 44-19A | PN-44-19A |
The patterns are configuration rather than hardcoded, so a deployment can add its own identifier schemes.
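One way to picture patterns-as-configuration is a list of (regex, template) pairs applied in order. The regular expressions below are written for this example so that they reproduce the table above; they are not the patterns Cogwright ships, and `detect_identifiers` is a hypothetical name.

```python
import re

# Each entry: (pattern with one capture group, output template).
# Illustrative patterns only, chosen to match the rows of the table above.
PATTERNS = [
    (r"\b(?:alarm|AL)[\s-]*(\d+)\b", "AL-{}"),
    (r"\b(?:stop\s+code|SC)[\s-]*(\d+)\b", "SC-{}"),
    (r"\bfault\s+(\d+)\b", "F-{}"),
    (r"\berror\s+(\d+)\b", "E-{}"),
    (r"\bDTC[\s-]*([A-Z]\d{4})\b", "DTC-{}"),
    (r"\bwarning\s+(\d+)\b", "W-{}"),
    (r"\bP/?N[\s-]*([0-9][0-9A-Z]*(?:-[0-9A-Z]+)*)", "PN-{}"),
]


def detect_identifiers(text: str, patterns=PATTERNS) -> list[str]:
    """Return normalized identifiers found in text, in pattern order."""
    found: list[str] = []
    for regex, template in patterns:
        for match in re.finditer(regex, text, flags=re.IGNORECASE):
            identifier = template.format(match.group(1).upper())
            if identifier not in found:
                found.append(identifier)
    return found
```

A deployment with its own scheme would extend the list, for example with a made-up service-bulletin pattern: `detect_identifiers("see TB 7", PATTERNS + [(r"\bTB[\s-]*(\d+)\b", "TB-{}")])`.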
Calibrating relevance
The not-found decision rests on one cosine cutoff, --min-score. It is
embedding-model dependent: different models place related and unrelated text at
different similarity ranges, so there is no universal value (the default, 0.45,
suits typical normalized models). Calibrate it with eval. A dataset pairs each
question with the pages it should surface, the identifiers it should resolve, and
whether it is answerable at all:
{ "question": "How do I clear alarm 204?", "expected_pages": [3],
"expected_codes": ["AL-204"], "should_find": true }
Raise --min-score until the unanswerable cases report not-found while the real
ones still resolve, then pass that value to both eval and ask. The harness
reports found accuracy, page hit rate, code-resolution accuracy, and not-found
accuracy, each with its counts.
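The four reported metrics could be tallied along the following lines. The exact definitions are assumptions made for this sketch (for instance, a page hit means at least one expected page was retrieved), and the `results` structure with `found`, `pages`, and `codes` keys is hypothetical.

```python
def score(cases: list[dict], results: list[dict]) -> dict[str, tuple[int, int]]:
    """Return (hits, total) per metric for a graded dataset.

    cases are dataset entries in the format shown above; results are
    per-question retrieval outcomes with 'found', 'pages', and 'codes'.
    """
    tallies = {name: [0, 0] for name in ("found", "page_hit", "code", "not_found")}
    for case, result in zip(cases, results):
        if case["should_find"]:
            tallies["found"][1] += 1
            tallies["found"][0] += int(result["found"])
            if case.get("expected_pages"):
                tallies["page_hit"][1] += 1
                hit = set(case["expected_pages"]) & set(result["pages"])
                tallies["page_hit"][0] += int(bool(hit))
            if case.get("expected_codes"):
                tallies["code"][1] += 1
                resolved = set(case["expected_codes"]) <= set(result["codes"])
                tallies["code"][0] += int(resolved)
        else:
            # Unanswerable question: success means reporting not-found.
            tallies["not_found"][1] += 1
            tallies["not_found"][0] += int(not result["found"])
    return {name: (hits, total) for name, (hits, total) in tallies.items()}
```

Running this at several --min-score values and picking the lowest one where not-found accuracy stops improving mirrors the calibration procedure described above.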
Architecture
A pure core holds all retrieval and decision logic; a thin adapter and CLI layer does the real input and output. The core depends only on protocols, never on a concrete model, store, or framework:
- core seams: FileSystem, DocumentParser, Embedder, LLMClient, VectorStore
- ingestion seams: OcrEngine, DiagramAnalyzer
Adapters supply the real implementations (disk, text and PDF parsers, the HTTP client, an in-memory cosine store, an OCR engine, a vision analyzer); the tests supply fakes. To add a vector database, a different model API, or a new document type, implement the seam and the core does not change.
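A seam of this kind is typically a `typing.Protocol`. The sketch below shows the idea for `Embedder` only. The method name `embed` and its signature are assumptions, since the README names the protocols but not their methods, and `FakeEmbedder` is an invented stand-in in the spirit of the test fakes.

```python
from typing import Protocol


class Embedder(Protocol):
    """Seam the core depends on; the method signature here is assumed."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class FakeEmbedder:
    """Deterministic fake: each vector is [length, vowel count]."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        return [
            [float(len(text)), float(sum(ch in "aeiou" for ch in text.lower()))]
            for text in texts
        ]


def embed_question(embedder: Embedder, question: str) -> list[float]:
    # Core logic sees only the protocol, never a concrete client.
    return embedder.embed([question])[0]
```

Because `Protocol` uses structural typing, `FakeEmbedder` satisfies `Embedder` without inheriting from it, and an HTTP-backed adapter can be swapped in the same way.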
Testing
uv run pytest # tests, including end-to-end ingest and ask on a sample manual
uv run mypy # strict type checking
uv run ruff check . # lint
Every seam has a deterministic fake, so the suite needs no live model. Tests for the real OCR engine run only where it is installed and skip otherwise. All three checks run in CI on every push and pull request, alongside a guard that fails on a missing license header.
Privacy
No telemetry and no required cloud service. The only outbound traffic is to the endpoint you configure, which can run on the same machine, so a fully air-gapped deployment is possible. Documents stay where you put them, and the index is a plain file you control.
License
MIT, with an SPDX header on every source file. Every dependency is permissively licensed; copyleft toolkits are avoided, including in the PDF path, to keep the tree MIT-compatible. See CONTRIBUTING.md to work on it.
Status
Text and born-digital PDF ingestion, hybrid retrieval, and grounded cited answers are complete and validated against a live local model. Scanned-page OCR, diagram transcription, and the evaluation harness have since landed. Region-level diagram cropping, tuning for low-quality scans, and more corpus formats are future work, and each fits an existing seam.