Local multimodal semantic search CLI for technical documentation (text, PDFs, images) using ColQwen3 late-interaction embeddings.

Project description

Ragrag — local multimodal RAG for documents

RTFM for AI

Local multimodal semantic search using ColQwen3 late-interaction embeddings + Qdrant MaxSim retrieval. Indexes text files, PDFs with images/diagrams, and standalone images.
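
For readers new to late interaction: MaxSim scores a document by taking, for each query token embedding, its best similarity to any document token embedding, and summing those maxima. The snippet below is a minimal pure-Python sketch of that scoring rule on toy vectors. It is not Ragrag's code; the function names and the 2-dimensional embeddings are made up for illustration, and in the real tool the embeddings come from ColQwen3 and the scoring is done by Qdrant.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Late-interaction score: for each query token keep its best match
    among the document tokens, then sum those best matches."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token embeddings (illustrative values only).
query = [(1.0, 0.0), (0.0, 1.0)]
doc_a = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]  # has a match for both query tokens
doc_b = [(1.0, 0.0), (0.5, 0.0)]              # matches only the first query token

candidates = {"doc_a": doc_a, "doc_b": doc_b}
ranked = sorted(candidates, key=lambda name: maxsim(query, candidates[name]), reverse=True)
print(ranked)  # best-scoring document first
```

Because every query token is matched independently, a page can rank highly when different regions of it (a caption, a diagram label, a table cell) each answer a different part of the query.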

Ragrag was originally designed to let AI agents read complex technical documentation during embedded development, where simple text-based indexing does not work because of the abundance of diagrams, schematics, and complex tabular data.

Usage

Install:

pip install ragrag  # or `pip install -e .` from a source checkout

⚠️ The first run will take a long time because the tool downloads the model from Hugging Face. Apart from that download, indexing and search run 100% locally.

Search all supported documents in the current directory and subdirectories that are related to clock tree configuration:

ragrag "clock tree configuration"

When a new file is found or an existing file has changed, the tool re-indexes it automatically, so no manual indexing step is needed. Because everything runs locally, this can take anywhere from a few seconds to much longer, depending on the documents and the performance of your computer.

The index is stored in .ragrag/; the tool will attempt to locate an existing index in the current or parent directories. If none is found, it will attempt to guess where to create the index; if it cannot guess reliably, it will ask you to confirm using --new.

The tool may write log messages to stderr, while the search results go to stdout.
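
Keeping results on stdout and logs on stderr makes the tool easy to drive from a script or an agent. The following is a rough sketch of a wrapper that captures the two streams separately. The helper names (`build_command`, `run_and_split`, `main`) are hypothetical, and the only Ragrag arguments used are the query and the `--top-k` flag shown in this README.

```python
import subprocess
import sys
from typing import List, Tuple


def build_command(query: str, top_k: int = 10) -> List[str]:
    """Build the argument list for a ragrag search."""
    return ["ragrag", query, "--top-k", str(top_k)]


def run_and_split(cmd: List[str]) -> Tuple[str, str]:
    """Run a command and return its (stdout, stderr) as separate strings."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.stdout, proc.stderr


def main() -> None:
    """Example use; requires ragrag to be installed and on PATH."""
    results, logs = run_and_split(build_command("clock tree configuration"))
    sys.stderr.write(logs)     # diagnostics stay on stderr
    sys.stdout.write(results)  # only search results reach stdout
```

`main()` is deliberately not called here, since the first invocation may trigger a model download and indexing.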

Return more results and format the output as Markdown:

ragrag "GPIO initialization" --top-k 20 --markdown

For more options see ragrag --help.

Configuration files

Defaults can be set per directory via a config file. Ragrag looks for ragrag.json and then .ragrag.json in the current working directory; if neither is found, it climbs the directory tree until one is found. All fields are optional.

{
  "index_path": ".ragrag",
  "model_id": "TomoroAI/tomoro-colqwen3-embed-4b",
  "max_visual_tokens": 16384,
  "top_k": 10,
  "max_top_k": 50,
  "pdf_dpi": 250,
  "ocr_threshold": 50,
  "chunk_size": 900,
  "chunk_overlap": 200,
  "include_hidden": false,
  "follow_symlinks": true,
  "indexing_timeout": 100000
}
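
The lookup order described above can be sketched roughly as follows. This is an illustration of the documented behaviour, not Ragrag's implementation: `find_config`, `load_config`, and `DEFAULTS` are hypothetical names, and using a few of the sample values as fallback defaults is an assumption.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Checked in this order within each directory.
CONFIG_NAMES = ("ragrag.json", ".ragrag.json")

# A few sample values from the example above, used as stand-in defaults (assumption).
DEFAULTS: Dict[str, Any] = {"index_path": ".ragrag", "top_k": 10, "max_top_k": 50}


def find_config(start: Path) -> Optional[Path]:
    """Return the nearest config file at or above `start`, or None if there is none."""
    start = start.resolve()
    for directory in (start, *start.parents):
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


def load_config(start: Path) -> Dict[str, Any]:
    """Overlay the nearest config file on the defaults; absent fields keep their default."""
    path = find_config(start)
    overrides = json.loads(path.read_text(encoding="utf-8")) if path else {}
    return {**DEFAULTS, **overrides}
```

With this layout, a project-wide config at the repository root applies everywhere below it, and a subdirectory can shadow it with its own file.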

Field Descriptions

index_path: Directory where the vector index and metadata are stored. Defaults to .ragrag. If a relative path is provided, it's resolved relative to the configuration file location.
model_id: The HuggingFace model identifier for the ColQwen3 embedding model. Defaults to TomoroAI/tomoro-colqwen3-embed-4b.
max_visual_tokens: Maximum number of visual tokens processed per image by the embedding model. Higher values capture more visual detail but use more GPU or CPU memory and increase embedding time.
top_k: The default number of search results to return for each query.
max_top_k: The maximum value that top_k can be set to. Requests for more results are capped at this value.
pdf_dpi: Resolution in dots per inch used when rendering PDF pages into images for multimodal indexing. Higher DPI improves detail for small text and diagrams but increases processing time.
ocr_threshold: Minimum number of characters of native PDF text a page must contain for the OCR fallback to be skipped. Pages with fewer characters than this threshold are re-processed with Tesseract OCR to ensure their visual content is indexed.
chunk_size: Target size for text chunks in characters. Large documents are split into these chunks to fit within the model context window.
chunk_overlap: The number of characters that overlap between consecutive text chunks. This ensures content spanning a chunk boundary is captured in both adjacent chunks, improving search recall for concepts that straddle chunk edges. A value of 0 means no overlap, which risks missing boundary content. 100-200 is typical.
include_hidden: Whether to include hidden files and directories, those starting with a dot, during indexing. Defaults to false.
follow_symlinks: Whether to follow symbolic links when discovering files. Defaults to true. Set to false to avoid cycles in recursive directory structures.
indexing_timeout: Soft timeout in seconds for the full indexing phase. When elapsed, remaining files are skipped. Default is 100000, which is effectively unlimited for normal use. Set lower for time-bounded operations.
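
To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal character-window splitter. It is a sketch under simple assumptions: the function name `chunk_text` is hypothetical, and Ragrag's actual splitter may well respect word or paragraph boundaries, which this one does not.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 900, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of up to chunk_size characters, where each
    window repeats the last chunk_overlap characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(2000))  # 2000 characters of filler
pieces = chunk_text(sample)
print([len(p) for p in pieces])  # prints [900, 900, 600]
```

With the sample values, each chunk advances by 700 characters, so a sentence that straddles a boundary appears in full in at least one chunk as long as it is shorter than the 200-character overlap.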

Rationale

LLM coding agents working on embedded projects need fast semantic lookup over large local documentation sets:

source trees
PDF datasheets and reference manuals
image files with text/diagrams

Traditional text-only RAG misses diagram-heavy content (clock trees, pin mux diagrams, timing plots, block diagrams). We need a single search system that can retrieve both textual and visual evidence.

Why multimodal is mandatory

Electronics documents contain content that is not faithfully representable as plain text:

schematics
signal timing diagrams
clock trees
annotated block diagrams

Therefore, the MVP must index:

text
PDF pages as images
standalone images

Goals

Single local process, works anywhere even without GPU (given enough RAM).
No dependency on cloud inference providers.
Best feasible retrieval quality for visually rich technical docs.
Lazy indexing on first query and on change (index-on-demand).

Non-goals

Distributed indexing.
Guaranteed real-time indexing of all filesystem changes.
Cloud fallback.

Development

# Download validation corpus (PDFs, fixtures)
python scripts/fetch_validation_data.py

# Run tests (no model required)
pytest tests/test_validation.py -v

Project details

Release history

0.1.2: Apr 13, 2026 (this version)
0.1.1: Apr 13, 2026
0.1.0: Apr 13, 2026

Download files

Source Distribution

ragrag-0.1.2.tar.gz (39.9 kB)

Uploaded Apr 13, 2026 Source

Built Distribution

ragrag-0.1.2-py3-none-any.whl (31.1 kB)

Uploaded Apr 13, 2026 Python 3

File details

Details for the file ragrag-0.1.2.tar.gz.

File metadata

Download URL: ragrag-0.1.2.tar.gz
Upload date: Apr 13, 2026
Size: 39.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ragrag-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`c240eb3b7897ccf0e8038cc692dbfebf200fdaa9f4e547723d39a6120797474e`
MD5	`d0d52be566c421bad91db04dc7ae0ad0`
BLAKE2b-256	`713c42cc1a5129828931012f0017d917aeb03319d1ae90d0f957b18179bc3ded`

File details

Details for the file ragrag-0.1.2-py3-none-any.whl.

File metadata

Download URL: ragrag-0.1.2-py3-none-any.whl
Upload date: Apr 13, 2026
Size: 31.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ragrag-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`13af206baa844afbda6afdf29823e85d5f3dbedd397b7d99d748215a2396d3ee`
MD5	`d172f9930486adbeebc23b98c8d80a70`
BLAKE2b-256	`89b53aefd11825473f752de321a8d25b59142c01332835b99d0d31d08e4693b4`
