A high-performance, local-first RAG document assistant.
🔬 Quaero
High-Performance, Local-First RAG Document Assistant
Transform your local documents into an intelligent, queryable knowledge base—without the bloat.
🎯 Overview
Quaero is a streamlined, local-first Retrieval-Augmented Generation (RAG) engine. Built for developers, researchers, and engineers, it completely bypasses heavy frameworks like LangChain in favor of a custom, memory-flat ingestion pipeline and blazing-fast vector search via LanceDB.
Your data never leaves your machine.
✨ The Engineering Edge
- Tiered Ingestion Router: Automatically routes files to the most efficient parser (e.g., C-bound PyMuPDF for PDFs, native python-docx for Word, and raw streaming for code/text) while bouncing binary executables at the door.
- Memory-Flat Processing: Reads and hashes massive files (like 1,000-page textbooks) using lazy generators, keeping your RAM usage practically at zero during ingestion.
- State Reconciliation: Native sync tracking detects when you modify or delete a physical file and automatically purges or updates the orphaned vectors via relational metadata.
- Zero-Config Vector Search: Powered by LanceDB's PyArrow backend for native, sub-millisecond cosine distance retrieval.
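The memory-flat hashing idea can be sketched with a lazy generator. This is a minimal illustration, not Quaero's actual code: the names `iter_blocks` and `file_digest`, the 1 MiB block size, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Iterator


def iter_blocks(path: Path, block_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield the file in fixed-size blocks so only one block is held in memory."""
    with path.open("rb") as handle:
        while block := handle.read(block_size):
            yield block


def file_digest(path: Path, block_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file without loading it whole.

    Hypothetical helper: the hash Quaero really uses is not stated in this README.
    """
    hasher = hashlib.sha256()
    for block in iter_blocks(path, block_size):
        hasher.update(block)
    return hasher.hexdigest()
```

Because the generator hands over one block at a time, peak memory is bounded by `block_size` rather than by the size of the file.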
🚀 Quick Start
Prerequisites
- Python 3.11+
- Ollama installed and running locally.
1. Installation
Install directly via pip (or pipx for isolated environments):
pip install quaero
2. Initial Setup
Run the interactive wizard to configure your models and chunk sizes:
quaero setup
(We recommend embeddinggemma for embeddings and a fast, instruction-tuned model like gemma or llama3 for inference.)
3. Build Your Knowledge Base
Point Quaero at a single file or an entire directory. It will recursively crawl and index supported formats.
quaero ingest /path/to/your/documents/
4. Start Querying
Launch the interactive terminal UI to chat with your documents:
quaero chat
Or execute a single-shot query:
quaero chat "What are the main persistence mechanisms described in the malware textbook?"
💻 CLI Command Reference
Quaero features a modern, Rich-powered CLI.
- quaero status - View database health and vector counts.
- quaero ingest - Index a single file, or recursively crawl a directory and index supported formats.
- quaero sync - Reconcile the vector database with your physical filesystem (purges orphans, updates modifications).
- quaero config show - Display active thresholds, models, and chunk parameters.
- quaero config set - Tune the engine on the fly (e.g., quaero config set score_threshold 0.6).
- quaero db reset - Nuke the database and start fresh.
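The reconciliation that quaero sync performs can be pictured as a diff between two manifests: the content hashes stored with the vectors, and the hashes of the files currently on disk. The sketch below is one way to express that diff. `SyncPlan` and `plan_sync` are hypothetical names, and representing each side as a path-to-hash dictionary is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SyncPlan:
    """Outcome of comparing the index with the filesystem (hypothetical type)."""
    added: list[str] = field(default_factory=list)     # on disk, not yet indexed
    modified: list[str] = field(default_factory=list)  # indexed, hash changed
    orphaned: list[str] = field(default_factory=list)  # indexed, file is gone


def plan_sync(indexed: dict[str, str], on_disk: dict[str, str]) -> SyncPlan:
    """Compare {path: content_hash} manifests and decide what to re-embed or purge."""
    plan = SyncPlan()
    for path, digest in on_disk.items():
        if path not in indexed:
            plan.added.append(path)
        elif indexed[path] != digest:
            plan.modified.append(path)
    plan.orphaned = [path for path in indexed if path not in on_disk]
    return plan
```

Under this model, vectors for `orphaned` paths would be deleted, while `modified` paths would have their old vectors purged before the file is embedded again.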
🏗️ Architecture
graph TD
A[Local Filesystem] -->|quaero sync / ingest| B[Tiered Extraction Router]
B --> C[Memory-Flat Text Splitter]
C --> D[Ollama Embedding Engine]
D --> E[(LanceDB Vector Store)]
F[User Query] --> G[Cosine Similarity Search]
G --> E
E --> H[Context Assembly]
H --> I[Ollama Inference]
I --> J[Grounded Terminal Response]
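The query half of the diagram (similarity search, then context assembly) can be sketched in plain Python. This is an illustration under stated assumptions, not Quaero's implementation: the real search is delegated to LanceDB, the function names are hypothetical, and treating score_threshold as a minimum cosine similarity (rather than a maximum distance) is a guess.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: list[float],
    store: list[tuple[str, list[float]]],
    score_threshold: float = 0.6,  # assumed to be a similarity floor
    top_k: int = 3,
) -> list[tuple[float, str]]:
    """Score every stored chunk, drop weak matches, and keep the best top_k."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    kept = [item for item in scored if item[0] >= score_threshold]
    return sorted(kept, key=lambda item: item[0], reverse=True)[:top_k]


def assemble_context(hits: list[tuple[float, str]]) -> str:
    """Join the retrieved chunks into one context string for the inference model."""
    return "\n\n".join(text for _, text in hits)
```

A brute-force scan like this is only practical for small collections; it is meant to show the data flow, not to compete with an indexed vector store.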