local-rag
Ask questions about your documents using local LLMs — no cloud, no API keys, your data stays on your machine.
$ rag add research-paper.pdf annual-report.docx notes.md
Loading research-paper.pdf…
128 chunks → embedding with nomic-embed-text…
Added 128 new chunks
Loading annual-report.docx…
94 chunks → embedding with nomic-embed-text…
Added 94 new chunks
$ rag ask "What were the main revenue drivers in Q3?"
╭─ Answer ───────────────────────────────────────────────────────╮
│ Based on the annual report, the main revenue drivers in Q3     │
│ were cloud services (+34% YoY) and professional services…      │
│ [Source: annual-report.docx]                                   │
╰────────────────────────────────────────────────────────────────╯
Features
- Local-first — Ollama for embeddings + chat, ChromaDB for vector storage
- Multiple formats — PDF, DOCX, Markdown, plain text, RST
- Persistent store — add documents once, query forever
- Source filtering — restrict questions to specific files
- Smart chunking — overlapping word-based chunks for better context
- Rich terminal UI — markdown-rendered answers, source tables
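The "smart chunking" feature above can be illustrated with a minimal sketch. This is a hypothetical helper, not the package's actual API: it splits text into word-based chunks of `chunk_size` words, where consecutive chunks share `overlap` words so context isn't cut off at chunk boundaries.

```python
def chunk_words(text, chunk_size=512, overlap=64):
    """Split text into overlapping word-based chunks.

    Each chunk holds up to `chunk_size` words; consecutive chunks
    share `overlap` words so sentences spanning a boundary stay
    retrievable from at least one chunk.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

With the defaults (512-word chunks, 64-word overlap), a 1000-word document yields three chunks, and the last 64 words of each chunk reappear at the start of the next.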
Requirements
- Python ≥ 3.10
- Ollama running locally
- Embedding model: ollama pull nomic-embed-text
- Chat model: ollama pull mistral
Installation
pip install local-rag
Or from source:
git clone https://github.com/dennisreichenberg/local-rag
cd local-rag
pip install -e ".[dev]"
Quick Start
# 1. Start Ollama (if not already running)
ollama serve
# 2. Pull required models
ollama pull nomic-embed-text
ollama pull mistral
# 3. Add documents
rag add report.pdf notes.md
# 4. Ask questions
rag ask "Summarize the key points"
rag ask "What are the risks mentioned?" --show-sources
Commands
rag add <files...>
Ingest one or more documents into the vector store. Supports .pdf, .docx, .txt, .md, .rst.
rag add report.pdf notes.md docs/
rag add report.pdf --embed-model nomic-embed-text --chunk-size 256
rag ask <question>
Ask a question. Retrieves the most relevant chunks and sends them to the LLM.
rag ask "What is the conclusion?"
rag ask "Explain the architecture" --chat-model llama3 --top-k 8
rag ask "What risks are mentioned?" --source report.pdf --show-sources
rag list
Show all ingested documents.
rag remove <source>
Remove a document (and all its chunks) from the store. Supports partial path matching.
rag remove report.pdf
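Partial path matching means the argument only has to match part of a stored source path. A sketch of what that lookup might look like (the function name and exact matching rule here are illustrative assumptions, not the package's actual implementation):

```python
def match_sources(stored_sources, query):
    """Return every stored source path containing `query` as a substring.

    Illustrative only: e.g. "report.pdf" would match both
    "docs/report.pdf" and "archive/report.pdf".
    """
    return [s for s in stored_sources if query in s]
```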
rag clear
Remove everything from the store.
Options
| Command | Option | Default | Description |
|---|---|---|---|
| add | --embed-model | nomic-embed-text | Ollama embedding model |
| add | --chunk-size | 512 | Words per chunk |
| add | --chunk-overlap | 64 | Overlap between chunks |
| ask | --chat-model | mistral | Ollama chat model |
| ask | --embed-model | nomic-embed-text | Ollama embedding model |
| ask | --top-k | 5 | Chunks to retrieve |
| ask | --source | | Filter by source file |
| ask | --show-sources | | Show retrieved chunks |
| All | --host | http://localhost:11434 | Ollama base URL |
How it works
Document → Chunking → Ollama Embeddings → ChromaDB
↓
Question → Ollama Embeddings → Vector Search → Top-K Chunks → Ollama LLM → Answer
- Ingestion: documents are split into overlapping chunks, embedded via Ollama (nomic-embed-text), and stored in a local ChromaDB database (~/.local/share/local-rag/)
- Retrieval: your question is embedded, then the closest chunks are retrieved via cosine similarity
- Generation: the retrieved chunks + question are sent to an Ollama chat model, which answers strictly from the provided context
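The retrieval and generation steps above can be sketched in pure Python. In the real tool ChromaDB performs the vector search and Ollama produces the embeddings and the answer; this self-contained sketch (with hypothetical function names) only illustrates the core ideas: ranking chunks by cosine similarity and assembling a strict-context prompt.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(question_emb, chunk_embs, chunks, k=5):
    """Return the k chunks whose embeddings are closest to the question."""
    ranked = sorted(zip(chunks, chunk_embs),
                    key=lambda pair: cosine(question_emb, pair[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    """Assemble a prompt instructing the model to answer only from context."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer strictly from the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

For example, with a question embedded as [1.0, 0.0], chunks embedded near that direction rank first, and the winners are joined into the prompt sent to the chat model.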
Data storage
All data is stored locally at ~/.local/share/local-rag/chroma/. No data leaves your machine.
To move or backup your store, copy that directory.
License
MIT — see LICENSE