softrag

Minimal local-first Retrieval-Augmented Generation (RAG) library powered by SQLite + sqlite-vec.
Everything—documents, embeddings, cache—lives in a single .db file.
🌟 Features
- Local-first – Storage, indexing, and retrieval run locally; only the embedding and chat models you plug in may call external services.
- SQLite + sqlite-vec – Documents, embeddings, and cache in a single .db file (no separate vector store or account needed).
- No cloud service dependency – Plug in any LLM backend; no forced API keys for the core storage layer.
- Fast – Designed for low overhead on small- and medium-scale corpora.
- Perfect for small & medium use cases – Ideal when you need a lightweight, self-contained RAG solution.
- Configurable chunking – Default RecursiveCharacterTextSplitter(400/100) or your own strategy.
- Model-agnostic – Works with OpenAI, Hugging Face, Ollama, etc.
- Zero heavy deps – Core pulls only minimal extras (langchain-text-splitters optional).
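
To illustrate what "your own strategy" for chunking might look like, here is a minimal sketch of a fixed-size splitter with overlap. It is not the RecursiveCharacterTextSplitter that softrag uses by default, and `chunk_text` is a hypothetical helper, not part of the softrag API. Reading "400/100" as a chunk size of 400 characters with a 100-character overlap is also an assumption.

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; the 400/100 defaults mirror the
    README's numbers under the assumption they mean size/overlap.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(1000))
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk. How a custom splitter is actually passed to `Rag` is not shown here; check the examples/ folder for that.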
📋 Requirements
- Python 3.12+
- Dependencies: sqlite-vec, trafilatura, pymupdf (for PDFs)
- Access to embedding models and LLMs (uses OpenAI by default)
🚀 Installation
pip install softrag
🔧 Basic Usage
from softrag import Rag
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
chat = ChatOpenAI(model="gpt-4o")
embed = OpenAIEmbeddings(model="text-embedding-3-small")
rag = Rag(embed_model=embed, chat_model=chat) # default splitter (RCTS)
rag.add_file("document.pdf")
rag.add_web("https://example.com/page")
answer = rag.query("What is the main information in this content?")
print(answer)
📚 Examples
See the examples/ folder for more detailed examples:
- simple.py: Basic example with OpenAI
- local.py: Example using local Transformers models
🔄 How It Works
SoftRAG processes content in five steps, combining keyword and vector search at retrieval time:
- Extraction: Content is extracted from documents and web pages
- Splitting: Text is divided into smaller chunks
- Indexing: Each chunk is indexed by text (SQLite FTS5) and vector embedding
- Retrieval: Queries combine keyword search and vector similarity
- Generation: The most relevant chunks are sent to the LLM along with the question
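
The retrieval step above can be sketched in plain Python. This is only an illustration of the idea of merging a keyword ranking with a vector ranking. It does not use SQLite FTS5 or sqlite-vec, and the fusion method shown (reciprocal rank fusion) is an assumption, since the README does not say how softrag combines the two signals. All function names and the toy data are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_rank(query: str, chunks: dict[int, str]) -> list[int]:
    """Rank chunk ids by how many query terms they contain (stand-in for FTS5)."""
    terms = set(query.lower().split())
    scores = {cid: len(terms & set(text.lower().split())) for cid, text in chunks.items()}
    hits = [cid for cid, score in scores.items() if score > 0]
    return sorted(hits, key=lambda cid: -scores[cid])


def vector_rank(query_vec: list[float], vectors: dict[int, list[float]]) -> list[int]:
    """Rank chunk ids by cosine similarity to the query embedding."""
    return sorted(vectors, key=lambda cid: -cosine(query_vec, vectors[cid]))


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Merge several rankings: each list adds 1 / (k + position) to an id's score."""
    fused: dict[int, float] = {}
    for ranking in rankings:
        for position, cid in enumerate(ranking, start=1):
            fused[cid] = fused.get(cid, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=lambda cid: -fused[cid])


def hybrid_rank(query, query_vec, chunks, vectors):
    """Combine the keyword and vector rankings into one ordered list of chunk ids."""
    return reciprocal_rank_fusion(
        [keyword_rank(query, chunks), vector_rank(query_vec, vectors)]
    )


# Toy corpus with made-up 2-dimensional "embeddings".
chunks = {
    1: "sqlite stores documents in one file",
    2: "embeddings capture semantic meaning",
    3: "cats sleep most of the day",
}
vectors = {1: [1.0, 0.0], 2: [0.8, 0.6], 3: [0.0, 1.0]}

print(hybrid_rank("sqlite file", [1.0, 0.1], chunks, vectors))
```

A chunk that ranks well in both lists accumulates score from each, so it rises above chunks that match on only one signal. In a real pipeline the top few ids would then be looked up and sent to the LLM together with the question.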
🤝 Contributing
Contributions are welcome! Please feel free to submit Pull Requests.
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
Developed with ❤️ for the AI community