MicroRAG

A feature-rich, universal RAG library for Python with ONNX-backed embeddings and DuckDB storage.

Features

Flexible embedding backends - Choose between sentence-transformers (ONNX-optimized) or FastEmbed (lightweight)
DuckDB storage - Persistent vector storage with HNSW indexes for fast similarity search
Three-tier hybrid search - Combines semantic, BM25, and full-text search, fused with Reciprocal Rank Fusion (RRF)
Query preprocessing - Abbreviation expansion and stopword removal for better search
Flexible document input - Accept strings, dicts, or Document objects
Text chunking - Automatic chunking with sentence boundary detection

Why ONNX?

MicroRAG uses ONNX (Open Neural Network Exchange) format for embedding models:

Faster inference - ONNX Runtime provides optimized CPU execution, often 2-3x faster than PyTorch
Smaller footprint - No need for full PyTorch/TensorFlow installation in production
Cross-platform - Same model runs on any platform without framework dependencies
Quantization support - Easy to use INT8/FP16 quantized models for even faster inference

Installation

# Core (no embedding backend - bring your own)
pip install microrag

# Extras are quoted so shells such as zsh do not glob the brackets

# With sentence-transformers backend (ONNX-optimized)
pip install "microrag[sentence-transformers]"

# With FastEmbed backend (lightweight, fast)
pip install "microrag[fastembed]"

# All backends
pip install "microrag[all]"

# For CPU-only PyTorch (with sentence-transformers)
pip install "microrag[sentence-transformers,cpu]"

Quick Start

With sentence-transformers (local model)

from microrag import MicroRAG, RAGConfig

config = RAGConfig(
    model_path="/path/to/all-MiniLM-L6-v2",
    embedding_backend="sentence-transformers",  # or "auto"
    db_path="./rag.duckdb",
    embedding_dim=384,
)

with MicroRAG(config) as rag:
    # Add documents (strings, dicts, or Document objects)
    rag.add_documents([
        "Machine learning is a subset of artificial intelligence.",
        {"content": "Deep learning uses neural networks.", "metadata": {"source": "wiki"}},
    ])

    # Build search indexes
    rag.build_index()

    # Search
    results = rag.search("neural networks", top_k=5)
    for r in results:
        print(f"{r.score:.3f}: {r.content}")

With FastEmbed (auto-download)

from microrag import MicroRAG, RAGConfig

config = RAGConfig(
    model_path="BAAI/bge-small-en-v1.5",  # Model name, auto-downloaded
    embedding_backend="fastembed",
)

with MicroRAG(config) as rag:
    rag.add_documents(["Machine learning is a subset of AI."])
    rag.build_index()
    results = rag.search("neural networks")

Search Pipeline

MicroRAG uses a three-tier hybrid search architecture that combines multiple retrieval methods for better results:

Query: "ML techniques"
         │
         ▼
┌─────────────────────────────────────┐
│      Query Preprocessing            │
│  • Normalize whitespace             │
│  • Expand abbreviations (ML→machine │
│    learning)                        │
│  • Tokenize for BM25                │
└─────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│      Parallel Search                │
│                                     │
│  ┌──────────┐  ┌──────────┐  ┌────────────┐
│  │ Semantic │  │  BM25    │  │    FTS     │
│  │  Search  │  │  Search  │  │   Search   │
│  │ (Vector) │  │(Keywords)│  │ (Stemmed)  │
│  └────┬─────┘  └────┬─────┘  └─────┬──────┘
│       │             │              │
│       ▼             ▼              ▼
│    Results       Results        Results
│   + scores      + scores       + scores
└─────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────┐
│    Reciprocal Rank Fusion (RRF)     │
│                                     │
│  score = Σ 1/(k + rank_i)           │
│                                     │
│  Combines rankings from all methods │
│  with configurable weighting        │
└─────────────────────────────────────┘
         │
         ▼
      Final ranked results
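
The preprocessing stage at the top of the diagram can be sketched in plain Python. This is an illustration only, not MicroRAG's implementation: the function name `preprocess_query`, the tiny stopword set, and the whole-word, case-insensitive abbreviation matching are all assumptions.

```python
import re

# Hypothetical stand-ins for the `abbreviations` and `stopwords` config values.
ABBREVIATIONS = {"ML": "machine learning"}
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def preprocess_query(query, abbreviations=ABBREVIATIONS, stopwords=STOPWORDS):
    """Return (expanded_query, bm25_tokens) for a raw query string."""
    # 1. Normalize whitespace.
    text = " ".join(query.split())

    # 2. Expand abbreviations on whole-word matches, ignoring case.
    lookup = {abbr.lower(): full for abbr, full in abbreviations.items()}
    if lookup:
        pattern = re.compile(
            r"\b(" + "|".join(re.escape(a) for a in lookup) + r")\b",
            re.IGNORECASE,
        )
        text = pattern.sub(lambda m: lookup[m.group(0).lower()], text)

    # 3. Tokenize for BM25: lowercase word tokens minus stopwords.
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in stopwords]
    return text, tokens


expanded, tokens = preprocess_query("  ML   techniques for the web ")
print(expanded)  # machine learning techniques for the web
print(tokens)    # ['machine', 'learning', 'techniques', 'web']
```

In this sketch the expanded string would feed the semantic and full-text searches, while the token list would feed BM25.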

Search Components

Semantic - HNSW vector similarity; understands meaning and context
BM25 - Term frequency scoring; exact keyword matching
FTS - DuckDB full-text search; stemming and linguistic matching
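
To make the BM25 entry concrete, here is a minimal Okapi BM25 scorer in pure Python. It sketches the standard formula and is not MicroRAG's code; `bm25_scores` is a hypothetical helper name, and `k1=1.5` and `b=0.75` are commonly used defaults assumed here.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in corpus_tokens against query_tokens."""
    if not corpus_tokens:
        return []
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Smoothed IDF; the "+ 1" keeps it positive for common terms.
            df = doc_freq[term]
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


corpus = [
    "machine learning is a subset of artificial intelligence".split(),
    "deep learning uses neural networks".split(),
    "duckdb is an analytical database".split(),
]
scores = bm25_scores(["neural", "networks"], corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best)  # 1
```

Only the second document contains the query terms, so it is the only one with a non-zero score. A query phrased as "neural nets" would score zero everywhere, which is the gap that semantic search and stemming are meant to cover.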

Why Hybrid Search?

Each search method has different strengths:

Semantic search finds conceptually similar documents even with different wording
BM25 excels at finding exact keyword matches
FTS handles word variations through stemming

Combining all three with Reciprocal Rank Fusion lets MicroRAG draw on the strengths of each method, which typically gives better recall and precision than any single method alone.

Configuration

from microrag import RAGConfig

config = RAGConfig(
    # Embedding
    model_path="/path/to/model",      # Model path or name
    embedding_backend="auto",         # "auto", "sentence-transformers", "fastembed"

    # Storage
    db_path=":memory:",               # DuckDB path (":memory:" for in-memory)
    embedding_dim=384,                # Embedding vector dimension

    # Chunking
    chunk_size=1000,                  # Max characters per chunk
    chunk_overlap=200,                # Overlap between chunks

    # Search
    hybrid_enabled=True,              # Enable hybrid search
    hybrid_alpha=0.7,                 # Semantic weight (0-1)
    similarity_threshold=0.4,         # Min score threshold

    # Query processing
    abbreviations={"ML": "machine learning"},  # Query expansion
    remove_stopwords=True,            # Remove stopwords for BM25

    # HNSW tuning
    hnsw_ef_construction=200,         # Build-time parameter
    hnsw_ef_search=100,               # Search-time parameter
    hnsw_enable_persistence=False,    # Experimental index persistence
)

Configuration Options

Embedding:

model_path (str) - Model path (sentence-transformers) or model name (fastembed)
embedding_backend (str, default: "auto") - Backend: "auto", "sentence-transformers", "fastembed"
model_file (str, default: None) - ONNX filename (sentence-transformers only)
fastembed_cache_dir (str, default: None) - Cache directory (fastembed only)

Storage:

db_path (str, default: ":memory:") - DuckDB database path
embedding_dim (int, default: 384) - Embedding vector dimension

Chunking:

chunk_size (int, default: 1000) - Text chunking size in characters
chunk_overlap (int, default: 200) - Overlap between chunks
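
One way to read `chunk_size` and `chunk_overlap` together is sketched below: sentences are packed into chunks of at most `chunk_size` characters, and each new chunk starts with trailing sentences from the previous one, up to roughly `chunk_overlap` characters. The regex-based sentence splitter and the function name `chunk_text` are assumptions for illustration; MicroRAG's actual boundary detection may differ.

```python
import re


def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into overlapping chunks that end on sentence boundaries.

    A single sentence longer than chunk_size is kept whole as its own chunk.
    """
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward as overlap.
            overlap, size = [], 0
            for prev in reversed(current):
                if size + len(prev) > chunk_overlap:
                    break
                overlap.insert(0, prev)
                size += len(prev) + 1
            current = overlap
            if len(" ".join(current + [sentence])) > chunk_size:
                current = []  # overlap would overflow the chunk; drop it
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


chunks = chunk_text("One. Two. Three. Four. Five.", chunk_size=12, chunk_overlap=5)
print(chunks)  # ['One. Two.', 'Two. Three.', 'Four. Five.']
```

In the toy run, "Two." is short enough to be repeated as overlap, while "Three." exceeds the 5-character overlap budget and is not carried into the last chunk.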

Search:

hybrid_enabled (bool, default: True) - Enable hybrid search
hybrid_alpha (float, default: 0.7) - Semantic weight in fusion (0-1)
similarity_threshold (float, default: 0.4) - Minimum score to return
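
The interplay of a minimum score and a result limit can be sketched as a filter followed by a sort. The example assumes scores are cosine similarities and uses a brute-force scan in place of the HNSW index; `search_vectors` and the toy 3-dimensional vectors are hypothetical, and how MicroRAG applies the threshold to fused hybrid scores is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_vectors(query_vec, doc_vecs, top_k=10, threshold=0.4):
    """Return up to top_k (doc_id, score) pairs scoring at least threshold."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]


doc_vecs = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.6, 0.8, 0.0],
    "c": [0.0, 0.0, 1.0],
}
hits = search_vectors([1.0, 0.0, 0.0], doc_vecs, top_k=2, threshold=0.4)
print([doc_id for doc_id, _ in hits])  # ['a', 'b']
```

Document `c` is orthogonal to the query, scores 0, and is dropped by the threshold before `top_k` is applied.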

Query Processing:

abbreviations (dict, default: None) - Query expansion mapping
stopwords (set, default: English) - Stopwords for BM25 tokenization
remove_stopwords (bool, default: True) - Enable stopword removal

HNSW Tuning:

hnsw_ef_construction (int, default: 200) - HNSW build parameter
hnsw_ef_search (int, default: 100) - HNSW search parameter
hnsw_enable_persistence (bool, default: False) - Enable experimental HNSW index persistence

API Reference

MicroRAG

Main class for RAG operations.

from microrag import MicroRAG, RAGConfig

config = RAGConfig(model_path="/path/to/model")

# Use as context manager (recommended)
with MicroRAG(config) as rag:
    rag.add_documents([...])
    rag.build_index()
    results = rag.search("query")

# Or manage lifecycle manually
rag = MicroRAG(config)
try:
    results = rag.search("query")  # ... use rag
finally:
    rag.close()

Methods:

add_documents(docs, chunk=True) - Add documents (str, dict, or Document)
build_index() - Build HNSW, BM25, and FTS indexes
search(query, top_k=10, threshold=None, hybrid=None) - Search documents
get_document(doc_id) - Get document by ID
get_all_documents() - Get all documents
count() - Get document count
clear() - Remove all documents
close() - Close resources

Document

Document data model.

from microrag import Document

doc = Document(
    id="doc1",                    # Optional, auto-generated if not provided
    content="Document text...",   # Required
    metadata={"source": "wiki"},  # Optional metadata
)

SearchResult

Search result with score and document data.

results = rag.search("query")

for result in results:
    print(result.score)      # Similarity score
    print(result.content)    # Document content
    print(result.metadata)   # Document metadata
    print(result.document)   # Full Document object

Adding Documents

MicroRAG accepts documents in multiple formats:

# Strings
rag.add_documents([
    "First document content",
    "Second document content",
])

# Dicts with metadata
rag.add_documents([
    {"content": "Document text", "metadata": {"source": "file.txt"}},
    {"id": "custom_id", "content": "Another document"},
])

# Document objects
from microrag import Document

rag.add_documents([
    Document(id="doc1", content="Text", metadata={"key": "value"}),
])

# Disable chunking for pre-chunked content
rag.add_documents(["Already chunked text"], chunk=False)
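
A library that accepts mixed inputs has to normalize them to one shape before storing them. The following is a hypothetical sketch of that step, not MicroRAG's code: the `Doc` dataclass stands in for `microrag.Document`, and UUID-based ID generation is an assumption.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for microrag.Document (assumed fields: id, content, metadata)."""
    content: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    metadata: dict = field(default_factory=dict)


def normalize(item):
    """Coerce a str, dict, or Doc into a Doc."""
    if isinstance(item, Doc):
        return item
    if isinstance(item, str):
        return Doc(content=item)
    if isinstance(item, dict):
        if "content" not in item:
            raise ValueError("dict documents need a 'content' key")
        kwargs = {
            "content": item["content"],
            "metadata": dict(item.get("metadata") or {}),
        }
        if item.get("id") is not None:
            kwargs["id"] = item["id"]  # otherwise an ID is generated
        return Doc(**kwargs)
    raise TypeError(f"unsupported document type: {type(item).__name__}")


docs = [normalize(x) for x in [
    "First document content",
    {"id": "custom_id", "content": "Another document"},
    Doc(id="doc1", content="Text", metadata={"key": "value"}),
]]
print([d.id for d in docs[1:]])  # ['custom_id', 'doc1']
```

Rejecting unknown types early, as the final `raise` does, keeps malformed inputs from reaching the chunking and embedding stages.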

Examples

See the examples/ directory for complete working examples:

basic_usage.py - Core workflow: adding documents, building indexes, searching
advanced_config.py - Custom abbreviations, hybrid search tuning, config variants
faq_search.py - FAQ/knowledge base search with metadata filtering

Run examples with:

make example name=basic_usage
make example name=advanced_config
make example name=faq_search

Development

# Clone and install
git clone https://github.com/bigbag/microrag.git
cd microrag
uv sync --group dev

# Run tests
uv run pytest

# Run linting
uv run ruff check src/ tests/
uv run mypy src/

# Format code
uv run ruff format src/ tests/

License

MIT License - see LICENSE file.

Release history

0.2.2 - Jan 23, 2026
0.2.1 - Jan 23, 2026 (this version)
0.2.0 - Jan 23, 2026
0.1.0 - Jan 18, 2026

Download files

Source Distribution

microrag-0.2.1.tar.gz (120.1 kB)

Uploaded Jan 23, 2026 Source

Built Distribution

microrag-0.2.1-py3-none-any.whl (27.8 kB)

Uploaded Jan 23, 2026 Python 3

File details

Details for the file microrag-0.2.1.tar.gz.

File metadata

Download URL: microrag-0.2.1.tar.gz
Upload date: Jan 23, 2026
Size: 120.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for microrag-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`e3e7e2ee016dfaaaa721197f7ab1435577ea29b5e97f180312cda43fcfaec2b7`
MD5	`6b4d12b3d6293374fa8039e1a1772ffb`
BLAKE2b-256	`d38439284012e281d5874f70d1b8187e8fa258728777f05f51b3914a7684fce4`


Provenance

The following attestation bundles were made for microrag-0.2.1.tar.gz:

Publisher: publish.yml on bigbag/microrag

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: microrag-0.2.1.tar.gz
- Subject digest: e3e7e2ee016dfaaaa721197f7ab1435577ea29b5e97f180312cda43fcfaec2b7
- Sigstore transparency entry: 845825264
- Sigstore integration time: Jan 23, 2026
Source repository:
- Permalink: bigbag/microrag@f608bc87368676fd83e463082c9293b69003c3d8
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/bigbag
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f608bc87368676fd83e463082c9293b69003c3d8
- Trigger Event: push

File details

Details for the file microrag-0.2.1-py3-none-any.whl.

File metadata

Download URL: microrag-0.2.1-py3-none-any.whl
Upload date: Jan 23, 2026
Size: 27.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for microrag-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8472a87ee8a6e1d90b5a60ae4368a22848e2afd208031643fbafde73ec464361`
MD5	`93720722c8fdb75d778b0e4952777944`
BLAKE2b-256	`2eb3cccac433494b3e8ecd400a856c042c11b0db61d2909dfe81b2195786b109`


Provenance

The following attestation bundles were made for microrag-0.2.1-py3-none-any.whl:

Publisher: publish.yml on bigbag/microrag

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: microrag-0.2.1-py3-none-any.whl
- Subject digest: 8472a87ee8a6e1d90b5a60ae4368a22848e2afd208031643fbafde73ec464361
- Sigstore transparency entry: 845825267
- Sigstore integration time: Jan 23, 2026
Source repository:
- Permalink: bigbag/microrag@f608bc87368676fd83e463082c9293b69003c3d8
- Branch / Tag: refs/tags/v0.2.1
- Owner: https://github.com/bigbag
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f608bc87368676fd83e463082c9293b69003c3d8
- Trigger Event: push
