
Motion RAG

Motion RAG is a local-first Python SDK for adaptive document retrieval.

This repository currently contains the package foundation for V1:

  • package metadata and installable layout
  • shared domain and config models
  • core protocols for later phases
  • a narrow MotionRAG public facade
  • logging and test scaffolding
  • deterministic chunk contextualization
  • vector embeddings with OpenRouter integration
  • local-first in-memory SQLite indexing primitives
  • optional pgvector and MongoDB vector backends

The implementation intentionally grows one phase at a time so later retrieval and generation layers can build on stable models and inspectable metadata.

To exercise the current pipeline against a real local file or directory, run:

python3 scripts/test_real_file.py /path/to/file-or-folder

Add --json for structured output.

To benchmark retrieval and answer latency against real local files, run:

python3 scripts/benchmark_real_queries.py /path/to/file-or-folder --dataset queries.json

To benchmark the bundled sample PDFs with a ready-made 4-question dataset and detailed per-query evidence output, run:

python3 scripts/benchmark_pdf_embeddings.py

To compare local fallback embeddings against OpenRouter embeddings on the same query set, run:

python3 scripts/benchmark_real_queries.py /path/to/file-or-folder --dataset queries.json --compare-embeddings

For stronger PDF structure extraction, Motion RAG now uses Docling for PDF parsing. Installing the package in editable mode pulls in the Docling dependency:

python3 -m pip install -e .

Then parse a PDF:

python3 scripts/test_real_file.py /path/to/file.pdf

The Docling path maps the structured DoclingDocument body tree directly into Motion RAG sections and blocks. Picture content is skipped for now, while cleaner body text, list and table structure, and page provenance signals are preserved.
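That mapping can be pictured as a simple tree walk. The item shapes below are illustrative stand-ins, not the real DoclingDocument or Motion RAG types:

```python
def map_body(items: list[dict]) -> list[dict]:
    """Toy body-tree mapping: keep text-bearing blocks, drop pictures."""
    blocks = []
    for item in items:
        if item["kind"] == "picture":
            continue  # picture content is skipped for now
        blocks.append({
            "kind": item["kind"],      # text / list / table
            "text": item["text"],
            "page": item.get("page"),  # page provenance signal
        })
    return blocks

blocks = map_body([
    {"kind": "text", "text": "Refunds are issued in 14 days.", "page": 1},
    {"kind": "picture", "text": "", "page": 1},
    {"kind": "table", "text": "fee | amount", "page": 2},
])
```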

For vector embeddings, Motion RAG uses OpenRouter when OPENROUTER_API_KEY is set. Without a key, it falls back to a deterministic local embedding implementation so tests and offline runs still work.
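As an illustration of how such a fallback can stay deterministic (a sketch, not Motion RAG's actual implementation): hash each token into a bucket of a fixed-dimension vector and L2-normalize, so identical text always produces an identical vector.

```python
import hashlib
import math

def fallback_embed(text: str, dim: int = 256) -> list[float]:
    """Toy deterministic embedding: hash tokens into a fixed-size vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # sha256 is stable across runs, unlike Python's salted hash().
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

Because the vector depends only on the text, offline tests get stable nearest-neighbor results without any API key.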

Vector storage defaults to in-memory SQLite for local runs. For persistent backends, set index.vector_backend to pgvector or mongo and provide the matching connection settings in IndexConfig.
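A backend switch might look like the following fragment. `index.vector_backend` is described above; the exact names of the connection settings are not shown here, so treat this as a sketch:

```python
from motion_rag import MotionRAG, MotionRAGConfig, IndexConfig

# Default: in-memory SQLite, nothing to configure.
local = MotionRAG(MotionRAGConfig())

# Persistent pgvector backend; the matching connection settings
# also live in IndexConfig.
persistent = MotionRAG(
    MotionRAGConfig(index=IndexConfig(vector_backend="pgvector"))
)
```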

Reranker

Motion RAG includes two rerankers:

  1. ScoreFusionReranker (default) — deterministic fusion of vector, sparse, lexical, and context signals. Zero model loading, zero API calls.
  2. CrossEncoderReranker (optional) — neural cross-encoder via sentence-transformers for stronger ranking quality.

Enable the cross-encoder reranker

from motion_rag import MotionRAG, MotionRAGConfig, IndexConfig

config = MotionRAGConfig(
    index=IndexConfig(
        reranker_model="cross-encoder/ms-marco-MiniLM-L-4-v2",
        reranker_batch_size=32,
        reranker_max_candidates=50,
    )
)
rag = MotionRAG(config)

The default cross-encoder is cross-encoder/ms-marco-MiniLM-L-4-v2 (~23 MB, fast on CPU). A higher-quality alternative is cross-encoder/ms-marco-MiniLM-L-6-v2 (~80 MB).

If reranker_model is left empty, the pipeline falls back to ScoreFusionReranker.

Benchmark the reranker in isolation

To measure reranker latency and ranking quality without embedding or index-build noise:

# Using a query dataset
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --dataset queries.json --repeats 10

# Inline queries
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --query "What are the refund rules?" --repeats 10

# Skip cross-encoder (fast smoke test)
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --dataset queries.json --no-cross-encoder

# Custom cross-encoder model
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --dataset queries.json --cross-encoder-model cross-encoder/ms-marco-MiniLM-L-6-v2

# Structured JSON output
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --dataset queries.json --json

The isolated benchmark:

  • ingests documents once
  • generates a fixed candidate pool via vector + sparse retrieval
  • reranks the same candidates through every configured reranker
  • reports mean / p95 / p99 latency, ranking stability, and recall metrics
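The summary statistics in the last step can be sketched with the standard library (the script's actual implementation may differ):

```python
import statistics

def latency_stats(samples_ms: list[float]) -> dict[str, float]:
    """Mean plus tail percentiles over repeated timings."""
    # quantiles(n=100) yields the 1st..99th percentile cut points.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {
        "mean": statistics.fmean(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }
```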

CLI

motion-rag ingest ./docs --index docs
motion-rag query docs "What are the refund rules?"
motion-rag retrieve docs "What are the refund rules?" --debug
motion-rag eval docs --dataset eval.json

SDK

from motion_rag import MotionRAG, MotionRAGConfig

rag = MotionRAG(MotionRAGConfig())
rag.index.create(name="docs", source="./docs")
result = rag.query.ask(index="docs", question="What are the refund rules?")

Docs

The VitePress source lives in docs/.

Publishing

Releases are tagged automatically from main, then published to PyPI from the GitHub release workflow.

After release, install with:

pip install motion-rag

Benchmark Results

Run on 2 PDFs (217 chunks, 4 queries) using scripts/benchmark_quality.py with Gemini 2.5 Flash Lite as quality evaluator.

Master Summary — Latency, Recall & Quality

Variant                       Build  Chunks  R_mean    R_p95     R@1  R@3  R@5  MRR    Qual  Rel  Comp  Backend
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
SQLite + HashEmbed (256d)      3.0s   217    693.5ms  749.8ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
SQLite + OpenRouter (1536d)    2.7s   217    656.5ms  708.5ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
pgvector HNSW + OR (1536d)     4.1s   217    488.3ms  550.3ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw
pgvector brute + OR (1536d)    3.6s   217    540.9ms  783.3ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+none
HNSW + reranker OFF            4.2s   217    529.6ms  537.9ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw
SQLite + ST (384d) ──★──      2.7s   217    617.7ms  634.6ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
pgvec HNSW + ST (384d) ──★──  3.8s   217    506.3ms  581.4ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw

★ = sbert integration (local sentence-transformers, no API calls)

Chunking Strategy Comparison

Strategy         Chunks  Build   R_mean    R@3  MRR    Quality  Ev
────────────────────────────────────────────────────────────────────
section            147   2.5s    602.1ms  1.00 1.0000  5.00     6
recursive_token    132   2.0s    594.1ms  1.00 1.0000  5.00     6
parent_child       282   3.2s    649.9ms  1.00 1.0000  5.00     7
auto               217   3.0s    734.6ms  1.00 1.0000  5.00     7

Speedup vs SQLite HashEmbedding (baseline)

Variant                          Latency    Speedup  Savings
──────────────────────────────────────────────────────────────
SQLite + HashEmbed (256d)        693.5ms    1.00x     0.0%
SQLite + OpenRouter (1536d)      656.5ms    1.06x     5.3%
pgvector HNSW + OR (1536d)       488.3ms    1.42x    29.6%  ← fastest
pgvector brute + OR (1536d)      540.9ms    1.28x    22.0%
HNSW + reranker OFF              529.6ms    1.31x    23.6%
SQLite + ST (384d)               617.7ms    1.12x    10.9%  ← zero network
pgvector HNSW + ST (384d)        506.3ms    1.37x    27.0%

Per-Query Quality (overall 1-5)

Question                          Hash   SQL+OR  pg+HNSW  ST    pg+ST
──────────────────────────────────────────────────────────────────────
Sofitel Strasbourg?                5      5       5        5     5
Sofitel eco-cert 2025?             5      5       5        5     5
Barton Hills home size?            5      5       5        5     5
Email for guidance?                5      5       1        5     1

Algorithm Comparison — Key Metrics

Metric                  HashEmbed  SQL+OR   pg+HNSW  pg+brute  NoRerank  Best
──────────────────────────────────────────────────────────────────────────────
Retrieve Mean            693.5ms   656.5ms  488.3ms  540.9ms   529.6ms   → pgvector HNSW
Retrieve P95             749.8ms   708.5ms  550.3ms  783.3ms   537.9ms   → HNSW + NO rerank
Build Time                 3.0s      2.7s     4.1s     3.6s      4.2s    → SQLite + OpenRouter
Recall@1/3/5/MRR          1.00      1.00     1.00     1.00      1.00     → ALL EQUAL
Quality (Overall)         5.00      5.00     4.00     4.00      4.00     → SQLite variants

Key Takeaways

  • pgvector HNSW is fastest: 488ms mean (1.42x speedup), lowest P95 jitter.
  • All backends have equal recall: R@1/3/5 = 1.00 and MRR = 1.000 for every variant.
  • sbert local embedding needs zero network: 618ms mean, same quality as OpenRouter, no API key needed.
  • The cross-encoder reranker improves ranking but adds ~300-800ms on the first run (model loading); cached afterwards.
  • Chunking strategy doesn't affect quality: section, recursive, and parent-child all achieve 5.0 quality.
  • Turning the reranker off saves ~3%, with negligible quality difference on simple queries.
  • pgvector has a quality blind spot: it scores 1 vs 5 on 1 of 4 queries, likely a <=> operator encoding difference.

Reranker Benchmark Results

Run on 2 PDFs (217 chunks, 4 queries, 10 repeats) using scripts/benchmark_reranker.py in isolated mode — the same candidate pool is fed to every reranker, so latency differences are purely from the reranking step.

Isolated Reranker Latency

Reranker                              Mean       P95        P99        Stdev      R@1  R@3  R@5  MRR    Stab
────────────────────────────────────────────────────────────────────────────────────────────────────────────
Noop (passthrough)                      0.00ms     0.00ms     0.00ms     0.00ms   1.00 1.00 1.00 1.0000 1.00
ScoreFusion (deterministic)             0.78ms     0.98ms     0.98ms     0.10ms   1.00 1.00 1.00 1.0000 1.00
CrossEncoder (ms-marco-MiniLM-L-4)    539.12ms  4240.21ms  4240.21ms  1302.88ms  1.00 1.00 1.00 1.0000 1.00

CrossEncoder first run includes model download / load (~1.8 s). Warm queries settle to ~110-130 ms.

Key Takeaways

  • ScoreFusion is effectively free: <1 ms mean, zero dependencies.
  • CrossEncoder adds ~100-130 ms warm; the first run is dominated by model loading.
  • All rerankers have perfect recall on this set: R@1/3/5 = 1.00, MRR = 1.0000.
  • Rankings are perfectly stable: stability = 1.00 across all rerankers (no jitter).

Run the Benchmark Yourself

# Full quality benchmark (requires OpenRouter API key)
python3 scripts/benchmark_quality.py

# Latency + recall benchmark
python3 scripts/benchmark_comprehensive.py

# Reranker-only benchmark (isolated latency & ranking quality)
python3 scripts/benchmark_reranker.py /path/to/file-or-folder --dataset queries.json --repeats 10

Download files

Download the file for your platform.

Source Distribution

motion_rag-0.3.0.tar.gz (64.0 kB)

Built Distribution

motion_rag-0.3.0-py3-none-any.whl (75.3 kB)

File details

Details for the file motion_rag-0.3.0.tar.gz.

File metadata

  • Download URL: motion_rag-0.3.0.tar.gz
  • Upload date:
  • Size: 64.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for motion_rag-0.3.0.tar.gz
Algorithm Hash digest
SHA256 d0c0b5a17de1f7d875742d84a057843e976929fdcaad77ff29806ba699535bf2
MD5 53421d90433e9d49a0d732379786b0fb
BLAKE2b-256 394c682c9a22d09cff9da454afb27e0137c58778602f7230cffa17f6c33cc359


Provenance

The following attestation bundles were made for motion_rag-0.3.0.tar.gz:

Publisher: release.yml on Rvey/motion_rag

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file motion_rag-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: motion_rag-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 75.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for motion_rag-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 981b074b04c08c456368b583b89b5469d7a0e967e2af2f2105f149915566c2d8
MD5 f1f1a645eb34087774f6507279e91674
BLAKE2b-256 1c01a66083f5c1a8c05be560542b2636229e604d98306d4e29daa7a34dda9053


Provenance

The following attestation bundles were made for motion_rag-0.3.0-py3-none-any.whl:

Publisher: release.yml on Rvey/motion_rag

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
