MothRAG: adaptive multi-arm subset routing for multi-hop retrieval-augmented generation — high-level Python API with auto-defaults.
MOTHRAG
Training-free multi-hop question answering at research-SOTA parity — on commodity LLM APIs alone.
Author: Julian Geymonat · Research supported by: ItalySoft srl · License: Apache-2.0 · Paper: arXiv link pending · Zenodo preprint DOI: 10.5281/zenodo.20668567 · Site: mothrag.com
Every component — reader, embedder, retrieval judges — sits behind a commodity pay-per-call API. No local GPU, no constrained decoding, no non-commercially-licensed model. Deployment is a package install plus API keys.
Results (paper, n=1000 per dataset, Llama-3.3-70B reader, single uniform configuration)
| System | Deployment profile | HotpotQA | 2WikiMultiHopQA | MuSiQue | AVG |
|---|---|---|---|---|---|
| HippoRAG 2 (as published) | offline OpenIE graph + NV-Embed-v2 peak | 75.5 | 71.0 | 48.6 | 65.0 |
| CoRAG (training-based, as reproduced) | trained chain-retrieval | 75.1 | 75.1 | 52.9 | 67.7 |
| NeocorRAG (as published) | GPU-bound constrained decoding + NV-Embed-v2 | 78.3 | 76.1 | 52.6 | 69.0 |
| MOTHRAG (ours) | commodity APIs only | 78.1 | 76.3 | 50.5 | 68.3 |
All scores are F1; competitor numbers are as reported in the cited sources (same reader class). MOTHRAG attains the highest average F1 among commercially deployable frameworks (68.3), within 0.7 points of NeocorRAG, the GPU-bound research state of the art (69.0). Per dataset against NeocorRAG: near parity on HotpotQA (−0.2), a small edge on 2WikiMultiHopQA (+0.2), and a clear gap on MuSiQue (−2.1).
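The averages and per-dataset deltas quoted above can be rechecked directly from the table. The snippet below is only an arithmetic sanity check over the published numbers; the dictionary layout and helper name are illustrative and not part of the mothrag package.

```python
# F1 scores copied from the results table, in the order
# (HotpotQA, 2WikiMultiHopQA, MuSiQue).
scores = {
    "HippoRAG 2": (75.5, 71.0, 48.6),
    "CoRAG": (75.1, 75.1, 52.9),
    "NeocorRAG": (78.3, 76.1, 52.6),
    "MOTHRAG": (78.1, 76.3, 50.5),
}


def average(row):
    """Mean F1 across the three datasets, rounded to one decimal like the table."""
    return round(sum(row) / len(row), 1)


averages = {name: average(row) for name, row in scores.items()}

# Per-dataset difference between MOTHRAG and the strongest competitor row.
deltas = tuple(
    round(ours - theirs, 1)
    for ours, theirs in zip(scores["MOTHRAG"], scores["NeocorRAG"])
)
avg_gap = round(averages["MOTHRAG"] - averages["NeocorRAG"], 1)

print("averages:", averages)
print("MOTHRAG minus NeocorRAG per dataset:", deltas)
print("average gap:", avg_gap)
```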
Measured cost: $0.032/query (reader + retrieval-judge, measured over 3,000 queries). A documented economy tier (one-flag retrieval-judge swap) runs at ≈$0.018/query (−44%) at statistical parity on HotpotQA and 2WikiMultiHopQA, with a measured trade-off only on MuSiQue (−2.12). All SOTA-parity claims attach to the full configuration.
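As a rough budgeting aid, the two per-query figures above can be turned into a relative saving and a projected spend. This is a sketch over the quoted prices only; the 10,000-query workload is a hypothetical example, and real costs depend on provider pricing at the time of use.

```python
# Per-query costs in USD, taken from the paragraph above.
FULL_COST_PER_QUERY = 0.032     # full configuration (measured)
ECONOMY_COST_PER_QUERY = 0.018  # economy tier (approximate)


def relative_change(new, old):
    """Percentage change from old to new (negative means cheaper)."""
    return (new - old) / old * 100


def projected_cost(per_query, n_queries):
    """Total spend in USD for n_queries at a flat per-query price."""
    return per_query * n_queries


saving_pct = relative_change(ECONOMY_COST_PER_QUERY, FULL_COST_PER_QUERY)
print(f"economy tier vs full configuration: {saving_pct:.2f}%")

# Hypothetical workload size, chosen only for illustration.
n_queries = 10_000
print(f"full:    ${projected_cost(FULL_COST_PER_QUERY, n_queries):.2f}")
print(f"economy: ${projected_cost(ECONOMY_COST_PER_QUERY, n_queries):.2f}")
```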
Answers are proof-tree-structured: each output carries inspectable reasoning steps over the assembled evidence, with a γ-cap fallback when the grounding check cannot be satisfied within budget.
Install
pip install mothrag
# Recommended baseline (Gemini embeddings + Groq Llama-3.3-70B reader):
pip install 'mothrag[gemini,openai]'
# Full production stack:
pip install 'mothrag[prod]'
| Extra | Pulls | Used by |
|---|---|---|
| gemini | google-genai | GeminiEmbedder, GeminiReader |
| openai | openai | OpenAIReader, GroqReader (Groq's OpenAI-compatible API) |
| sentence-transformers | sentence-transformers | local embedding fallback |
| retrieval | scikit-learn, networkx, rank-bm25 | classic-RAG features |
| faiss | faiss-cpu | vector store for 100k–10M chunk corpora |
| prod | bundles the above + loaders | full stack |
Quickstart
from mothrag import MothRAG
# Works out of the box (degrades gracefully without keys);
# set GROQ_API_KEY + GEMINI_API_KEY for production quality.
m = MothRAG.from_documents([
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
])
result = m.query("In which country is the Eiffel Tower?")
print(result.answer) # the answer
print(result.arm_used) # which reasoning arm won arbitration
print(result.confidence) # arbitration confidence
API keys via environment (see .env.example): GROQ_API_KEY (reader), GEMINI_API_KEY (embedder + grounding judge), ANTHROPIC_API_KEY (premium retrieval judge; optional — the economy tier uses Gemini).
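Because the quickstart degrades gracefully when keys are absent, it can help to check the environment explicitly before a production run. The following is a small standalone sketch using only the standard library; the variable names come from the paragraph above, while the `missing_keys` helper is hypothetical and not part of the mothrag API.

```python
import os

# Environment variable names from the paragraph above, with their roles.
REQUIRED_KEYS = {
    "GROQ_API_KEY": "reader",
    "GEMINI_API_KEY": "embedder + grounding judge",
}
OPTIONAL_KEYS = {
    "ANTHROPIC_API_KEY": "premium retrieval judge",
}


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


for name in missing_keys():
    print(f"missing {name} ({REQUIRED_KEYS[name]})")

for name, role in OPTIONAL_KEYS.items():
    state = "set" if os.environ.get(name) else "not set"
    print(f"optional {name} ({role}): {state}")
```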
How it works
Three separable stages; every model invocation is a commodity API call:
- Bridge retrieval substrate — multi-query ANN fusion re-ranked by a tripartite LLM judge conditioned on retrieved bridge evidence, reshaping every retrieval (primary, sub-question, iterative). A post-retrieval ChainFilter re-scores the ranking by chain density over OpenIE triples, gated by input features only.
- Four-arm ensemble pool — direct read, decomposition, iterative refinement (γ-driven re-retrieval), and Pool-Duplicate Dispatch (PDD): a deterministic copy of the iterative arm's candidate that double-weights the grounding-checked voice in arbitration at zero extra inference. Pool cardinality is fixed at N=4 (five-arm pools regressed consistently).
- Deterministic arbitration — fixed weights over grounding status (γ, 1.0), cross-arm agreement (0.5), and faithfulness (0.3). No learned components anywhere; all gates condition on input features of the question, never on dataset identity.
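To make the arbitration stage concrete, here is a minimal sketch of fixed-weight scoring over a four-member pool. Only the three weights (1.0, 0.5, 0.3) and the idea of duplicating the iterative arm's candidate come from the description above. Everything else is an assumption made for illustration: the `Candidate` fields, the arm names, the weighted-sum form of the score, the exact-match agreement measure, and the tie-breaking rule. It is not the package's implementation.

```python
from dataclasses import dataclass, replace

# Fixed weights quoted in the text above.
W_GROUNDING, W_AGREEMENT, W_FAITHFULNESS = 1.0, 0.5, 0.3


@dataclass(frozen=True)
class Candidate:
    arm: str             # hypothetical arm label
    answer: str
    grounded: bool       # assumed: whether the grounding check passed
    faithfulness: float  # assumed: a score in [0, 1]


def normalize(answer):
    """Case- and whitespace-insensitive form used to compare answers."""
    return " ".join(answer.lower().split())


def agreement(candidate, pool):
    """Fraction of the other pool members that give the same answer."""
    others = [c for c in pool if c is not candidate]
    if not others:
        return 0.0
    same = sum(normalize(c.answer) == normalize(candidate.answer) for c in others)
    return same / len(others)


def score(candidate, pool):
    """Assumed combination rule: a plain weighted sum of the three signals."""
    return (
        W_GROUNDING * float(candidate.grounded)
        + W_AGREEMENT * agreement(candidate, pool)
        + W_FAITHFULNESS * candidate.faithfulness
    )


def arbitrate(candidates):
    """Add a duplicate of the iterative candidate, then pick the top score."""
    pool = list(candidates)
    iterative = next((c for c in pool if c.arm == "iterative"), None)
    if iterative is not None:
        # The copy costs no extra inference; it only adds one more vote.
        pool.append(replace(iterative, arm="pdd"))
    # max() keeps the first maximal element, so ties resolve by pool order.
    return max(pool, key=lambda c: score(c, pool))


winner = arbitrate([
    Candidate("direct", "France", grounded=False, faithfulness=0.6),
    Candidate("decomposition", "Italy", grounded=False, faithfulness=0.9),
    Candidate("iterative", "france", grounded=True, faithfulness=0.8),
])
print(winner.arm, winner.answer)
```

In this toy pool the duplicate raises the agreement share of the grounded answer, which is one plausible reading of how the extra copy "double-weights" that voice. Because every input is fixed and the tie-break is positional, the same candidates always yield the same winner.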
Reproducing the paper
The paper numbers come from the evaluation configuration (scripts/route_prospective.py with the full flag set), not from the high-level quickstart API. See paper/REPRODUCE.md for the verbatim CLI, required inputs, and expected outputs. The per-query outputs behind every table in the paper are released in paper/results/ (six JSONs: three datasets × premium/economy tiers, n=1000 each).
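Before comparing against the paper tables, it can be useful to confirm that the released result files load and to see their top-level shape. The sketch below deliberately assumes nothing about the JSON schema, which is not described here; the `summarize_results` helper is hypothetical, and the directory path is the one named above.

```python
import json
from pathlib import Path


def summarize_results(results_dir):
    """Map each JSON file name to (top-level type name, entry count or None)."""
    summary = {}
    for path in sorted(Path(results_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Only containers have a meaningful length; scalars report None.
        size = len(data) if isinstance(data, (list, dict)) else None
        summary[path.name] = (type(data).__name__, size)
    return summary


if __name__ == "__main__":
    # Run from the repository root; prints nothing if the directory is absent.
    for name, (kind, size) in summarize_results("paper/results").items():
        print(f"{name}: top-level {kind}, {size} entries")
```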
Citing
See CITATION.cff. Until the arXiv ID is live, cite the Zenodo preprint:
@misc{geymonat2026mothrag,
title = {MOTHRAG: Training-Free Multi-Hop Question Answering at Research-SOTA Parity on Commodity LLM APIs},
author = {Geymonat, Julian},
year = {2026},
doi = {10.5281/zenodo.20668567},
url = {https://doi.org/10.5281/zenodo.20668567}
}
License
Apache-2.0. © 2026 Julian Geymonat. Research supported by ItalySoft srl.