MothRAG: adaptive multi-arm subset routing for multi-hop retrieval-augmented generation — high-level Python API with auto-defaults.

MOTHRAG

Training-free multi-hop question answering at research-SOTA parity — on commodity LLM APIs alone.

Author: Julian Geymonat · Research supported by: ItalySoft srl · License: Apache-2.0 · Paper: arXiv link pending · Zenodo preprint DOI: 10.5281/zenodo.20668567 · Site: mothrag.com

Every component — reader, embedder, retrieval judges — sits behind a commodity pay-per-call API. No local GPU, no constrained decoding, no non-commercially-licensed model. Deployment is a package install plus API keys.

Results (paper, n=1000 per dataset, Llama-3.3-70B reader, single uniform configuration)

| System | Deployment profile | HotpotQA | 2WikiMultiHopQA | MuSiQue | AVG |
|---|---|---|---|---|---|
| HippoRAG 2 (as published) | offline OpenIE graph + NV-Embed-v2 peak | 75.5 | 71.0 | 48.6 | 65.0 |
| CoRAG (training-based, as reproduced) | trained chain-retrieval | 75.1 | 75.1 | 52.9 | 67.7 |
| NeocorRAG (as published) | GPU-bound constrained decoding + NV-Embed-v2 | 78.3 | 76.1 | 52.6 | 69.0 |
| MOTHRAG (ours) | commodity APIs only | 78.1 | 76.3 | 50.5 | 68.3 |

All scores are F1; competitor numbers are as published in the cited sources (same reader class). MOTHRAG attains the highest average F1 among commercially deployable frameworks, within 0.7 points of the GPU-bound research state of the art (NeocorRAG): near parity on HotpotQA (−0.2), a slight edge on 2WikiMultiHopQA (+0.2), and a real gap on MuSiQue (−2.1).
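The AVG column and the per-dataset gaps quoted above can be rechecked with a few lines of arithmetic. The sketch below only copies the F1 values from the results table; the names `RESULTS`, `average_f1`, and `gaps` are illustrative and are not part of the mothrag package.

```python
# F1 per dataset, copied from the results table (not recomputed from raw outputs).
RESULTS = {
    "HippoRAG 2": {"HotpotQA": 75.5, "2WikiMultiHopQA": 71.0, "MuSiQue": 48.6},
    "CoRAG": {"HotpotQA": 75.1, "2WikiMultiHopQA": 75.1, "MuSiQue": 52.9},
    "NeocorRAG": {"HotpotQA": 78.3, "2WikiMultiHopQA": 76.1, "MuSiQue": 52.6},
    "MOTHRAG": {"HotpotQA": 78.1, "2WikiMultiHopQA": 76.3, "MuSiQue": 50.5},
}


def average_f1(scores):
    """Unweighted mean over the datasets, rounded to one decimal like the table."""
    return round(sum(scores.values()) / len(scores), 1)


def gaps(system, reference):
    """Per-dataset F1 difference (system minus reference), one decimal."""
    return {
        dataset: round(RESULTS[system][dataset] - RESULTS[reference][dataset], 1)
        for dataset in RESULTS[system]
    }


averages = {name: average_f1(scores) for name, scores in RESULTS.items()}
deltas = gaps("MOTHRAG", "NeocorRAG")
avg_gap = round(averages["MOTHRAG"] - averages["NeocorRAG"], 1)

print(averages)
print(deltas, avg_gap)
```

The comparison is against NeocorRAG because it has the highest average in the table; swapping the `reference` argument gives the gaps to any other row.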

Measured cost: $0.032/query (reader + retrieval judge, measured over 3,000 queries). A documented economy tier (a one-flag retrieval-judge swap) runs at ≈$0.018/query (−44%) at statistical parity on HotpotQA and 2WikiMultiHopQA, with a measured trade-off only on MuSiQue (−2.12 F1). All SOTA-parity claims apply to the full configuration.
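For budgeting, the two per-query figures can be turned into a relative saving and a projected bill. This is a back-of-the-envelope sketch using only the costs quoted above; the query volume is a made-up example, and real costs depend on provider pricing at the time of use.

```python
FULL_COST_PER_QUERY = 0.032     # USD, full configuration (from the text above)
ECONOMY_COST_PER_QUERY = 0.018  # USD, economy tier (approximate, from the text above)


def relative_change(new, old):
    """Percentage change from old to new; negative means cheaper."""
    return (new - old) / old * 100


def projected_cost(queries, per_query):
    """Total spend in USD for a given number of queries."""
    return queries * per_query


saving = relative_change(ECONOMY_COST_PER_QUERY, FULL_COST_PER_QUERY)
print(f"economy vs full: {saving:.0f}%")

queries = 100_000  # hypothetical monthly volume, for illustration only
print(f"full:    ${projected_cost(queries, FULL_COST_PER_QUERY):,.2f}")
print(f"economy: ${projected_cost(queries, ECONOMY_COST_PER_QUERY):,.2f}")
```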

Answers are proof-tree-structured: each output carries inspectable reasoning steps over the assembled evidence, with a γ-cap fallback when the grounding check cannot be satisfied within budget.

Install

pip install mothrag

# Recommended baseline (Gemini embeddings + Groq Llama-3.3-70B reader):
pip install 'mothrag[gemini,openai]'

# Full production stack:
pip install 'mothrag[prod]'

| Extra | Pulls | Used by |
|---|---|---|
| gemini | google-genai | GeminiEmbedder, GeminiReader |
| openai | openai | OpenAIReader, GroqReader (Groq's OpenAI-compatible API) |
| sentence-transformers | sentence-transformers | local embedding fallback |
| retrieval | scikit-learn, networkx, rank-bm25 | classic-RAG features |
| faiss | faiss-cpu | vector store for 100k–10M chunk corpora |
| prod | bundles the above + loaders | full stack |
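To see which extras are usable in the current environment, one option is to probe for the import names of the packages listed above. This is a standalone sketch, not a mothrag feature: the mapping from extra to import name (`google.genai`, `sklearn`, `rank_bm25`, `faiss`, and so on) is an assumption based on how those distributions are normally imported, and `prod` is left out because its loader dependencies are not listed here.

```python
from importlib.util import find_spec

# Assumed import names for the distributions each extra pulls in.
EXTRA_MODULES = {
    "gemini": ["google.genai"],
    "openai": ["openai"],
    "sentence-transformers": ["sentence_transformers"],
    "retrieval": ["sklearn", "networkx", "rank_bm25"],
    "faiss": ["faiss"],
}


def module_available(name):
    """True if the module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. 'google') is missing.
        return False


def available_extras():
    """Map each extra to True when all of its modules are importable."""
    return {
        extra: all(module_available(module) for module in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:24s} {'installed' if ok else 'missing'}")
```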

Quickstart

from mothrag import MothRAG

# Works out of the box (degrades gracefully without keys);
# set GROQ_API_KEY + GEMINI_API_KEY for production quality.
m = MothRAG.from_documents([
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
])
result = m.query("In which country is the Eiffel Tower?")
print(result.answer)         # the answer
print(result.arm_used)       # which reasoning arm won arbitration
print(result.confidence)     # arbitration confidence

API keys via environment (see .env.example): GROQ_API_KEY (reader), GEMINI_API_KEY (embedder + grounding judge), ANTHROPIC_API_KEY (premium retrieval judge; optional — the economy tier uses Gemini).
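Before running a query it can help to check which of these keys are actually set. The helper below is a hypothetical convenience written for this README, not part of the mothrag API; it only reports which keys are present and makes no claim about how the library itself selects a tier.

```python
import os

# Environment variables and their roles, as listed above.
KEY_ROLES = {
    "GROQ_API_KEY": "reader",
    "GEMINI_API_KEY": "embedder + grounding judge",
    "ANTHROPIC_API_KEY": "premium retrieval judge (optional)",
}


def key_report(env=None):
    """Return {variable name: True if set to a non-empty value}."""
    env = os.environ if env is None else env
    return {name: bool(env.get(name)) for name in KEY_ROLES}


def describe(report):
    """Summarise what the available keys would allow (illustrative labels)."""
    if not (report["GROQ_API_KEY"] and report["GEMINI_API_KEY"]):
        return "degraded: reader or embedder key missing"
    if report["ANTHROPIC_API_KEY"]:
        return "premium-capable: all three keys present"
    return "economy-capable: Gemini serves as retrieval judge"


if __name__ == "__main__":
    report = key_report()
    for name, present in report.items():
        print(f"{name:20s} {'set' if present else 'missing':8s} {KEY_ROLES[name]}")
    print(describe(report))
```

Passing a plain dict as `env` makes the check easy to exercise in tests without touching the real environment.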

How it works

Three separable stages; every model invocation is a commodity API call:

  1. Bridge retrieval substrate — multi-query ANN fusion re-ranked by a tripartite LLM judge conditioned on retrieved bridge evidence, reshaping every retrieval (primary, sub-question, iterative). A post-retrieval ChainFilter re-scores the ranking by chain density over OpenIE triples, gated by input features only.
  2. Four-arm ensemble pool — direct read, decomposition, iterative refinement (γ-driven re-retrieval), and Pool-Duplicate Dispatch (PDD): a deterministic copy of the iterative arm's candidate that double-weights the grounding-checked voice in arbitration at zero extra inference. Pool cardinality is fixed at N=4 (five-arm pools regressed consistently).
  3. Deterministic arbitration — fixed weights over grounding status (γ, 1.0), cross-arm agreement (0.5), and faithfulness (0.3). No learned components anywhere; all gates condition on input features of the question, never on dataset identity.
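To make stages 2 and 3 concrete, here is a minimal sketch of fixed-weight arbitration over a four-arm pool. Only the three weights (1.0, 0.5, 0.3) and the idea of PDD as a copy of the iterative arm's candidate come from the description above. Everything else is an assumption for illustration: the `Candidate` fields, the linear scoring formula, the definition of agreement as the fraction of other pool members giving the same normalized answer, and the tie-break by pool order. The package's actual implementation may differ.

```python
from dataclasses import dataclass, replace

# Fixed weights from the description above.
W_GROUNDING, W_AGREEMENT, W_FAITHFULNESS = 1.0, 0.5, 0.3


@dataclass(frozen=True)
class Candidate:
    arm: str             # "direct", "decomposition", "iterative", or "pdd"
    answer: str
    grounded: bool       # did the grounding check pass? (assumed boolean)
    faithfulness: float  # assumed to lie in [0, 1]


def normalize(answer):
    """Case- and whitespace-insensitive comparison key."""
    return " ".join(answer.lower().split())


def build_pool(direct, decomposition, iterative):
    """Four-arm pool: PDD is a copy of the iterative candidate, no extra inference."""
    return [direct, decomposition, iterative, replace(iterative, arm="pdd")]


def score(candidate, pool):
    """Weighted sum of grounding, cross-arm agreement, and faithfulness."""
    others = [c for c in pool if c is not candidate]
    same = sum(normalize(c.answer) == normalize(candidate.answer) for c in others)
    agreement = same / len(others)
    return (
        W_GROUNDING * float(candidate.grounded)
        + W_AGREEMENT * agreement
        + W_FAITHFULNESS * candidate.faithfulness
    )


def arbitrate(pool):
    """Highest score wins; max() keeps the earliest candidate on ties."""
    return max(pool, key=lambda c: score(c, pool))


pool = build_pool(
    Candidate("direct", "France", grounded=False, faithfulness=0.6),
    Candidate("decomposition", "Italy", grounded=False, faithfulness=0.5),
    Candidate("iterative", "france", grounded=True, faithfulness=0.8),
)
winner = arbitrate(pool)
print(winner.arm, winner.answer, round(score(winner, pool), 3))
```

Because the PDD copy always agrees with the iterative candidate, any arm that matches the iterative answer gains agreement from two pool members instead of one, which is one way to read "double-weights the grounding-checked voice".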

Reproducing the paper

The paper numbers come from the evaluation configuration (scripts/route_prospective.py with the full flag set), not from the high-level quickstart API. See paper/REPRODUCE.md for the verbatim CLI, required inputs, and expected outputs. The per-query outputs behind every table in the paper are released in paper/results/ (six JSONs: three datasets × premium/economy tiers, n=1000 each).
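When inspecting the released per-query outputs, it may be useful to recompute answer F1 yourself. The sketch below implements the token-level F1 conventionally used for HotpotQA-style benchmarks (lowercasing, stripping punctuation and articles, then token overlap). It is an assumption that the paper uses this exact variant, and the layout of the JSON files is not documented here, so the function takes plain strings rather than guessing field names; paper/REPRODUCE.md remains the authoritative procedure.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def mean_f1(pairs):
    """Average F1 (as a percentage) over (prediction, gold) pairs."""
    scores = [token_f1(prediction, gold) for prediction, gold in pairs]
    return 100 * sum(scores) / len(scores)


print(token_f1("The Eiffel Tower", "Eiffel Tower"))
print(round(token_f1("Paris, France", "France"), 3))
```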

Citing

See CITATION.cff. Until the arXiv ID is live, cite the Zenodo preprint:

@misc{geymonat2026mothrag,
  title  = {MOTHRAG: Training-Free Multi-Hop Question Answering at Research-SOTA Parity on Commodity LLM APIs},
  author = {Geymonat, Julian},
  year   = {2026},
  doi    = {10.5281/zenodo.20668567},
  url    = {https://doi.org/10.5281/zenodo.20668567}
}

License

Apache-2.0. © 2026 Julian Geymonat. Research supported by ItalySoft srl.
