MothRAG: adaptive multi-arm subset routing for multi-hop retrieval-augmented generation — high-level Python API with auto-defaults.

MOTHRAG

Training-free multi-hop question answering at research-SOTA parity — on commodity LLM APIs alone.

Author: Julian Geymonat · Research supported by: ItalySoft srl · License: Apache-2.0 · Paper: arXiv link pending · Zenodo preprint DOI: 10.5281/zenodo.20668567 · Site: mothrag.com

Every component — reader, embedder, retrieval judges — sits behind a commodity pay-per-call API. No local GPU, no constrained decoding, no non-commercially-licensed model. Deployment is a package install plus API keys.

Results (paper, n=1000 per dataset, Llama-3.3-70B reader, single uniform configuration)

System	Deployment profile	HotpotQA	2WikiMultiHopQA	MuSiQue	AVG
HippoRAG 2 (as published)	offline OpenIE graph + NV-Embed-v2 peak	75.5	71.0	48.6	65.0
CoRAG (training-based, as reproduced)	trained chain-retrieval	75.1	75.1	52.9	67.7
NeocorRAG (as published)	GPU-bound constrained decoding + NV-Embed-v2	78.3	76.1	52.6	69.0
MOTHRAG (ours)	commodity APIs only	78.1	76.3	50.5	68.3

All scores are F1. Competitor numbers are taken from the cited sources, as published or as reproduced (marked per row in the table), with the same reader class. MOTHRAG attains the highest average F1 among commercially deployable frameworks and lands within 0.7 points of the GPU-bound research state of the art (NeocorRAG): parity on HotpotQA (−0.2), a small edge on 2WikiMultiHopQA (+0.2), and a gap on MuSiQue (−2.1).
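The averages and gaps quoted above can be rechecked from the table with a few lines of Python. This is only an arithmetic sanity check on the published numbers, not part of the mothrag package; the variable and function names are illustrative.

```python
# F1 scores copied from the results table: (HotpotQA, 2WikiMultiHopQA, MuSiQue).
scores = {
    "HippoRAG 2": (75.5, 71.0, 48.6),
    "CoRAG": (75.1, 75.1, 52.9),
    "NeocorRAG": (78.3, 76.1, 52.6),
    "MOTHRAG": (78.1, 76.3, 50.5),
}


def average(values):
    """Mean of the three dataset scores, rounded to one decimal like the AVG column."""
    return round(sum(values) / len(values), 1)


averages = {name: average(vals) for name, vals in scores.items()}

# Per-dataset gap between MOTHRAG and NeocorRAG (positive means MOTHRAG is ahead).
gaps = tuple(
    round(ours - theirs, 1)
    for ours, theirs in zip(scores["MOTHRAG"], scores["NeocorRAG"])
)
avg_gap = round(averages["MOTHRAG"] - averages["NeocorRAG"], 1)

print(averages)
print(gaps, avg_gap)
```

The recomputed values agree with the AVG column and with the −0.2 / +0.2 / −2.1 gaps stated in the text.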

Measured cost: $0.032/query (reader + retrieval-judge, measured over 3,000 queries). A documented economy tier (one-flag retrieval-judge swap) runs at ≈$0.018/query (−44%) at statistical parity on HotpotQA and 2WikiMultiHopQA, with a measured trade-off only on MuSiQue (−2.12). All SOTA-parity claims attach to the full configuration.
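As a rough budgeting aid, the per-query figures above can be turned into a projected bill. The sketch below assumes cost scales linearly with query count, which the source does not state; the function names are illustrative and not part of the package.

```python
FULL_COST_USD = 0.032     # measured cost per query, full configuration
ECONOMY_COST_USD = 0.018  # approximate cost per query, economy tier


def relative_saving(full, economy):
    """Fraction of the full-configuration cost saved by the economy tier."""
    return (full - economy) / full


def projected_cost(cost_per_query, n_queries):
    """Linear projection; assumes per-query cost is independent of volume."""
    return cost_per_query * n_queries


saving = relative_saving(FULL_COST_USD, ECONOMY_COST_USD)
print(f"economy tier saving: about {round(saving * 100)}%")
for n in (1_000, 100_000):
    full = projected_cost(FULL_COST_USD, n)
    economy = projected_cost(ECONOMY_COST_USD, n)
    print(f"{n:>7} queries: full ${full:,.2f}, economy ${economy:,.2f}")
```

The saving works out to 0.014 / 0.032, which rounds to the −44% quoted above.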

Answers are proof-tree-structured: each output carries inspectable reasoning steps over the assembled evidence, with a γ-cap fallback when the grounding check cannot be satisfied within budget.

Install

pip install mothrag

# Recommended baseline (Gemini embeddings + Groq Llama-3.3-70B reader):
pip install 'mothrag[gemini,openai]'

# Full production stack:
pip install 'mothrag[prod]'

Extra	Pulls	Used by
`gemini`	`google-genai`	`GeminiEmbedder`, `GeminiReader`
`openai`	`openai`	`OpenAIReader`, `GroqReader` (Groq's OpenAI-compatible API)
`sentence-transformers`	`sentence-transformers`	local embedding fallback
`retrieval`	`scikit-learn`, `networkx`, `rank-bm25`	classic-RAG features
`faiss`	`faiss-cpu`	vector store for 100k–10M chunk corpora
`prod`	bundles the above + loaders	full stack

Quickstart

from mothrag import MothRAG

# Works out of the box (degrades gracefully without keys);
# set GROQ_API_KEY + GEMINI_API_KEY for production quality.
m = MothRAG.from_documents([
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
])
result = m.query("In which country is the Eiffel Tower?")
print(result.answer)         # the answer
print(result.arm_used)       # which reasoning arm won arbitration
print(result.confidence)     # arbitration confidence

API keys via environment (see .env.example): GROQ_API_KEY (reader), GEMINI_API_KEY (embedder + grounding judge), ANTHROPIC_API_KEY (premium retrieval judge; optional — the economy tier uses Gemini).
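Because the quickstart degrades gracefully without keys, it can be useful to check the environment explicitly before relying on the output. The helper below is a hypothetical sketch using only the standard library: the variable names come from this README, but `missing_keys` and `describe_setup` are not part of the mothrag API, and the library's own key handling may differ.

```python
import os

# Environment variables named in this README and the role each one plays.
KEY_ROLES = {
    "GROQ_API_KEY": "reader",
    "GEMINI_API_KEY": "embedder + grounding judge",
    "ANTHROPIC_API_KEY": "premium retrieval judge (optional)",
}
REQUIRED = ("GROQ_API_KEY", "GEMINI_API_KEY")


def missing_keys(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


def describe_setup(env=None):
    """Summarise which keys are present (hypothetical helper, not mothrag API)."""
    env = os.environ if env is None else env
    missing = missing_keys(env)
    if missing:
        return "degraded mode likely: missing " + ", ".join(missing)
    if env.get("ANTHROPIC_API_KEY"):
        return "reader and embedder keys set; premium retrieval judge key present"
    return "reader and embedder keys set; only the Gemini retrieval judge is available"


if __name__ == "__main__":
    print(describe_setup())
```

Running this before `MothRAG.from_documents(...)` makes a silent fallback easier to spot.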

How it works

Three separable stages; every model invocation is a commodity API call:

Bridge retrieval substrate — multi-query ANN fusion re-ranked by a tripartite LLM judge conditioned on retrieved bridge evidence, reshaping every retrieval (primary, sub-question, iterative). A post-retrieval ChainFilter re-scores the ranking by chain density over OpenIE triples, gated by input features only.
Four-arm ensemble pool — direct read, decomposition, iterative refinement (γ-driven re-retrieval), and Pool-Duplicate Dispatch (PDD): a deterministic copy of the iterative arm's candidate that double-weights the grounding-checked voice in arbitration at zero extra inference. Pool cardinality is fixed at N=4 (five-arm pools regressed consistently).
Deterministic arbitration — fixed weights over grounding status (γ, 1.0), cross-arm agreement (0.5), and faithfulness (0.3). No learned components anywhere; all gates condition on input features of the question, never on dataset identity.
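The arbitration stage can be pictured with a small sketch. It uses the three fixed weights listed above, but everything else is an assumption made for illustration: the `Candidate` fields, the exact-match agreement measure, the linear combination of the signals, and the tie-breaking rule are not taken from the package or the paper.

```python
from dataclasses import dataclass

# Fixed weights from the description above; how they are combined is assumed.
W_GROUNDING, W_AGREEMENT, W_FAITHFULNESS = 1.0, 0.5, 0.3


@dataclass(frozen=True)
class Candidate:
    arm: str             # which arm produced the answer
    answer: str
    grounded: bool       # grounding status (the γ signal)
    faithfulness: float  # assumed to lie in [0, 1]


def normalize(answer):
    """Case- and whitespace-insensitive form used for agreement (an assumption)."""
    return " ".join(answer.lower().split())


def agreement(candidate, pool):
    """Fraction of the other pool members that give the same normalized answer."""
    others = [c for c in pool if c is not candidate]
    if not others:
        return 0.0
    same = sum(normalize(c.answer) == normalize(candidate.answer) for c in others)
    return same / len(others)


def score(candidate, pool):
    return (
        W_GROUNDING * float(candidate.grounded)
        + W_AGREEMENT * agreement(candidate, pool)
        + W_FAITHFULNESS * candidate.faithfulness
    )


def arbitrate(direct, decomposition, iterative):
    """Build the four-member pool (with the PDD copy) and pick the top score."""
    # Pool-Duplicate Dispatch: a deterministic copy of the iterative candidate,
    # so no extra model call is needed.
    pdd = Candidate("pdd", iterative.answer, iterative.grounded, iterative.faithfulness)
    pool = [direct, decomposition, iterative, pdd]
    # max() returns the first maximal element, so ties resolve in pool order.
    return max(pool, key=lambda c: score(c, pool))


if __name__ == "__main__":
    winner = arbitrate(
        Candidate("direct", "France", grounded=False, faithfulness=0.6),
        Candidate("decomposition", "Italy", grounded=False, faithfulness=0.5),
        Candidate("iterative", "France", grounded=True, faithfulness=0.9),
    )
    print(winner.arm, winner.answer)  # iterative France
```

In this sketch the duplicate never wins on its own, since it always ties with the iterative candidate and comes later in the pool. Its effect is on the agreement term: any arm that agrees with the iterative answer gains two matching voices instead of one.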

Reproducing the paper

The paper numbers come from the evaluation configuration (scripts/route_prospective.py with the full flag set), not from the high-level quickstart API. See paper/REPRODUCE.md for the verbatim CLI, required inputs, and expected outputs. The per-query outputs behind every table in the paper are released in paper/results/ (six JSONs: three datasets × premium/economy tiers, n=1000 each).

Citing

See CITATION.cff. Until the arXiv ID is live, cite the Zenodo preprint:

@misc{geymonat2026mothrag,
  title  = {MOTHRAG: Training-Free Multi-Hop Question Answering at Research-SOTA Parity on Commodity LLM APIs},
  author = {Geymonat, Julian},
  year   = {2026},
  doi    = {10.5281/zenodo.20668567},
  url    = {https://doi.org/10.5281/zenodo.20668567}
}

License

Apache-2.0

This version

0.5.0

Jun 17, 2026

Download files

Source Distribution

mothrag-0.5.0.tar.gz (463.9 kB view details)

Uploaded Jun 17, 2026 Source

Built Distribution

mothrag-0.5.0-py3-none-any.whl (365.5 kB view details)

Uploaded Jun 17, 2026 Python 3

File details

Details for the file mothrag-0.5.0.tar.gz.

File metadata

Download URL: mothrag-0.5.0.tar.gz
Upload date: Jun 17, 2026
Size: 463.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for mothrag-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`baa84ea8b2248eb0fb69f95a59a75561f15b5b793d41bfa3af122cb3a9db02cc`
MD5	`3f9229e52c1a35acacbd1ec1b9fd2e28`
BLAKE2b-256	`b61cca3c1129c5b57b7963ec080a199a8daa1bce04b83a0d5fc634c5ac51b6ea`
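A downloaded archive can be checked against the SHA256 digest above with the standard library. This is a minimal sketch: the local file path is an assumption about where the archive was saved, and `sha256_of` and `verify` are illustrative names.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for mothrag-0.5.0.tar.gz.
EXPECTED_SDIST_SHA256 = "baa84ea8b2248eb0fb69f95a59a75561f15b5b793d41bfa3af122cb3a9db02cc"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.lower()


if __name__ == "__main__":
    archive = Path("mothrag-0.5.0.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if verify(archive, EXPECTED_SDIST_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can enforce the same check at install time through its hash-checking mode, using a requirements file entry with `--hash=sha256:<digest>` together with `--require-hashes`.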

File details

Details for the file mothrag-0.5.0-py3-none-any.whl.

File metadata

Download URL: mothrag-0.5.0-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 365.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.1

File hashes

Hashes for mothrag-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b4e5bbcc9fd8e023ea9051291e9b910e9538381913a7f2d1931f67ed6a9b3e68`
MD5	`5e064d77fb896cfcd724496dedc466c8`
BLAKE2b-256	`c9dbb8e67743e08e43a12a704910bbf6421a18a191e39a01fcfff8d087ff2814`
