Advanced Retrieval Framework — dependency-free retrieval pipeline toolkit

ARF — Advanced Retrieval Framework

Requires Python 3.10+. License: MIT.

A zero-dependency retrieval pipeline toolkit. Plug in your own vector search, embedding model, LLM, ML model, and database — ARF provides the routing algorithms, feature engineering, rephrase-graph caching, and score blending.

pip install advanced-rag-framework

What ARF Does

Most RAG pipelines send every candidate to an expensive LLM for reranking. ARF avoids most of those calls with a multi-stage filtering pipeline called R-Flow:

Query
  → Cache graph walk (free — returns instantly if seen before)
  → Vector search (your provider)
  → Threshold + gap filter (free — drops obvious junk)
  → MLP triage (free, <5ms — accept/reject/uncertain)
  → LLM verification ($$$ — only for the ~20% uncertain candidates)
  → Answer with summaries

Each stage filters candidates so the next stage does less work. Only the uncertain ~20% ever reach the LLM.

Quick Start

from arf import Pipeline, DocumentConfig, Triage

pipeline = Pipeline(
    doc_config=DocumentConfig(title_field="title", text_fields=["text"]),
    triage=Triage(min_score=0.65, accept_threshold=0.85, verify_threshold=0.70),
    search_fn=my_search,       # (embedding, top_k) → [(dict, float)]
    embed_fn=my_embed,         # (text) → [float]
)

results = pipeline.run("how does caching work?")

That's it. Two required functions. Everything else is optional.
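
The two callables only have to match the shapes noted in the comments above. As a minimal sketch, assuming a toy hashed bag-of-words embedding and a brute-force in-memory index (the names `DOCS`, `DIM`, and the bodies of `my_embed` and `my_search` are illustrative and not part of ARF), they could look like this:

```python
import math
import zlib

DIM = 64  # toy embedding size (assumption for this sketch)

# Hypothetical in-memory corpus; real code would query a vector database.
DOCS = [
    {"title": "Caching", "text": "how the cache stores query results"},
    {"title": "Routing", "text": "how candidates are routed to the llm"},
]


def my_embed(text: str) -> list[float]:
    """(text) -> [float]: hashed bag-of-words vector, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def my_search(embedding: list[float], top_k: int) -> list[tuple[dict, float]]:
    """(embedding, top_k) -> [(dict, float)]: brute-force cosine search."""
    scored = []
    for doc in DOCS:
        doc_vec = my_embed(doc["title"] + " " + doc["text"])
        score = sum(a * b for a, b in zip(embedding, doc_vec))
        scored.append((doc, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


hits = my_search(my_embed("cache query results"), top_k=1)
```

In practice `my_embed` would call an embedding model and `my_search` a vector store; only the input and output shapes matter to the pipeline.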

Full Pipeline

from arf import Pipeline, DocumentConfig, Triage
from arf.trainer import load_reranker

pipeline = Pipeline(
    doc_config=DocumentConfig(
        title_field="title",
        text_fields=["text", "summary"],
        children_fields=["sections", "clauses"],
        hierarchy=["title", "chapter", "section"],
    ),
    triage=Triage(
        min_score=0.65,
        accept_threshold=0.85,
        verify_threshold=0.70,
        gap=0.20,
    ),

    # Required
    search_fn=my_search,           # any vector DB
    embed_fn=my_embed,             # any embedding model

    # Scoring (optional)
    predict_fn=load_reranker("model.joblib"),  # trained MLP
    llm_fn=my_llm_verify,         # any LLM

    # Cache (optional)
    cache_lookup=my_cache_get,     # any cache backend
    cache_store=my_cache_set,

    # Preprocessing (optional)
    preprocess_fn=my_clean,        # translate, normalize, etc.
    moderate_fn=my_moderate,       # content safety
    rephrase_fn=my_rephrase,       # retry with rephrased query

    # Hierarchy (optional)
    resolve_fn=my_get_parent,      # walk up document tree
    summarize_fn=my_summarize,     # generate answer
)

results = pipeline.run("what is due process?", top_k=5)
# [{"document": Document, "score": 0.94, "context": [...], "summary": "..."}, ...]

Components

ARF is made up of seven independent modules. Use them together or individually.

Document — DB-agnostic data model

from arf import Document, DocumentConfig

config = DocumentConfig(
    title_field="name",
    text_fields=["body", "content"],
    children_fields=["subsections"],
    hierarchy=["category", "name"],
)

doc = Document.from_dict({"name": "Guide", "body": "...", "category": "Medical"}, config)
# doc.depth = 2, doc.path = "Medical / Guide"

Works with any database. MongoDB, PostgreSQL, DynamoDB, Pinecone, FAISS — just map your fields.
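
The `depth` and `path` values in the comment above can be derived from the `hierarchy` list alone. The following is a rough sketch, assuming depth counts the hierarchy fields present in the record and the path joins their values with " / "; `hierarchy_metadata` is a hypothetical helper, not ARF's implementation:

```python
def hierarchy_metadata(record: dict, hierarchy: list[str]) -> tuple[int, str]:
    """Return (depth, path) from the hierarchy fields present in a record."""
    levels = [str(record[field]) for field in hierarchy if record.get(field)]
    return len(levels), " / ".join(levels)


record = {"name": "Guide", "body": "...", "category": "Medical"}
depth, path = hierarchy_metadata(record, hierarchy=["category", "name"])
print(depth, path)  # 2 Medical / Guide
```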

Features — 15-feature extraction

from arf import FeatureExtractor

extractor = FeatureExtractor(config)
features = extractor.extract_features(query="...", document={...}, semantic_score=0.85)
vector = extractor.to_vector(features)  # [0.85, 4.2, 0, 0, ...]

| Feature | Description |
|---|---|
| semantic_score | Raw cosine similarity from vector search |
| bm25_score | Term-frequency relevance approximation |
| alias_match | Whether query matches a document alias |
| keyword_match | Whether query matches via keyword pattern |
| domain_type | Encoded domain identifier |
| document_length | Log-scaled character count |
| query_length | Query character count |
| section_depth | Depth in document hierarchy |
| embedding_cosine_similarity | Direct embedding cosine similarity |
| match_type | 0=none, 1=partial, 2=exact |
| score_gap_from_top | Gap from highest-scored document |
| query_term_coverage | Fraction of query terms in document |
| title_similarity | Jaccard similarity between query and title |
| has_nested_content | Whether document has children |
| bias_adjustment | Configurable per-document bias |
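
To make a few of these descriptions concrete, here is a small sketch of three of the lexical features. It assumes a simple lowercase word tokenizer and `log1p` for the log scaling; ARF's own tokenization and scaling may differ, and the function names below are illustrative only:

```python
import math
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately simple tokenizer for the sketch."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def query_term_coverage(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the document text."""
    q = tokenize(query)
    return len(q & tokenize(text)) / len(q) if q else 0.0


def title_similarity(query: str, title: str) -> float:
    """Jaccard similarity: |intersection| / |union| of the two token sets."""
    q, t = tokenize(query), tokenize(title)
    return len(q & t) / len(q | t) if q | t else 0.0


def document_length(text: str) -> float:
    """Log-scaled character count; log1p keeps an empty document at 0.0."""
    return math.log1p(len(text))


print(query_term_coverage("due process clause", "The due process clause applies."))  # 1.0
print(title_similarity("due process", "Due Process Clause"))  # 2 of 3 tokens shared
```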

Triage — threshold + gap + zone routing

from arf import Triage

triage = Triage(min_score=0.65, accept_threshold=0.85, verify_threshold=0.70, gap=0.20)
result = triage.classify(candidates)
# result.accepted, result.needs_review, result.rejected
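
How the four parameters interact is easiest to see in code. The sketch below is one plausible reading, not ARF's actual logic: it assumes candidates are `(doc, score)` pairs, that anything below `min_score` or more than `gap` below the top score is rejected, that scores at or above `accept_threshold` are accepted, and that scores between `verify_threshold` and `accept_threshold` go to review.

```python
from dataclasses import dataclass, field


@dataclass
class TriageResult:
    accepted: list = field(default_factory=list)
    needs_review: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def classify(candidates, min_score=0.65, accept_threshold=0.85,
             verify_threshold=0.70, gap=0.20):
    """Route (doc, score) pairs into accept / review / reject zones."""
    result = TriageResult()
    if not candidates:
        return result
    top = max(score for _, score in candidates)
    for doc, score in candidates:
        if score < min_score or top - score > gap:
            result.rejected.append((doc, score))      # threshold or gap filter
        elif score >= accept_threshold:
            result.accepted.append((doc, score))      # confident: skip the LLM
        elif score >= verify_threshold:
            result.needs_review.append((doc, score))  # uncertain: verify
        else:
            result.rejected.append((doc, score))
    return result


out = classify([("a", 0.91), ("b", 0.78), ("c", 0.66), ("d", 0.40)])
print(len(out.accepted), len(out.needs_review), len(out.rejected))  # 1 1 2
```

Here "c" clears `min_score` but sits 0.25 below the top score, so the gap filter drops it.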

QueryGraph — rephrase chain walk

from arf import follow_rephrase_chain

result = follow_rephrase_chain("due process clause", lookup_fn=my_db_lookup, max_hops=3)
# result.hit, result.cached_results, result.path, result.loop_detected

Walks a directed graph of query→rephrase edges with loop detection and early exit on cache hit. Storage-agnostic — you provide the lookup_fn.
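
The walk described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: `lookup_fn` is assumed to return either `None` or a dict with optional `"results"` and `"rephrase"` keys, and `walk_rephrase_chain` and `ChainResult` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ChainResult:
    hit: bool = False
    cached_results: list = field(default_factory=list)
    path: list = field(default_factory=list)
    loop_detected: bool = False


def walk_rephrase_chain(query, lookup_fn, max_hops=3):
    """Follow query -> rephrase edges until a cache hit, a loop, or max_hops."""
    result = ChainResult()
    seen = set()
    current = query
    for _ in range(max_hops + 1):  # the start node plus up to max_hops edges
        if current in seen:
            result.loop_detected = True
            break
        seen.add(current)
        result.path.append(current)
        node = lookup_fn(current)  # assumed contract: dict or None
        if node is None:
            break
        if node.get("results"):
            result.hit = True
            result.cached_results = node["results"]
            break  # early exit on cache hit
        current = node.get("rephrase")
        if current is None:
            break
    return result


graph = {
    "due process clause": {"rephrase": "14th amendment due process"},
    "14th amendment due process": {"results": ["doc-42"]},
}
res = walk_rephrase_chain("due process clause", graph.get)
print(res.hit, res.cached_results, res.path)
```

Because the graph is reached only through `lookup_fn`, the same walk works over a dict, Redis, or a database table.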

ScoreParser — LLM output parsing + multiplier blending

from arf import extract_score, multiplier, adjust_score

extract_score('{"score": 7}')           # → 7
extract_score("Score: 8")               # → 8
multiplier(8)                           # → 1.39
adjust_score(0.72, "Score: 8")          # → min(0.72 * 1.39, 1.0)

Parses messy LLM output (JSON, bare numbers, "Score: N" lines) into a 0-9 score, converts to a multiplier, and blends with the retrieval score.
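
A rough sketch of that parse-then-blend flow follows. The mapping from score to multiplier is not spelled out here, so the sketch takes the multiplier as an input rather than guessing a formula; `parse_score` and `blend` are hypothetical names, and the regular expressions are one possible way to handle the three formats listed above.

```python
import json
import re


def parse_score(raw: str):
    """Return an int 0-9 from JSON, a 'Score: N' line, or a bare digit; else None."""
    try:
        value = json.loads(raw)
        if isinstance(value, dict):
            value = value.get("score")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            score = int(value)
            return score if 0 <= score <= 9 else None
    except (json.JSONDecodeError, TypeError):
        pass  # not JSON: fall back to pattern matching
    match = (re.search(r"score\s*[:=]\s*(\d)\b", raw, re.IGNORECASE)
             or re.search(r"\b(\d)\b", raw))
    return int(match.group(1)) if match else None


def blend(retrieval_score: float, mult: float) -> float:
    """Scale the retrieval score by the multiplier and cap the result at 1.0."""
    return min(retrieval_score * mult, 1.0)


print(parse_score('{"score": 7}'))  # 7
print(parse_score("Score: 8"))      # 8
print(blend(0.72, 1.39))            # 1.0 (0.72 * 1.39 exceeds the cap)
```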

Trainer — MLP training

from arf.trainer import train_reranker, load_reranker

# Train
metrics = train_reranker(X, y, architecture=(64, 32, 16), save_path="model.joblib")

# Load as a predict_fn for Pipeline
predict_fn = load_reranker("model.joblib")

Requires pip install "advanced-rag-framework[ml]" (numpy + scikit-learn). The quotes keep shells such as zsh from treating the brackets as a glob pattern.

Ingest — document ingestion helper

from arf import ingest_documents, DocumentConfig

result = ingest_documents(
    documents,
    config=DocumentConfig(title_field="title", text_fields=["text"]),
    embed_fn=my_embed,     # your embedding function
    store_fn=my_store,     # your DB write function
)
# result.processed, result.skipped, result.errors

Validates documents, computes hierarchy metadata (depth, path), generates embeddings for parent and children, and stores via your function.
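
The validate, embed, store loop can be sketched as follows. This simplified version is an assumption-laden illustration: it skips records without a title or text, embeds only the parent text (not children), omits the hierarchy metadata, and uses the hypothetical names `ingest` and `IngestResult`.

```python
from dataclasses import dataclass, field


@dataclass
class IngestResult:
    processed: int = 0
    skipped: int = 0
    errors: list = field(default_factory=list)


def ingest(documents, title_field, text_fields, embed_fn, store_fn):
    """Validate, embed, and store each document, counting the outcomes."""
    result = IngestResult()
    for doc in documents:
        text = " ".join(str(doc[f]) for f in text_fields if doc.get(f))
        if not doc.get(title_field) or not text:
            result.skipped += 1          # fails validation: nothing to embed
            continue
        try:
            store_fn({**doc, "embedding": embed_fn(text)})
            result.processed += 1
        except Exception as exc:         # keep going; report at the end
            result.errors.append((doc.get(title_field), str(exc)))
    return result


stored = []
res = ingest(
    [{"title": "A", "text": "hello"}, {"title": "B"}],
    title_field="title",
    text_fields=["text"],
    embed_fn=lambda text: [float(len(text))],  # stand-in embedding
    store_fn=stored.append,                    # stand-in DB write
)
print(res.processed, res.skipped, res.errors)  # 1 1 []
```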

Bring Your Own Everything

| Slot | What you provide | Examples |
|---|---|---|
| search_fn | Vector search | FAISS, Pinecone, Weaviate, Qdrant, MongoDB Atlas, pgvector |
| embed_fn | Embeddings | OpenAI, Voyage AI, Cohere, sentence-transformers, Ollama |
| predict_fn | ML model | scikit-learn, XGBoost, PyTorch, any callable |
| llm_fn | LLM verification | OpenAI, Anthropic, Ollama, Llama.cpp, any API |
| cache_lookup/store | Cache | Redis, MongoDB, SQLite, in-memory dict |
| resolve_fn | Parent lookup | Any database query |
| summarize_fn | Answer generation | Any LLM |
| store_fn (ingest) | Document storage | Any database write |

Installation

# Core (zero dependencies)
pip install advanced-rag-framework

# With MLP training support (numpy + scikit-learn)
pip install "advanced-rag-framework[ml]"

Sample Project

See sample-project/ for a complete working example using:

  • FAISS for vector search
  • Voyage AI for embeddings
  • OpenAI for LLM verification
  • A cooking recipe dataset (non-legal, 46 recipes from 15 cuisines)

Run the scripts in order:

python sample-project/ingest.py                          # Embed recipes into FAISS
python sample-project/train.py                           # Train MLP reranker
python sample-project/query.py "spicy noodle soup"       # Full pipeline query

R-Flow Pipeline

The core innovation — each stage filters candidates so the next stage does less work:

                    ┌──────────────────────┐
                    │   Vector Search      │
                    │  (your provider)     │
                    └──────────┬───────────┘
                               │ candidates with scores
                    ┌──────────▼───────────┐
                    │  Threshold + Gap     │
                    │  Filter (~60% cut)   │
                    └──────────┬───────────┘
                               │ survivors
                    ┌──────────▼───────────┐
                    │  Feature Extraction  │
                    │  (15 features)       │
                    └──────────┬───────────┘
                               │ feature vectors
                    ┌──────────▼───────────┐
                    │   MLP Reranker       │
                    │  (<5ms, $0.00)       │
                    └──────────┬───────────┘
                        ┌──────┼──────┐
                   p≥0.6│  0.4<p<0.6  │p≤0.4
                        │      │      │
                   Accept   ┌──▼──┐  Reject
                   (free)   │ LLM │  (free)
                            │(20%)│
                            └──┬──┘
                          Accept/Reject

Development

git clone https://github.com/jager47X/ARF.git
cd ARF
pip install -e ".[dev]"

# Run library tests
pytest tests/test_arf/ -v

# Lint
ruff check arf/ tests/test_arf/

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with tests
  4. Submit a pull request

License

MIT License — see LICENSE for details.
