NexRAG

███╗   ██╗███████╗██╗  ██╗██████╗  █████╗  ██████╗
████╗  ██║██╔════╝╚██╗██╔╝██╔══██╗██╔══██╗██╔════╝
██╔██╗ ██║█████╗   ╚███╔╝ ██████╔╝███████║██║  ███╗
██║╚██╗██║██╔══╝   ██╔██╗ ██╔══██╗██╔══██║██║   ██║
██║ ╚████║███████╗██╔╝ ██╗██║  ██║██║  ██║╚██████╔╝
╚═╝  ╚═══╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝ ╚═════╝

●plug ⇄swap ▶scale

Framework-agnostic RAG pipeline SDK. Plug in any component, swap any stage, configure everything in YAML.

Python 3.12+


What is NexRAG?

NexRAG is a production-grade RAG (Retrieval-Augmented Generation) pipeline SDK for Python.

NexRAG owns the pipeline shape. You own the components.

Every stage — loading, chunking, embedding, retrieval, generation — is a clean interface. NexRAG ships default implementations for each. You can swap any of them by implementing the interface and declaring it in YAML. No framework lock-in. No magic. No hidden behavior.
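As a rough illustration of what "implementing the interface" can look like, here is a minimal sketch using a structural `Protocol`. The names `Chunk`, `Chunker`, `ParagraphChunker`, and `run_chunking` are hypothetical and chosen for this example only; NexRAG's real base classes and method signatures may differ, so check the docs before copying this.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Chunk:
    """Hypothetical chunk type: text plus free-form metadata."""
    text: str
    metadata: dict = field(default_factory=dict)


class Chunker(Protocol):
    """Hypothetical stage contract: any object with this method qualifies."""

    def chunk(self, text: str) -> list[Chunk]: ...


class ParagraphChunker:
    """Custom implementation: one chunk per blank-line-separated paragraph."""

    def chunk(self, text: str) -> list[Chunk]:
        paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
        return [Chunk(p, {"index": i}) for i, p in enumerate(paragraphs)]


def run_chunking(chunker: Chunker, text: str) -> list[Chunk]:
    # The caller depends only on the contract, never on a concrete class.
    return chunker.chunk(text)


chunks = run_chunking(ParagraphChunker(), "First paragraph.\n\nSecond paragraph.")
print([c.text for c in chunks])
```

Because the contract is structural, `ParagraphChunker` does not need to inherit from anything; it only has to provide a matching `chunk` method.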


Quickstart

pip install "nexrag[openai,pdf]"
export OPENAI_API_KEY=sk-...
cp nexrag.example.yaml nexrag.yaml   # edit to taste

from nexrag import NexRAG, RunMetrics

pipeline = NexRAG.from_config("nexrag.yaml")

# Ingest a PDF
result = pipeline.ingest("contracts/agreement.pdf")
print(f"Ingested {result.documents_loaded} doc, {result.chunks_produced} chunks")

# Query (blocking)
result = pipeline.query("What are the termination clauses?")
print(result.answer)
for source in result.sources:
    print(f"  [{source.rank}] score={source.score:.3f}  {source.chunk.metadata.get('source')}")

# Streaming — tokens arrive live; RunMetrics is the final item
metrics = None
for item in pipeline.stream_query("Summarise the key obligations."):
    if isinstance(item, RunMetrics):
        metrics = item
    else:
        print(item, end="", flush=True)
print(f"\n\n{metrics.total_latency_ms:.0f}ms — {metrics.chunks_retrieved} chunks retrieved")

# nexrag.yaml (minimal)
ingestion:
  loader:
    type: pdf
  embedder:
    provider: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
  vector_db:
    provider: chroma
    default_collection: documents
    collections:
      documents:
        mode: persistent
        path: ./.nexrag/chroma

query:
  embedder: inherit
  llm:
    provider: openai
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
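
The config above uses two conventions: `${OPENAI_API_KEY}` placeholders and `embedder: inherit`. The sketch below shows one plausible way such values could be resolved after the YAML is parsed into a dict. It is an assumption about the semantics (placeholders come from the environment, `inherit` reuses the ingestion embedder), not NexRAG's actual loader; `expand_env` and `resolve_inherit` are hypothetical helpers.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=os.environ):
    """Recursively replace ${NAME} in strings; fail loudly if NAME is unset."""
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(substitute, value)
    return value


def resolve_inherit(config):
    """Copy the ingestion embedder into the query section when it says 'inherit'."""
    resolved = {section: dict(body) for section, body in config.items()}
    if resolved["query"].get("embedder") == "inherit":
        resolved["query"]["embedder"] = dict(resolved["ingestion"]["embedder"])
    return resolved


# Already-parsed stand-in for the YAML above (trimmed for brevity).
raw = {
    "ingestion": {"embedder": {"provider": "openai", "api_key": "${OPENAI_API_KEY}"}},
    "query": {"embedder": "inherit", "llm": {"provider": "openai", "model": "gpt-4o"}},
}
config = resolve_inherit(expand_env(raw, env={"OPENAI_API_KEY": "sk-test"}))
print(config["query"]["embedder"])
```

Raising on a missing variable, rather than substituting an empty string, keeps a forgotten `export` from turning into a confusing authentication error later.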

See docs/ for the full documentation site.


Installation

# Core only
pip install nexrag

# Provider extras — install only what you use
pip install "nexrag[openai]"         # OpenAI embedder + LLM
pip install "nexrag[anthropic]"      # Anthropic (Claude) LLM
pip install "nexrag[ollama]"         # Ollama local LLM + embedder
pip install "nexrag[huggingface]"    # HuggingFace embedder

# Document loaders
pip install "nexrag[pdf]"            # PDFLoader (pypdf)
pip install "nexrag[word]"           # Word documents (python-docx)
pip install "nexrag[html]"           # HTML pages (beautifulsoup4)

# Retrieval extras
pip install "nexrag[bm25]"           # BM25Retriever keyword search (rank-bm25)
pip install "nexrag[cohere]"         # CohereReranker
pip install "nexrag[cross-encoder]"  # CrossEncoderReranker (sentence-transformers)

# Convenience bundles
pip install "nexrag[all-sparse]"     # all sparse retrievers (bm25)
pip install "nexrag[all-rerankers]"  # all rerankers (cohere + cross-encoder)
pip install "nexrag[all-retrieval]"  # all-sparse + all-rerankers
pip install "nexrag[all-loaders]"    # all document loaders
pip install "nexrag[all-providers]"  # all LLM + embedder providers

# Full bundle — dev/CI; pulls PyTorch via sentence-transformers
pip install "nexrag[all]"

Design Principles

| Principle | What it means |
| --- | --- |
| Interface-first | Every stage is a contract. Implementation is secondary. |
| Config-driven | YAML configures the pipeline. Code defines the logic. |
| Zero lock-in | Core has no dependency on LangChain, LlamaIndex, or any AI SDK. |
| Explicit over implicit | No hidden defaults. Every behavior is declared or documented. |
| Extensible by design | New components plug in without touching core. |
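
One common way to get "config-driven" and "extensible by design" together is a small registry that maps a name from the config to a class. The sketch below is an assumption about how such a mechanism could work, not a description of NexRAG internals; `register`, `build`, and `TextLoader` are hypothetical names.

```python
from typing import Callable

# Maps (stage, name) to a factory; filled in by the decorator below.
_REGISTRY: dict[tuple[str, str], Callable[..., object]] = {}


def register(stage: str, name: str):
    """Class decorator that files a component under (stage, name)."""
    def decorator(cls):
        _REGISTRY[(stage, name)] = cls
        return cls
    return decorator


def build(stage: str, spec: dict):
    """Instantiate the component named by spec['type'], passing the rest as kwargs."""
    options = dict(spec)
    name = options.pop("type")
    try:
        factory = _REGISTRY[(stage, name)]
    except KeyError:
        raise ValueError(f"no {stage} registered under {name!r}") from None
    return factory(**options)


@register("loader", "text")
class TextLoader:
    def __init__(self, encoding: str = "utf-8"):
        self.encoding = encoding


# The dict stands in for a parsed YAML fragment such as `loader: {type: text, ...}`.
loader = build("loader", {"type": "text", "encoding": "latin-1"})
print(type(loader).__name__, loader.encoding)
```

With this shape, adding a component means writing a class and decorating it; the lookup code never changes.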

Architecture

NexRAG has two independent pipelines:

INGESTION  →  Loader → Sanitizer → Chunker → Embedder → VectorDB
QUERY      →  Embedder → Retriever → PromptBuilder → LLM → PipelineResult
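
To make the two flows concrete, here is a toy, dependency-free sketch that follows the same stage order. Everything in it is a stand-in: the word-count `embed`, the in-memory list used as a vector store, and the stubbed LLM answer are assumptions for illustration and do not reflect NexRAG's implementations.

```python
import math

VOCAB = ["termination", "notice", "payment", "days"]


def embed(text: str) -> list[float]:
    """Toy embedder: counts of a few fixed words (stand-in for a real model)."""
    words = [w.strip(".,?!") for w in text.lower().split()]
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def ingest(raw: str, store: list, chunk_size: int = 5) -> int:
    """Loader output -> sanitize -> chunk -> embed -> store."""
    words = raw.split()                                   # sanitize: collapse whitespace
    chunks = [" ".join(words[i:i + chunk_size])           # chunk: fixed word windows
              for i in range(0, len(words), chunk_size)]
    store.extend((embed(c), c) for c in chunks)           # embed + write to the store
    return len(chunks)


def query(question: str, store: list, top_k: int = 1) -> dict:
    """Embed -> retrieve -> build prompt -> generate -> result."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)[:top_k]
    context = "\n".join(text for _, text in ranked)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    answer = f"(stub LLM) based on: {context}"            # a real LLM call would go here
    return {"answer": answer, "sources": [text for _, text in ranked], "prompt": prompt}


store: list = []
n = ingest("Termination requires thirty   days notice.\nPayment is due in full.", store)
result = query("termination notice", store)
print(n, result["sources"])
```

The only thing the two functions share is the store and the embedder, which is why the query side must embed with the same model that ingestion used.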

See Architecture Documentation for full pipeline diagrams.


Supported Providers

| Category | Providers |
| --- | --- |
| Embedders | OpenAI, Ollama, HuggingFace |
| Vector DBs | ChromaDB (in-memory, persistent, remote server) |
| LLMs | OpenAI, Ollama, Anthropic |
| Loaders | PDF, plain text, Word, HTML, Excel |
| Chunkers | Recursive (separator-aware) |
| Retrievers | Dense (cosine similarity), BM25 (keyword), Hybrid (dense + BM25) |
| Rerankers | Cohere, CrossEncoder (sentence-transformers) |
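
The table does not say how the Hybrid retriever merges dense and BM25 results. One widely used option is reciprocal rank fusion (RRF), sketched below as an illustration only; it is an assumption, not a statement about what NexRAG does. `reciprocal_rank_fusion` and the chunk ids are hypothetical, and `k = 60` is the conventional default for RRF.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first rankings; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_ranking = ["c2", "c1", "c3"]   # e.g. ordered by cosine similarity
bm25_ranking = ["c1", "c4", "c2"]    # e.g. ordered by BM25 keyword score
fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print(fused)
```

RRF works on ranks rather than raw scores, so it avoids having to normalise cosine similarities and BM25 scores onto a common scale.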

Contributing

NexRAG is in early development. Contribution guidelines will be published with v1.0.


Changelog

See CHANGELOG.md.
