NexRAG

███╗   ██╗███████╗██╗  ██╗██████╗  █████╗  ██████╗
████╗  ██║██╔════╝╚██╗██╔╝██╔══██╗██╔══██╗██╔════╝
██╔██╗ ██║█████╗   ╚███╔╝ ██████╔╝███████║██║  ███╗
██║╚██╗██║██╔══╝   ██╔██╗ ██╔══██╗██╔══██║██║   ██║
██║ ╚████║███████╗██╔╝ ██╗██║  ██║██║  ██║╚██████╔╝
╚═╝  ╚═══╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝ ╚═════╝

●plug ⇄swap ▶scale

Framework-agnostic RAG pipeline SDK. Plug in any component, swap any stage, configure everything in YAML.

Requires Python 3.12+.


What is NexRAG?

NexRAG is a production-grade RAG (Retrieval-Augmented Generation) pipeline SDK for Python.

NexRAG owns the pipeline shape. You own the components.

Every stage — loading, chunking, embedding, retrieval, generation — is a clean interface. NexRAG ships default implementations for each. You can swap any of them by implementing the interface and declaring it in YAML. No framework lock-in. No magic. No hidden behavior.
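
As a rough illustration of what "implementing the interface" can look like, here is a minimal sketch built on `typing.Protocol`. The `Chunker` protocol, its `chunk` method, and `ParagraphChunker` are hypothetical names chosen for this example; NexRAG's actual interfaces and method signatures are not shown in this README and may differ.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Chunker(Protocol):
    """Hypothetical stage contract: turn one document's text into chunks."""

    def chunk(self, text: str) -> list[str]: ...


class ParagraphChunker:
    """Custom component: splits on blank lines. No base class is needed."""

    def chunk(self, text: str) -> list[str]:
        return [part.strip() for part in text.split("\n\n") if part.strip()]


def run_chunking_stage(chunker: Chunker, text: str) -> list[str]:
    """The stage depends only on the contract, not on a concrete class."""
    if not isinstance(chunker, Chunker):
        raise TypeError(f"{type(chunker).__name__} does not implement Chunker")
    return chunker.chunk(text)


chunks = run_chunking_stage(ParagraphChunker(), "First clause.\n\nSecond clause.\n")
print(chunks)
```

Because the contract is structural, any object with a matching `chunk` method can be swapped in without the calling code changing.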


Quickstart

pip install "nexrag[openai,chromadb,pdf]"
export OPENAI_API_KEY=sk-...
cp nexrag.example.yaml nexrag.yaml   # edit to taste

from nexrag import NexRAG, RunMetrics

pipeline = NexRAG.from_config("nexrag.yaml")

# Ingest a PDF
result = pipeline.ingest("contracts/agreement.pdf")
print(f"Ingested {result.documents_loaded} doc, {result.chunks_produced} chunks")

# Query (blocking)
result = pipeline.query("What are the termination clauses?")
print(result.answer)
for source in result.sources:
    print(f"  [{source.rank}] score={source.score:.3f}  {source.chunk.metadata.get('source')}")

# Streaming — tokens arrive live; RunMetrics is the final item
metrics = None
for item in pipeline.stream_query("Summarise the key obligations."):
    if isinstance(item, RunMetrics):
        metrics = item
    else:
        print(item, end="", flush=True)
print(f"\n\n{metrics.total_latency_ms:.0f}ms — {metrics.chunks_retrieved} chunks retrieved")

# nexrag.yaml (minimal)
ingestion:
  loader:
    type: pdf
  embedder:
    provider: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
  vector_db:
    provider: chroma
    default_collection: documents
    collections:
      documents:
        mode: persistent
        path: ./.nexrag/chroma

query:
  embedder: inherit
  llm:
    provider: openai
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}

See docs/ for the full documentation site.


Installation

# Core only — pydantic + pyyaml. No provider SDKs; add the extras you use below.
pip install nexrag

# Default getting-started stack (OpenAI embedder/LLM + ChromaDB vector store)
pip install "nexrag[openai,chromadb]"

# Provider extras — install only what you use
pip install "nexrag[openai]"         # OpenAI embedder + LLM
pip install "nexrag[chromadb]"       # ChromaDB vector store
pip install "nexrag[anthropic]"      # Anthropic (Claude) LLM
pip install "nexrag[ollama]"         # Ollama local LLM + embedder
pip install "nexrag[huggingface]"    # HuggingFace embedder

# Document loaders
pip install "nexrag[pdf]"            # PDFLoader (pypdf)
pip install "nexrag[word]"           # Word documents (python-docx)
pip install "nexrag[html]"           # HTML pages (beautifulsoup4)

# Retrieval extras
pip install "nexrag[bm25]"           # BM25Retriever keyword search (rank-bm25)
pip install "nexrag[cohere]"         # CohereReranker
pip install "nexrag[cross-encoder]"  # CrossEncoderReranker (sentence-transformers)

# Convenience bundles
pip install "nexrag[all-sparse]"     # all sparse retrievers (bm25)
pip install "nexrag[all-rerankers]"  # all rerankers (cohere + cross-encoder)
pip install "nexrag[all-retrieval]"  # all-sparse + all-rerankers
pip install "nexrag[all-loaders]"    # all document loaders
pip install "nexrag[all-providers]"  # all LLM + embedder providers

# Full bundle — dev/CI; pulls PyTorch via sentence-transformers
pip install "nexrag[all]"
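
Since the core install ships without provider SDKs, it can help to check which optional packages are importable before building a pipeline. The sketch below uses only the standard library. The mapping covers the extras whose backing package is named in the comments above; `EXTRA_MODULES` and `missing_extras` are hypothetical names for this example and are not part of NexRAG.

```python
from importlib.util import find_spec

# Extra name -> import name of the package the comments above associate with it.
EXTRA_MODULES = {
    "pdf": "pypdf",
    "word": "docx",                        # python-docx is imported as `docx`
    "html": "bs4",                         # beautifulsoup4 is imported as `bs4`
    "bm25": "rank_bm25",                   # rank-bm25 is imported as `rank_bm25`
    "cross-encoder": "sentence_transformers",
}


def missing_extras(wanted: list[str]) -> list[str]:
    """Return the extras from `wanted` whose backing module cannot be found."""
    return [extra for extra in wanted if find_spec(EXTRA_MODULES[extra]) is None]


missing = missing_extras(["pdf", "bm25"])
if missing:
    print('pip install "nexrag[' + ",".join(missing) + ']"')
else:
    print("all requested extras are importable")
```

`find_spec` only locates the module; it does not import it, so the check stays cheap even for heavy packages such as sentence-transformers.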

Design Principles

| Principle | What it means |
| --- | --- |
| Interface-first | Every stage is a contract. Implementation is secondary. |
| Config-driven | YAML configures the pipeline. Code defines the logic. |
| Zero lock-in | Core has no dependency on LangChain, LlamaIndex, or any AI SDK. |
| Explicit over implicit | No hidden defaults. Every behavior is declared or documented. |
| Extensible by design | New components plug in without touching core. |
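
One common way to get "config-driven" and "extensible by design" together is a registry that maps the `type` string from config to a component class. The sketch below shows that general pattern only. `LOADERS`, `register_loader`, `build_loader`, and `TextLoader` are hypothetical names; how NexRAG actually discovers and instantiates components is not described here.

```python
from typing import Any, Callable

# Hypothetical registry: maps the `type` string used in config to a factory.
LOADERS: dict[str, Callable[..., Any]] = {}


def register_loader(name: str):
    """Class decorator that adds a loader under `name` without editing core code."""
    def decorator(cls):
        if name in LOADERS:
            raise ValueError(f"loader type {name!r} is already registered")
        LOADERS[name] = cls
        return cls
    return decorator


@register_loader("text")
class TextLoader:
    def __init__(self, encoding: str = "utf-8"):
        self.encoding = encoding

    def load(self, raw: bytes) -> str:
        return raw.decode(self.encoding)


def build_loader(config: dict[str, Any]) -> Any:
    """Build a loader from a config mapping such as {'type': 'text'}."""
    options = dict(config)
    kind = options.pop("type")  # explicit: no default type is assumed
    if kind not in LOADERS:
        raise KeyError(f"unknown loader type {kind!r}; known: {sorted(LOADERS)}")
    return LOADERS[kind](**options)


loader = build_loader({"type": "text", "encoding": "utf-8"})
print(loader.load(b"hello"))
```

A new loader only needs its own module and a `@register_loader(...)` line; `build_loader` never changes.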

Architecture

NexRAG has two independent pipelines:

INGESTION  →  Loader → Sanitizer → Chunker → Embedder → VectorDB
QUERY      →  Embedder → Retriever → PromptBuilder → LLM → PipelineResult

See Architecture Documentation for full pipeline diagrams.
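
To make the two pipeline shapes concrete, here is a toy, dependency-free sketch that wires the same stages together as plain functions. Everything in it is a simplification made up for illustration: the "embedder" is a bag-of-words counter, the "vector DB" is a Python list, the loader is skipped (text is passed in directly), and the LLM call is left out so the query returns the built prompt. None of these functions are NexRAG APIs.

```python
import math
import re
from collections import Counter

Vector = Counter  # sparse bag-of-words vector: token -> count


def sanitize(text: str) -> str:
    """Collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str) -> list[str]:
    """Split into sentence-like chunks on '. '."""
    return [part.strip(" .") for part in text.split(". ") if part.strip(" .")]


def embed(text: str) -> Vector:
    """Toy embedder: lowercase word counts (a real embedder returns dense floats)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ingest(document: str, store: list[tuple[str, Vector]]) -> int:
    """INGESTION: (loaded text) -> sanitize -> chunk -> embed -> store."""
    chunks = chunk(sanitize(document))
    store.extend((c, embed(c)) for c in chunks)
    return len(chunks)


def query(question: str, store: list[tuple[str, Vector]], top_k: int = 1) -> str:
    """QUERY: embed -> retrieve -> build prompt. The LLM call is left out."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    context = "\n".join(text for text, _ in ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {question}"


store: list[tuple[str, Vector]] = []
count = ingest("Either party may terminate with 30 days notice.   Fees are due monthly.", store)
prompt = query("When can a party terminate?", store)
print(count)
print(prompt)
```

The two pipelines share only the store and the embedder, which is why the query side must embed with the same model that ingestion used.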


Supported Providers

| Category | Providers |
| --- | --- |
| Embedders | OpenAI, Ollama, HuggingFace |
| Vector DBs | ChromaDB (in-memory, persistent, remote server) |
| LLMs | OpenAI, Ollama, Anthropic |
| Loaders | PDF, plain text, Word, HTML, Excel |
| Chunkers | Recursive (separator-aware) |
| Retrievers | Dense (cosine similarity), BM25 (keyword), Hybrid (dense + BM25) |
| Rerankers | Cohere, CrossEncoder (sentence-transformers) |
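
The table lists a hybrid retriever that combines dense and BM25 results but does not say how the two rankings are merged. One widely used option is reciprocal rank fusion (RRF), sketched below. This is an example of the general technique, not a description of NexRAG's hybrid retriever; the function name, the chunk IDs, and the constant `k = 60` (a conventional default for RRF) are choices made for this illustration.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score (descending), then by ID so ties are deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


dense_ranking = ["c2", "c1", "c3"]   # e.g. ordered by cosine similarity
bm25_ranking = ["c1", "c4", "c2"]    # e.g. ordered by BM25 keyword score
fused = reciprocal_rank_fusion([dense_ranking, bm25_ranking])
print([doc_id for doc_id, _ in fused])
```

RRF works on ranks rather than raw scores, so cosine similarities and BM25 scores never need to be put on a common scale.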

Contributing

NexRAG is in early development. Contribution guidelines will be published with v1.0.


Changelog

See CHANGELOG.md.

Download files

Source Distribution

nexrag-0.3.3.tar.gz (72.4 kB)

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ef5ce1f214ac63e23967e9d0213c76cc0370fea5e8f2a7dfb74bf097d772b950 |
| MD5 | b3b58f8806ba887b3c5d458a215b8948 |
| BLAKE2b-256 | 8440ec9b668c8e260fe487a0addad366077bbaa1c5dd457a248b24562184b50e |

Built Distribution

nexrag-0.3.3-py3-none-any.whl (105.3 kB, Python 3)

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | da1d1609b2a71e876d6c5a160e031f4eaf5a6f5a87f25114332eaa9f32d379f7 |
| MD5 | 2571b710e58bfd21ebeb54f7a3829889 |
| BLAKE2b-256 | ce6fb588125b9c1aee65478172863c7cf25c91d69d34d10226e4100d8d7b0852 |
