NexRAG

███╗   ██╗███████╗██╗  ██╗██████╗  █████╗  ██████╗
████╗  ██║██╔════╝╚██╗██╔╝██╔══██╗██╔══██╗██╔════╝
██╔██╗ ██║█████╗   ╚███╔╝ ██████╔╝███████║██║  ███╗
██║╚██╗██║██╔══╝   ██╔██╗ ██╔══██╗██╔══██║██║   ██║
██║ ╚████║███████╗██╔╝ ██╗██║  ██║██║  ██║╚██████╔╝
╚═╝  ╚═══╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝ ╚═════╝

●plug ⇄swap ▶scale

Framework-agnostic RAG pipeline SDK. Plug in any component, swap any stage, configure everything in YAML.

PyPI version Python 3.12+ License


What is NexRAG?

NexRAG is a production-grade RAG (Retrieval-Augmented Generation) pipeline SDK for Python.

NexRAG owns the pipeline shape. You own the components.

Every stage — loading, chunking, embedding, retrieval, generation — is a clean interface. NexRAG ships default implementations for each. You can swap any of them by implementing the interface and declaring it in YAML. No framework lock-in. No magic. No hidden behavior.
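As a rough illustration of the interface-first idea, the sketch below defines a stand-in stage contract and one custom implementation of it. The names `Chunk`, `Chunker`, `SentenceChunker`, `run_chunking`, and the `chunk(text, source)` signature are assumptions made up for this example; NexRAG's real interfaces are defined in its own docs and may look different.

```python
import re
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Chunk:
    """Hypothetical chunk record; not NexRAG's actual type."""
    text: str
    metadata: dict = field(default_factory=dict)


class Chunker(Protocol):
    """Hypothetical stage contract: any object with this method can act as a chunker."""

    def chunk(self, text: str, source: str) -> list[Chunk]: ...


class SentenceChunker:
    """Packs whole sentences into chunks, starting a new chunk once max_chars would be exceeded."""

    def __init__(self, max_chars: int = 200) -> None:
        self.max_chars = max_chars

    def chunk(self, text: str, source: str) -> list[Chunk]:
        # Split after sentence-ending punctuation followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        chunks: list[Chunk] = []
        current = ""
        for sentence in sentences:
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > self.max_chars:
                chunks.append(Chunk(current, {"source": source}))
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(Chunk(current, {"source": source}))
        return chunks


def run_chunking(stage: Chunker, text: str, source: str) -> list[Chunk]:
    # The caller depends only on the contract, so implementations can be swapped freely.
    return stage.chunk(text, source)


for piece in run_chunking(SentenceChunker(max_chars=9), "One. Two. Three.", "a.txt"):
    print(piece.text, piece.metadata)
```

Because `run_chunking` only relies on the protocol, replacing `SentenceChunker` with another class that has the same method requires no change to the calling code.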


Quickstart

pip install "nexrag[openai,chromadb,pdf]"
export OPENAI_API_KEY=sk-...
cp nexrag.example.yaml nexrag.yaml   # edit to taste

from nexrag import NexRAG, RunMetrics

pipeline = NexRAG.from_config("nexrag.yaml")

# Ingest a PDF
result = pipeline.ingest("contracts/agreement.pdf")
print(f"Ingested {result.documents_loaded} doc, {result.chunks_produced} chunks")

# Query (blocking)
result = pipeline.query("What are the termination clauses?")
print(result.answer)
for source in result.sources:
    print(f"  [{source.rank}] score={source.score:.3f}  {source.chunk.metadata.get('source')}")

# Streaming — tokens arrive live; RunMetrics is the final item
metrics = None
for item in pipeline.stream_query("Summarise the key obligations."):
    if isinstance(item, RunMetrics):
        metrics = item
    else:
        print(item, end="", flush=True)
print(f"\n\n{metrics.total_latency_ms:.0f}ms — {metrics.chunks_retrieved} chunks retrieved")

# nexrag.yaml (minimal)
ingestion:
  loader:
    type: pdf
  embedder:
    provider: openai
    model: text-embedding-3-small
    api_key: ${OPENAI_API_KEY}
  vector_db:
    provider: chroma
    default_collection: documents
    collections:
      documents:
        mode: persistent
        path: ./.nexrag/chroma

query:
  embedder: inherit
  llm:
    provider: openai
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}

See docs/ for the full documentation site.


Installation

# Core only — pydantic + pyyaml. No provider SDKs; add the extras you use below.
pip install nexrag

# Default getting-started stack (OpenAI embedder/LLM + ChromaDB vector store)
pip install "nexrag[openai,chromadb]"

# Provider extras — install only what you use
pip install "nexrag[openai]"         # OpenAI embedder + LLM
pip install "nexrag[chromadb]"       # ChromaDB vector store
pip install "nexrag[anthropic]"      # Anthropic (Claude) LLM
pip install "nexrag[ollama]"         # Ollama local LLM + embedder
pip install "nexrag[huggingface]"    # HuggingFace embedder

# Document loaders
pip install "nexrag[pdf]"            # PDFLoader (pypdf)
pip install "nexrag[word]"           # Word documents (python-docx)
pip install "nexrag[html]"           # HTML pages (beautifulsoup4)

# Retrieval extras
pip install "nexrag[bm25]"           # BM25Retriever keyword search (rank-bm25)
pip install "nexrag[cohere]"         # CohereReranker
pip install "nexrag[cross-encoder]"  # CrossEncoderReranker (sentence-transformers)

# Convenience bundles
pip install "nexrag[all-sparse]"     # all sparse retrievers (bm25)
pip install "nexrag[all-rerankers]"  # all rerankers (cohere + cross-encoder)
pip install "nexrag[all-retrieval]"  # all-sparse + all-rerankers
pip install "nexrag[all-loaders]"    # all document loaders
pip install "nexrag[all-providers]"  # all LLM + embedder providers

# Full bundle — dev/CI; pulls PyTorch via sentence-transformers
pip install "nexrag[all]"
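Since the core install ships no provider SDKs, it can be handy to check up front which optional packages are importable. The sketch below uses only the standard library; the mapping covers the third-party packages named in the comments above, and `missing_extras` is a hypothetical helper. Whether NexRAG performs a similar check itself is not stated here.

```python
from importlib.util import find_spec

# Extra name -> import name of the package the install comments mention.
EXTRA_MODULES = {
    "pdf": "pypdf",
    "word": "docx",                # installed by python-docx
    "html": "bs4",                 # installed by beautifulsoup4
    "bm25": "rank_bm25",           # installed by rank-bm25
    "cross-encoder": "sentence_transformers",
}


def missing_extras(required: list[str], modules: dict[str, str] = EXTRA_MODULES) -> list[str]:
    """Return the extras from `required` whose backing module cannot be found."""
    return [name for name in required if find_spec(modules[name]) is None]


missing = missing_extras(["pdf", "bm25"])
if missing:
    print('Try: pip install "nexrag[' + ",".join(missing) + ']"')
```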

Design Principles

| Principle | What it means |
| --- | --- |
| Interface-first | Every stage is a contract. Implementation is secondary. |
| Config-driven | YAML configures the pipeline. Code defines the logic. |
| Zero lock-in | Core has no dependency on LangChain, LlamaIndex, or any AI SDK. |
| Explicit over implicit | No hidden defaults. Every behavior is declared or documented. |
| Extensible by design | New components plug in without touching core. |

Architecture

NexRAG has two independent pipelines:

INGESTION  →  Loader → Sanitizer → Chunker → Embedder → VectorDB
QUERY      →  Embedder → Retriever → PromptBuilder → LLM → PipelineResult

See Architecture Documentation for full pipeline diagrams.
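To make the two stage orders concrete, here is a toy, dependency-free walk-through of both pipelines. Every function and class in it is a stand-in invented for this sketch (a bag-of-words counter replaces a real embedder, and the LLM step is stubbed out by returning the prompt); none of it is NexRAG code.

```python
import math
import re
from collections import Counter


def sanitize(text: str) -> str:
    """Sanitizer stand-in: collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, size: int = 40) -> list[str]:
    """Chunker stand-in: fixed-size character windows."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(text: str) -> Counter:
    """Embedder stand-in: word counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class MemoryStore:
    """VectorDB stand-in: keeps (vector, chunk) pairs in a list."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def add(self, vector: Counter, text: str) -> None:
        self.rows.append((vector, text))

    def search(self, vector: Counter, k: int) -> list[str]:
        ranked = sorted(self.rows, key=lambda row: cosine(vector, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def ingest(raw: str, store: MemoryStore) -> int:
    # Loader output -> Sanitizer -> Chunker -> Embedder -> VectorDB
    pieces = chunk(sanitize(raw))
    for piece in pieces:
        store.add(embed(piece), piece)
    return len(pieces)


def query(question: str, store: MemoryStore, k: int = 1) -> str:
    # Embedder -> Retriever -> PromptBuilder -> LLM (stubbed: the prompt is returned as-is)
    context = store.search(embed(question), k)
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"


store = MemoryStore()
ingest("The   notice period is thirty days.\n\nPayment is due on receipt of invoice.", store)
print(query("notice period", store))
```

The only thing the two pipelines share is the embedder and the store, which is why they can be run independently of each other.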


Supported Providers

| Category | Providers |
| --- | --- |
| Embedders | OpenAI, Ollama, HuggingFace |
| Vector DBs | ChromaDB (in-memory, persistent, remote server) |
| LLMs | OpenAI, Ollama, Anthropic |
| Loaders | PDF, plain text, Word, HTML, Excel |
| Chunkers | Recursive (separator-aware) |
| Retrievers | Dense (cosine similarity), BM25 (keyword), Hybrid (dense + BM25) |
| Rerankers | Cohere, CrossEncoder (sentence-transformers) |
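The table lists a hybrid retriever that combines dense and BM25 results but does not say how the two rankings are merged. One widely used option is reciprocal rank fusion (RRF), sketched below together with a plain cosine-similarity ranking. This is an illustration of the general technique, not a description of NexRAG's implementation; all function names are hypothetical, and the keyword ranking is hard-coded rather than computed by BM25.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def dense_ranking(query_vec: list[float], doc_vecs: dict[str, list[float]]) -> list[str]:
    """Order document ids by cosine similarity to the query, best first."""
    return sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]), reverse=True)


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) to an id's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)


doc_vecs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
dense = dense_ranking([1.0, 0.1], doc_vecs)   # ['a', 'b', 'c']
keyword = ["b", "c", "a"]                     # stand-in for a BM25 ranking
print(reciprocal_rank_fusion([dense, keyword]))  # ['b', 'a', 'c']
```

RRF works on ranks rather than raw scores, so it sidesteps the problem that cosine similarities and BM25 scores live on different scales.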

Contributing

NexRAG is in early development. Contribution guidelines will be published with v1.0.


Changelog

See CHANGELOG.md.
