NVIDIA Context Aware RAG

Context Aware RAG is a flexible library designed to seamlessly integrate into existing data processing workflows to build customized data ingestion and retrieval (RAG) pipelines.

Key Features

Data Ingestion Service: Add data to the RAG pipeline from a variety of sources.
Data Retrieval Service: Retrieve data from the RAG pipeline using natural language queries.
Function and Tool Components: Easily create custom functions and tools to support your existing workflows.
GraphRAG: Seamlessly extract knowledge graphs from data to support your existing workflows.
Observability: Monitor and troubleshoot your workflows with any OpenTelemetry-compatible monitoring tool.
Experimental Features: CA-RAG also provides a structured output mode for responses and five Model Context Protocol (MCP) tools for using CA-RAG in agentic AI workflows.

With Context Aware RAG, you can quickly build RAG pipelines to support your existing workflows.

Getting Started

Prerequisites

Before you begin using Context Aware RAG, ensure that you have the following software installed.

Install Git
Install uv

Installation

Clone the repository

git clone git@github.com:NVIDIA/context-aware-rag.git
cd context-aware-rag/

Create a virtual environment using uv

uv venv --seed .venv
source .venv/bin/activate

Installing from source

uv pip install -e .

Installing optional plugins

uv pip install -e ".[arango]"

Optional: Building and Installing the wheel file

uv build
uv pip install dist/vss_ctx_rag-1.0.0-py3-none-any.whl
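
After either install path, a quick check like the one below can confirm that the package is visible to the active virtual environment. This is a minimal sketch: the distribution name vss-ctx-rag is assumed from the wheel file name, and installed_version is a hypothetical helper, not part of the library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "vss-ctx-rag" is an assumption based on the wheel name vss_ctx_rag-1.0.0-py3-none-any.whl.
    found = installed_version("vss-ctx-rag")
    if found:
        print(f"vss-ctx-rag {found} is installed")
    else:
        print("vss-ctx-rag is not installed in this environment")
```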

Service Example

Setting up environment variables

Create a .env file in the root directory and set the following variables:

   NVIDIA_API_KEY=<IF USING NVIDIA>
   NVIDIA_VISIBLE_DEVICES=<GPU ID>

   OPENAI_API_KEY=<IF USING OPENAI>

   VSS_CTX_PORT_RET=<DATA RETRIEVAL PORT>
   VSS_CTX_PORT_IN=<DATA INGESTION PORT>

   GRAPH_DB_USERNAME=<GRAPH_DB_USERNAME>
   GRAPH_DB_PASSWORD=<GRAPH_DB_PASSWORD>
   ARANGO_DB_USERNAME=root
   ARANGO_DB_PASSWORD=<ARANGO_DB_PASSWORD>
   MINIO_USERNAME=<MINIO_USERNAME>
   MINIO_PASSWORD=<MINIO_PASSWORD>
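
Before starting the services, it can help to sanity-check the .env file. The sketch below parses simple KEY=VALUE lines and reports likely problems. The variable names come from the listing above, but the rules it applies (both ports must be set, and at least one API key must be present) are assumptions, and parse_env_text and check_env are hypothetical helpers rather than part of the library.

```python
import os

# Names taken from the .env listing above. Which variables are strictly
# required depends on your configuration, so these rules are assumptions.
PORT_VARS = ("VSS_CTX_PORT_RET", "VSS_CTX_PORT_IN")
API_KEY_VARS = ("NVIDIA_API_KEY", "OPENAI_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_env(values):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name in PORT_VARS:
        port = values.get(name, "")
        if not port.isdecimal() or not 1 <= int(port) <= 65535:
            problems.append(f"{name} must be a port number between 1 and 65535")
    if not any(values.get(name) for name in API_KEY_VARS):
        problems.append("set NVIDIA_API_KEY or OPENAI_API_KEY")
    return problems


if __name__ == "__main__":
    path = ".env"
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            found = check_env(parse_env_text(handle.read()))
        print("\n".join(found) if found else "no problems found")
    else:
        print(f"{path} not found")
```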

Build the Docker images

make -C docker build

Using docker compose

make -C docker start_compose

This will start the following services:

ctx-rag-data-ingestion
- Service available at http://<HOST>:<VSS_CTX_PORT_IN>
ctx-rag-data-retrieval
- Service available at http://<HOST>:<VSS_CTX_PORT_RET>
neo4j
- UI available at http://<HOST>:7474
milvus
otel-collector
Phoenix
- UI available at http://<HOST>:16686
prometheus
- UI available at http://<HOST>:9090

To change the storage volumes, export DOCKER_VOLUME_DIRECTORY to the desired directory.
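
The service addresses listed above can be assembled in one place so that client scripts do not hard-code them. This is a minimal sketch: service_urls is a hypothetical helper, the fixed ports are copied from the list above, and the host and port values in the usage lines are placeholders.

```python
import os


def service_urls(host, env=None):
    """Build the service URLs listed above from the port environment variables."""
    env = os.environ if env is None else env
    return {
        "ctx-rag-data-ingestion": f"http://{host}:{env['VSS_CTX_PORT_IN']}",
        "ctx-rag-data-retrieval": f"http://{host}:{env['VSS_CTX_PORT_RET']}",
        "neo4j": f"http://{host}:7474",
        "phoenix": f"http://{host}:16686",
        "prometheus": f"http://{host}:9090",
    }


if __name__ == "__main__":
    # Placeholder values for illustration; use the ports from your own .env file.
    example_env = {"VSS_CTX_PORT_IN": "8001", "VSS_CTX_PORT_RET": "8000"}
    for name, url in service_urls("localhost", example_env).items():
        print(f"{name}: {url}")
```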

Data Ingestion Example

import requests
import json
from pyaml_env import parse_config

base_url = "http://<HOST>:<VSS_CTX_PORT_IN>"

headers = {"Content-Type": "application/json"}

# Initialize the service with a unique uuid
init_data = {"uuid": "1"}
# Optional: initialize the service with a config file or a context config instead
# init_data = {"config_path": "/app/config/config.yaml", "uuid": "1"}
# init_data = {"context_config": parse_config("/app/config/config.yaml"), "uuid": "1"}
response = requests.post(
    f"{base_url}/init", headers=headers, data=json.dumps(init_data)
)

# POST request to /add_doc to add documents to the service
add_doc_data_list = [
    {
        "document": "User1: Hi how are you?",
        "doc_index": 0,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 0,
            "file": "chat_conversation.txt",
            "is_first": True,
            "is_last": False,
            "uuid": "1"
        },
        "uuid": "1"
    },
    {
        "document": "User2: I am good. How are you?",
        "doc_index": 1,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 1,
            "file": "chat_conversation.txt",
            "uuid": "1"
        },
        "uuid": "1"
    },
    {
        "document": "User1: I am great too. Thanks for asking",
        "doc_index": 2,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 2,
            "file": "chat_conversation.txt",
            "uuid": "1"
        },
        "uuid": "1"
    },
    {
        "document": "User2: So what did you do over the weekend?",
        "doc_index": 3,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 3,
            "file": "chat_conversation.txt",
            "uuid": "1"
        },
        "uuid": "1"
    },
    {
        "document": "User1: I went hiking to Mission Peak",
        "doc_index": 4,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 4,
            "file": "chat_conversation.txt",
            "uuid": "1"
        },
        "uuid": "1"
    },
    {
        "document": "User3: Guys there is a fire. Let us get out of here",
        "doc_index": 5,
        "doc_metadata": {
            "streamId": "stream1",
            "chunkIdx": 5,
            "file": "chat_conversation.txt",
            "is_first": False,
            "is_last": True,
            "uuid": "1"
        },
        "uuid": "1"
    },
]

# Send POST requests for each document
for add_doc_data in add_doc_data_list:
    response = requests.post(
        f"{base_url}/add_doc", headers=headers, data=json.dumps(add_doc_data)
    )
    print(response.text)

response = requests.post(
    f"{base_url}/complete_ingestion", headers=headers, data=json.dumps({"uuid": "1"})
)
print(response.text)
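
Writing each /add_doc payload by hand becomes tedious for longer inputs. The sketch below generates payloads with the same shape as the example above from a list of text chunks. build_add_doc_payloads is a hypothetical helper; it mirrors the example by setting is_first and is_last only on the first and last chunks, and whether the service requires exactly this pattern is an assumption.

```python
def build_add_doc_payloads(chunks, uuid, stream_id, file_name):
    """Build one /add_doc payload per chunk, mirroring the example above."""
    payloads = []
    last = len(chunks) - 1
    for index, text in enumerate(chunks):
        metadata = {
            "streamId": stream_id,
            "chunkIdx": index,
            "file": file_name,
            "uuid": uuid,
        }
        # As in the example, only the boundary chunks carry the two flags.
        if index == 0 or index == last:
            metadata["is_first"] = index == 0
            metadata["is_last"] = index == last
        payloads.append(
            {
                "document": text,
                "doc_index": index,
                "doc_metadata": metadata,
                "uuid": uuid,
            }
        )
    return payloads


if __name__ == "__main__":
    chunks = ["User1: Hi how are you?", "User2: I am good. How are you?"]
    for payload in build_add_doc_payloads(chunks, "1", "stream1", "chat_conversation.txt"):
        print(payload)
```

Each returned dictionary can be posted to /add_doc in the same loop as the example above.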

Data Retrieval Example

import requests
import json


base_url = "http://<HOST>:<VSS_CTX_PORT_RET>"

headers = {"Content-Type": "application/json"}

init_data = {"config_path": "/app/config/config.yaml", "uuid": "1"}
response = requests.post(
    f"{base_url}/init", headers=headers, data=json.dumps(init_data)
)

chat_data = {
    "model": "meta/llama-3.1-70b-instruct",
    "base_url": "https://integrate.api.nvidia.com/v1",
    "messages": [{"role": "user", "content": "Who mentioned the fire?"}],
    "uuid": "1"
}

response = requests.post(f"{base_url}/chat/completions", headers=headers, data=json.dumps(chat_data))
print(response.json()["choices"][0]["message"]["content"])
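
The final line of the example above raises an exception if the response body does not have the expected shape, for instance when the service returns an error. A more defensive way to read the answer might look like the sketch below. extract_answer is a hypothetical helper, and the shape of an error response is not documented here, so the function simply returns None for anything unexpected.

```python
def extract_answer(body):
    """Return the first choice's message content, or None if the shape is unexpected."""
    try:
        return body["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        return None


if __name__ == "__main__":
    # Hand-written sample bodies for illustration, not real service output.
    ok_body = {"choices": [{"message": {"content": "User3 mentioned the fire."}}]}
    bad_body = {"detail": "not initialized"}
    print(extract_answer(ok_body))
    print(extract_answer(bad_body))
```

In the retrieval script this would replace the direct indexing: call extract_answer(response.json()) and handle a None result before printing.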

Summary Data Retrieval Example

Summaries can be requested through the /summary endpoint of the Retrieval Service.

Example Query

import requests

url = "http://<HOST>:<VSS_CTX_PORT_RET>/summary"
headers = {"Content-Type": "application/json"}
data = {
    "uuid": "1",
    "summarization": {
        "start_index": 0,
        "end_index": -1
    }
}

response = requests.post(url, headers=headers, json=data)
print(response.json()["result"])

Acknowledgements

We would like to thank the following projects that made Context Aware RAG possible:
