Skip to main content

nyrag

Project description

NyRAG

NyRAG (pronounced as knee-RAG) is a simple tool for building RAG applications by crawling websites or processing documents, then deploying to Vespa for hybrid search with an integrated chat UI.

NyRAG Chat UI

How It Works

When a user asks a question, NyRAG performs a multi-stage retrieval process:

  1. Query Enhancement: An LLM generates additional search queries based on the user's question and initial context to improve retrieval coverage
  2. Embedding Generation: Each query is converted to embeddings using the configured SentenceTransformer model
  3. Vespa Search: Queries are executed against Vespa using nearestNeighbor search with the best_chunk_score ranking profile to find the most relevant document chunks
  4. Chunk Fusion: Results from all queries are aggregated, deduplicated, and ranked by score to select the top-k most relevant chunks
  5. Answer Generation: The retrieved context is sent to an LLM which generates a grounded answer based only on the provided chunks

This multi-query RAG approach with chunk-level retrieval ensures answers are comprehensive and grounded in your actual content, whether from crawled websites or processed documents.

LLM Support

NyRAG works with any OpenAI-compatible API, including:

  • OpenRouter (100+ models from various providers)
  • Ollama (local models: Llama, Mistral, Qwen, etc.)
  • LM Studio (local GUI for running models)
  • vLLM (high-performance local or remote inference)
  • LocalAI (local OpenAI drop-in replacement)
  • OpenAI (GPT-4, GPT-3.5, etc.)
  • Any other service implementing the OpenAI API format

Installation

pip install nyrag

We recommend uv:

uv init --python 3.10
uv venv
uv sync
source .venv/bin/activate
uv pip install -U nyrag

For development:

git clone https://github.com/abhishekkrthakur/nyrag.git
cd nyrag
pip install -e .

Usage

NyRAG is designed to be used primarily through its web UI, which manages the entire lifecycle from data processing to chat.

1. Start the UI

Local Mode (requires Docker):

nyrag ui

Cloud Mode (requires Vespa Cloud account):

nyrag ui --cloud

Open http://localhost:8000 in your browser.

2. Configure & Process

In the UI, you can create a new configuration for your data source.

Example Web Crawl Config:

name: mywebsite
mode: web
start_loc: https://example.com/
crawl_params:
  respect_robots_txt: true
rag_params:
  embedding_model: sentence-transformers/all-MiniLM-L6-v2

Example Docs Processing Config:

name: mydocs
mode: docs
start_loc: /path/to/documents/
doc_params:
  recursive: true
rag_params:
  embedding_model: sentence-transformers/all-mpnet-base-v2

3. Chat

Once processing is complete, you can start chatting with your data immediately in the UI. Make sure your configuration includes your LLM API key and model selection.

Configuration Reference

Cloud Deploy Parameters

Parameter Type Default Description
cloud_tenant str None Vespa Cloud tenant (required for cloud mode if no env/CLI target)

Connection Parameters

Parameter Type Default Description
vespa_url str None Vespa endpoint URL (auto-filled into conf.yml after deploy)
vespa_port int None Vespa endpoint port (auto-filled into conf.yml after deploy)

Web Mode Parameters (crawl_params)

Parameter Type Default Description
respect_robots_txt bool true Respect robots.txt rules
aggressive_crawl bool false Faster crawling with more concurrent requests
follow_subdomains bool true Follow links to subdomains
strict_mode bool false Only crawl URLs matching start pattern
user_agent_type str chrome chrome, firefox, safari, mobile, bot
custom_user_agent str None Custom user agent string
allowed_domains list None Explicitly allowed domains

Docs Mode Parameters (doc_params)

Parameter Type Default Description
recursive bool true Process subdirectories
include_hidden bool false Include hidden files
follow_symlinks bool false Follow symbolic links
max_file_size_mb float None Max file size in MB
file_extensions list None Only process these extensions

RAG Parameters (rag_params)

Parameter Type Default Description
embedding_model str sentence-transformers/all-MiniLM-L6-v2 Embedding model
embedding_dim int 384 Embedding dimension
chunk_size int 1024 Chunk size for text splitting
chunk_overlap int 50 Overlap between chunks
distance_metric str angular Distance metric
max_tokens int 8192 Max tokens per document
llm_base_url str None LLM API base URL (OpenAI-compatible)
llm_model str None LLM model name
llm_api_key str None LLM API key

LLM Provider Support

NyRAG works with any OpenAI-compatible API. Just configure the rag_params in your UI settings.

Provider Base URL Model Example API Key
Ollama http://localhost:11434/v1 llama3.2 dummy
LM Studio http://localhost:1234/v1 local-model dummy
vLLM http://localhost:8000/v1 meta-llama/Llama-3.2-3B-Instruct dummy
OpenRouter https://openrouter.ai/api/v1 openai/gpt-5.2 your-key
OpenAI None (default) openai/gpt-4o your-key

Example Config:

llm_config:
  llm_base_url: https://openrouter.ai/api/v1
  llm_model: llama3.2
  llm_api_key: dummy

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nyrag-0.0.9.tar.gz (117.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nyrag-0.0.9-py3-none-any.whl (125.5 kB view details)

Uploaded Python 3

File details

Details for the file nyrag-0.0.9.tar.gz.

File metadata

  • Download URL: nyrag-0.0.9.tar.gz
  • Upload date:
  • Size: 117.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for nyrag-0.0.9.tar.gz
Algorithm Hash digest
SHA256 f99f6aab48261e24da16324515c46e0c34a34cccc03edea4c4eff4d877746ef2
MD5 d55cc31a39680195dcbac6c4593422b7
BLAKE2b-256 7764b3e9dca34f2fe7302de5bd5d15f1290da447a5282cbc165ab4bb035809d4

See more details on using hashes here.

File details

Details for the file nyrag-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: nyrag-0.0.9-py3-none-any.whl
  • Upload date:
  • Size: 125.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for nyrag-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 799c0c0dc2323b82e3fcdba8c6f695f0f38c7378ce71d38522010f5a1d0bbfd0
MD5 6cb3ae4431a20b47a7052f8dfabbdbf6
BLAKE2b-256 60461135c5d4b020ee76124bb21f9e9da9dcd258e517175233ede098366c8144

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page