
A Python library for document-based RAG


rag-kit

rag-kit is a simple, modular Python library for building PDF-based Retrieval-Augmented Generation (RAG) applications with conversational memory and flexible LLM provider support.

It is designed to hide most of the LangChain complexity behind a clean API:

from ragkit import PDFRAG

rag = PDFRAG("data/sample.pdf")
print(rag.ask("What is LangChain?"))

Features

  • PDF-based RAG
  • Conversational chat with session memory
  • Follow-up handling for queries like:
    • hindi m batao ("tell me in Hindi")
    • tell me in english
    • what did I ask earlier?
  • Query rewriting for better retrieval
  • Source return support
  • Configurable chunking and retrieval
  • Multiple LLM provider support:
    • Sarvam (default)
    • OpenAI
    • Anthropic / Claude
    • Custom LangChain-compatible chat models

Installation

Basic install

pip install rag-kit

Optional provider extras

pip install "rag-kit[openai]"
pip install "rag-kit[anthropic]"
pip install "rag-kit[all]"

Local development install

pip install -e .

Environment Variables

Create a .env file in your project root:

SARVAM_API_KEY=
OPENAI_API_KEY=
ANTHROPIC_API_KEY=

An example template is provided in .env.example.
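Before constructing a PDFRAG instance, it can help to check which provider keys are actually set. A minimal sketch, assuming the optional python-dotenv package for loading the .env file (the `available_providers` helper is illustrative, not part of rag-kit):

```python
import os

# Load variables from .env if python-dotenv is installed (optional).
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

# Environment variable expected for each provider, per the .env template above.
PROVIDER_KEYS = {
    "sarvam": "SARVAM_API_KEY",
    "openai": "OPENAI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
}

def available_providers() -> list[str]:
    """Return the providers whose API key is set in the environment."""
    return [name for name, var in PROVIDER_KEYS.items() if os.getenv(var)]
```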


Quick Start

Stateless Q&A

from ragkit import PDFRAG

rag = PDFRAG("data/sample.pdf")
answer = rag.ask("What is memory?")
print(answer)

Chat with memory

from ragkit import PDFRAG

rag = PDFRAG("data/sample.pdf")

session_id = "user1"

print(rag.chat("What is memory?", session_id=session_id))
print(rag.chat("hindi m batao", session_id=session_id))  # "tell me in Hindi"
print(rag.chat("tell me in english", session_id=session_id))

Return sources

from ragkit import PDFRAG

rag = PDFRAG("data/sample.pdf")
result = rag.ask("What is memory?", return_sources=True)

print(result["answer"])
print(result["sources"])

Example shape:

{
    "answer": "Memory in LangChain stores previous conversation turns...",
    "sources": [
        {
            "content": "Memory in chat applications is created by storing earlier conversation turns...",
            "page": 2,
            "source": "data/sample.pdf",
            "metadata": {
                "page": 2,
                "source": "data/sample.pdf"
            }
        }
    ]
}
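Given the shape above, rendering page-level citations is a simple dictionary walk. A sketch (the `format_citations` helper is hypothetical; the keys follow the example shape):

```python
def format_citations(result: dict) -> str:
    """Render the answer plus page-level citations from an
    ask(..., return_sources=True) result dictionary."""
    lines = [result["answer"], "", "Sources:"]
    for src in result["sources"]:
        # Truncate long chunk text so the citation stays one line.
        snippet = src["content"][:60]
        lines.append(f'- {src["source"]}, p. {src["page"]}: {snippet}...')
    return "\n".join(lines)
```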

ask() vs chat()

Method   Purpose
ask()    Stateless document Q&A
chat()   History-aware conversational interaction

Use ask() when you want a direct answer from the document.

Use chat() when you want:

  • follow-up questions
  • translation of the previous answer
  • history-based conversation

LLM Providers

Default: Sarvam

from ragkit import PDFRAG

rag = PDFRAG("file.pdf")

OpenAI

from ragkit import PDFRAG

rag = PDFRAG(
    "file.pdf",
    llm_provider="openai",
    llm_config={
        "model": "gpt-4o-mini",
        "temperature": 0.1,
    },
)

Claude

from ragkit import PDFRAG

rag = PDFRAG(
    "file.pdf",
    llm_provider="claude",
    llm_config={
        "model": "claude-3-5-haiku-latest",
        "temperature": 0.2,
    },
)

Custom LLM

from langchain_openai import ChatOpenAI
from ragkit import PDFRAG

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
rag = PDFRAG("file.pdf", llm=llm)

Configuration

from ragkit import PDFRAG, RAGConfig

config = RAGConfig(
    chunk_size=800,
    chunk_overlap=150,
    top_k=5,
    use_multi_query=True,
    enable_query_rewrite=True,
)

rag = PDFRAG("file.pdf", config=config)

Configurable options currently include:

  • persist_directory
  • chunk_size
  • chunk_overlap
  • top_k
  • use_multi_query
  • enable_query_rewrite
  • collection_name
  • verbose
  • llm_provider
  • llm_model
  • llm_temperature
  • llm_kwargs
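To build intuition for how chunk_size and chunk_overlap interact, here is a standalone character-window sketch. It assumes chunking behaves like a typical LangChain-style character splitter; rag-kit's actual splitter may differ:

```python
def chunk_text(text: str, chunk_size: int = 800, chunk_overlap: int = 150) -> list[str]:
    """Naive sliding-window chunking: each chunk starts
    (chunk_size - chunk_overlap) characters after the previous one,
    so consecutive chunks share chunk_overlap characters of context."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

Larger overlap keeps more shared context between neighboring chunks (better recall across chunk boundaries) at the cost of a bigger index.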

Add More Documents

rag.add_documents("data/another.pdf")

Reset Chat

rag.reset_chat("user1")

Project Structure

rag-kit/
├── .env.example
├── .gitignore
├── README.md
├── pyproject.toml
├── examples/
├── data/
├── src/
│   └── ragkit/
└── third_party/

Do You Need requirements.txt?

Not necessarily.

For modern Python packaging, pyproject.toml is enough and should be the main source of dependencies.

Use requirements.txt only if you want one of these:

  • easier local setup for teammates
  • pinned development environment
  • quick install for people who do not use packaging workflows

Recommendation

Keep:

  • pyproject.toml as the main dependency file

Optional:

  • requirements-dev.txt for local development and testing

Example requirements-dev.txt:

pytest
black
ruff
build
twine

A plain requirements.txt can also be generated if needed, but it should not replace pyproject.toml as the source of truth.
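For reference, a minimal pyproject.toml for a package like this might look as follows, with optional-dependency groups mirroring the extras shown under Installation. The dependency names here are illustrative assumptions, not the project's actual pins:

```toml
[project]
name = "rag-kit"
version = "0.1.0"
description = "A Python library for document-based RAG"
requires-python = ">=3.9"
dependencies = [
    "langchain",            # assumed core dependency
    "langchain-community",  # assumed, for loaders/vector stores
]

[project.optional-dependencies]
openai = ["langchain-openai"]
anthropic = ["langchain-anthropic"]
all = ["rag-kit[openai,anthropic]"]
```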


Current Limitations

  • Primarily optimized for PDF-based RAG
  • Sarvam support may depend on vendored or local integration setup
  • No streaming support yet
  • No FastAPI server or UI layer yet
  • Agent support is planned, but not included in the current public API

Roadmap

  • Better source citations
  • Improved multi-file indexing isolation
  • Streaming responses
  • FastAPI server mode
  • Playground / UI
  • Agent support via ragkit.agent

Examples

Check the examples/ folder for runnable examples such as:

  • basic_ask.py
  • chat_example.py
  • provider_openai.py

License

MIT License


Version

Current version: 0.1.0-beta

APIs may evolve in future releases.
