Plug-and-play RAG pipeline library for Python. Load, chunk, embed, store, retrieve, and generate, all in one clean API.

Maintainers

Shomi

Project description

rag-bridge-kit

rag-bridge-kit is a plug-and-play Retrieval Augmented Generation pipeline library for Python.

Load, chunk, embed, store, retrieve, and generate, all in one clean API.

Why rag-bridge-kit?

Zero config: works out of the box with sensible defaults.
Modular: swap any component (loader, chunker, embedder, store, generator).
Lightweight: no heavy dependencies by default.
Production-ready: batch embedding, error handling, type hints everywhere.
Extensible: bring your own components by extending base classes.
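To make "batch embedding" concrete, here is a minimal sketch of grouping texts into fixed-size batches before calling an embedding backend. It is not the library's implementation: `batched`, `embed_in_batches`, and `fake_embed` are hypothetical names, and the batch size of 64 simply mirrors the documented `RAGKIT_EMBEDDING_BATCH_SIZE` default.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[list[float]]],
    batch_size: int = 64,
) -> list[list[float]]:
    """Call `embed_batch` once per batch and concatenate the vectors in order."""
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed(batch: Sequence[str]) -> list[list[float]]:
    """Hypothetical stand-in for a real embedding call: one 1-d vector per text."""
    return [[float(len(text))] for text in batch]


texts = [f"doc {i}" for i in range(150)]
vectors = embed_in_batches(texts, fake_embed, batch_size=64)
batch_sizes = [len(batch) for batch in batched(texts, 64)]
print(batch_sizes, len(vectors))
```

Batching keeps each backend call bounded in size, which matters mostly for remote APIs that cap the number of inputs per request.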

Install

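From PyPI:

pip install rag-bridge-kit

From a local clone of the repository (editable install):

pip install -e .

The optional extras below use the editable form; with the PyPI release, replace `.` with the package name and drop `-e`, for example pip install "rag-bridge-kit[openai]".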

With OpenAI support:

pip install -e ".[openai]"

With PDF support:

pip install -e ".[pdf]"

With ChromaDB (persistent vector store):

pip install -e ".[chromadb]"

With local sentence-transformers (no API key needed):

pip install -e ".[sentence-transformers]"

Install everything:

pip install -e ".[all]"

For development:

pip install -e ".[dev,all]"

Quick Start

from rag_bridge_kit import RAGPipeline

pipeline = RAGPipeline()

# Ingest documents
pipeline.ingest_texts([
    "Python is a high-level programming language.",
    "Machine learning is a subset of AI.",
    "RAG combines retrieval with generation.",
])

# Query
result = pipeline.query("What is RAG?")
print(result.answer)
print(f"Chunks retrieved: {len(result.retrieved_chunks)}")

Load from Files

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.loaders import TextLoader

pipeline = RAGPipeline(loader=TextLoader("docs/"))
stats = pipeline.ingest()
print(f"Ingested {stats.documents_loaded} docs, {stats.chunks_stored} chunks")

result = pipeline.query("What is the refund policy?")
print(result.answer)

Load PDFs

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.loaders import PDFLoader

pipeline = RAGPipeline(loader=PDFLoader("reports/"))
pipeline.ingest()
result = pipeline.query("What were Q4 earnings?")

Load CSVs

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.loaders import CSVLoader

pipeline = RAGPipeline(
    loader=CSVLoader("faq.csv", content_columns=["question", "answer"])
)
pipeline.ingest()
result = pipeline.query("How do I reset my password?")
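The `content_columns` argument suggests that only the named columns become document text. The sketch below shows one way that could work using the standard `csv` module; the source does not say how `CSVLoader` joins the columns, so the `column: value` layout and the helper name `rows_to_texts` are assumptions.

```python
import csv
import io


def rows_to_texts(csv_text: str, content_columns: list[str]) -> list[str]:
    """Turn each CSV row into one text built from the selected columns only."""
    reader = csv.DictReader(io.StringIO(csv_text))
    texts = []
    for row in reader:
        # Assumed layout: one "column: value" line per selected column.
        parts = [f"{column}: {row[column]}" for column in content_columns]
        texts.append("\n".join(parts))
    return texts


sample = (
    "id,question,answer\n"
    "1,How do I reset my password?,Use the reset link on the login page.\n"
    "2,Where is my invoice?,Invoices are listed under Billing.\n"
)
texts = rows_to_texts(sample, ["question", "answer"])
print(texts[0])
```

Columns that are not listed (here `id`) are left out of the text, which keeps identifiers and other noise out of the embeddings.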

Load Markdown (split by headings)

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.loaders import MarkdownLoader

pipeline = RAGPipeline(
    loader=MarkdownLoader("docs/", split_by_heading=True, heading_level=2)
)
pipeline.ingest()
result = pipeline.query("How to install?")
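Splitting by heading can be pictured as cutting the file at every heading of the chosen level. The following is a rough sketch with the standard `re` module, not `MarkdownLoader`'s actual code; `split_by_heading` is a hypothetical helper, and it does not try to skip `##` lines inside code blocks.

```python
import re


def split_by_heading(markdown: str, heading_level: int = 2) -> list[str]:
    """Split Markdown into sections that each start at a heading of the given level."""
    marker = "#" * heading_level
    # Zero-width match at the start of any line that begins with exactly
    # `heading_level` hashes followed by a space (deeper headings do not match).
    pattern = re.compile(rf"^(?={re.escape(marker)} )", flags=re.MULTILINE)
    sections = [part.strip() for part in pattern.split(markdown)]
    return [section for section in sections if section]


doc = (
    "# Guide\n"
    "Intro text.\n"
    "## Install\n"
    "Run the installer.\n"
    "### Extras\n"
    "Optional packages.\n"
    "## Usage\n"
    "Call the pipeline.\n"
)
sections = split_by_heading(doc, heading_level=2)
for section in sections:
    print(repr(section.splitlines()[0]))
```

Text before the first matching heading becomes its own section, and deeper headings such as `### Extras` stay inside their parent section.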

Choose Your Chunking Strategy

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.chunkers import FixedChunker, SentenceChunker, RecursiveChunker

# Fixed-size character chunks
pipeline = RAGPipeline(chunker=FixedChunker(chunk_size=512, chunk_overlap=64))

# Sentence-based chunks
pipeline = RAGPipeline(chunker=SentenceChunker(max_chunk_size=512, sentence_overlap=1))

# Recursive splitting (like LangChain)
pipeline = RAGPipeline(chunker=RecursiveChunker(chunk_size=512, chunk_overlap=64))
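For intuition about what the size and overlap parameters mean, here is a small sketch of the first two strategies in plain Python. These are illustrations under assumed semantics (sizes counted in characters, overlap counted in characters or whole sentences), not the library's chunkers; `fixed_chunks` and `sentence_chunks` are hypothetical names, and recursive splitting is not shown.

```python
import re


def fixed_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def sentence_chunks(text: str, max_chunk_size: int = 512, sentence_overlap: int = 1) -> list[str]:
    """Group whole sentences into chunks, repeating trailing sentences in the next chunk.

    `max_chunk_size` is a soft limit: a single long sentence is never split.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chunk_size:
            chunks.append(" ".join(current))
            # Carry the last sentence(s) over so neighbouring chunks share context.
            current = current[-sentence_overlap:] if sentence_overlap else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


fixed = fixed_chunks("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
by_sentence = sentence_chunks("One two. Three four. Five six. Seven eight.", max_chunk_size=25)
print(fixed)
print(by_sentence)
```

Overlap trades some duplicated storage for a lower chance that an answer is cut in half at a chunk boundary.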

Use OpenAI Embeddings + Generation

import os
from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.embedders import OpenAIEmbedder
from rag_bridge_kit.generators import OpenAIGenerator

api_key = os.environ["OPENAI_API_KEY"]

pipeline = RAGPipeline(
    embedder=OpenAIEmbedder(api_key=api_key),
    generator=OpenAIGenerator(api_key=api_key, model="gpt-4o-mini"),
)

pipeline.ingest_texts(["Your documents here..."])
result = pipeline.query("Your question here?")
print(result.answer)

Use Local Embeddings (SentenceTransformers)

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.embedders import SentenceTransformerEmbedder

pipeline = RAGPipeline(
    embedder=SentenceTransformerEmbedder(model_name="all-MiniLM-L6-v2"),
)

pipeline.ingest_texts(["Your documents..."])
result = pipeline.query("Your question?")

Persistent Storage with ChromaDB

from rag_bridge_kit import RAGPipeline
from rag_bridge_kit.stores import ChromaStore

pipeline = RAGPipeline(
    store=ChromaStore(collection_name="my-docs", persist_directory="./chroma_db"),
)

# Data persists across restarts!
pipeline.ingest_texts(["Important document content..."])

Retrieve Without Generating

from rag_bridge_kit import RAGPipeline

pipeline = RAGPipeline()
pipeline.ingest_texts(["Doc 1...", "Doc 2..."])

# Just get the relevant chunks
chunks = pipeline.retrieve("search query", top_k=3)
for chunk in chunks:
    print(f"Score: {chunk.score:.4f} | {chunk.content[:80]}...")
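The score printed above is a similarity between the query and each stored chunk. As a hedged illustration of how top-k retrieval with a minimum score can work, the sketch below ranks vectors by cosine similarity. The source does not state which metric the stores use, so cosine similarity and the helper names are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: list[float],
    vectors: list[list[float]],
    k: int = 5,
    similarity_threshold: float = 0.0,
) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k best vectors at or above the threshold."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    kept = [pair for pair in scored if pair[1] >= similarity_threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:k]


stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
results = top_k([1.0, 0.0], stored, k=2)
for index, score in results:
    print(f"Score: {score:.4f} | vector {index}")
```

A linear scan like this is fine for small in-memory collections; dedicated vector stores exist to avoid scoring every vector on each query.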

Architecture

┌─────────────────────────────────────────────────────────┐
│                     RAGPipeline                         │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  INGEST:   Loader → Chunker → Embedder → Store          │
│                                                         │
│  QUERY:    Embedder → Store (search) → Generator        │
│                                                         │
├─────────────────────────────────────────────────────────┤
│  Loaders:    TextLoader, PDFLoader, CSVLoader,          │
│              MarkdownLoader                             │
│                                                         │
│  Chunkers:   FixedChunker, SentenceChunker,             │
│              RecursiveChunker                           │
│                                                         │
│  Embedders:  DefaultEmbedder, OpenAIEmbedder,           │
│              SentenceTransformerEmbedder                │
│                                                         │
│  Stores:     MemoryStore, ChromaStore                   │
│                                                         │
│  Generators: DefaultGenerator, OpenAIGenerator          │
└─────────────────────────────────────────────────────────┘
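The two flows in the diagram can be traced in a few lines of plain Python. This is a toy model for intuition only: `ToyPipeline` and its helper functions are hypothetical, the "embedding" is just a set of lowercase words, and none of it reflects how `RAGPipeline` is implemented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyPipeline:
    """Hypothetical miniature of the INGEST and QUERY flows; not the real RAGPipeline."""

    chunker: Callable[[str], list[str]]
    embedder: Callable[[str], set[str]]
    generator: Callable[[str, list[str]], str]
    store: list[tuple[set[str], str]] = field(default_factory=list)

    def ingest_texts(self, texts: list[str]) -> int:
        # INGEST: (already loaded texts) -> chunker -> embedder -> store
        for text in texts:
            for chunk in self.chunker(text):
                self.store.append((self.embedder(chunk), chunk))
        return len(self.store)

    def query(self, question: str, top_k: int = 1) -> str:
        # QUERY: embedder -> store (search) -> generator
        query_words = self.embedder(question)
        ranked = sorted(self.store, key=lambda item: len(query_words & item[0]), reverse=True)
        return self.generator(question, [chunk for _, chunk in ranked[:top_k]])


def split_sentences(text: str) -> list[str]:
    """Toy chunker: one chunk per period-terminated sentence."""
    return [part.strip() + "." for part in text.split(".") if part.strip()]


def word_set(text: str) -> set[str]:
    """Toy embedder: the set of lowercase words, punctuation removed."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def echo_generator(question: str, chunks: list[str]) -> str:
    """Toy generator: echoes the question together with the retrieved context."""
    return f"Q: {question} | context: {' '.join(chunks)}"


pipeline = ToyPipeline(chunker=split_sentences, embedder=word_set, generator=echo_generator)
count = pipeline.ingest_texts(["Python has type hints. RAG combines retrieval with generation."])
answer = pipeline.query("What is RAG?")
print(count, answer)
```

Because every stage is passed in as a plain callable, swapping one stage (for example a different chunker) leaves the other stages untouched, which is the idea behind the modular design described earlier.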

CLI

rag-bridge-kit info
rag-bridge-kit ingest ./docs --glob "*.txt"
rag-bridge-kit query ./docs -q "What is RAG?" --top-k 3

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| `RAGKIT_CHUNK_SIZE` | `512` | Default chunk size |
| `RAGKIT_CHUNK_OVERLAP` | `64` | Default chunk overlap |
| `RAGKIT_TOP_K` | `5` | Default number of results |
| `RAGKIT_SIMILARITY_THRESHOLD` | `0.0` | Minimum similarity score |
| `RAGKIT_EMBEDDING_BATCH_SIZE` | `64` | Batch size for embeddings |
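As an illustration of how settings like these are typically resolved, the sketch below reads each variable and falls back to the documented default. How the library itself parses them is not described in the source; the `Settings` class and `load_settings` function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values in the table above."""

    chunk_size: int
    chunk_overlap: int
    top_k: int
    similarity_threshold: float
    embedding_batch_size: int


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read the RAGKIT_* variables, falling back to the documented defaults."""
    return Settings(
        chunk_size=int(environ.get("RAGKIT_CHUNK_SIZE", "512")),
        chunk_overlap=int(environ.get("RAGKIT_CHUNK_OVERLAP", "64")),
        top_k=int(environ.get("RAGKIT_TOP_K", "5")),
        similarity_threshold=float(environ.get("RAGKIT_SIMILARITY_THRESHOLD", "0.0")),
        embedding_batch_size=int(environ.get("RAGKIT_EMBEDDING_BATCH_SIZE", "64")),
    )


# Pass an explicit mapping so the example does not depend on the real environment.
defaults = load_settings({})
overridden = load_settings({"RAGKIT_TOP_K": "3", "RAGKIT_SIMILARITY_THRESHOLD": "0.25"})
print(defaults)
print(overridden.top_k, overridden.similarity_threshold)
```

Environment variables are always strings, so each value is converted explicitly; a malformed value raises `ValueError` instead of being silently ignored.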

Run Tests

pip install -e ".[dev]"
python -m pytest

Publish to PyPI

python -m build
twine upload dist/*

License

MIT

Release history

This version

0.1.0

Jun 11, 2026

Download files

Source Distribution

rag_bridge_kit-0.1.0.tar.gz (24.8 kB)

Uploaded Jun 11, 2026 Source

Built Distribution

rag_bridge_kit-0.1.0-py3-none-any.whl (33.7 kB)

Uploaded Jun 11, 2026 Python 3

File details

Details for the file rag_bridge_kit-0.1.0.tar.gz.

File metadata

Download URL: rag_bridge_kit-0.1.0.tar.gz
Upload date: Jun 11, 2026
Size: 24.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rag_bridge_kit-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `a88d43752991cf9d4ec0eb50206b319824b767c9c7fa97d9fdf942a22e59b8e2` |
| MD5 | `78fb03dccdd5e17c6b97e8707627ef72` |
| BLAKE2b-256 | `39f021c0d15975feaf4f77f9f77470f20a5b8fd4d051484278a1e810d432afa6` |
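A downloaded file can be checked against the SHA256 digest listed above with the standard `hashlib` module. The sketch below is one way to do that; `sha256_of_stream` and `digest_matches` are hypothetical helper names, and the commented-out part assumes the sdist was saved to the current directory.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, block_size: int = 65536) -> str:
    """Hash a binary stream block by block so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


def digest_matches(stream: BinaryIO, expected_hex: str) -> bool:
    """Compare the stream's SHA256 with an expected hex digest, ignoring letter case."""
    return hmac.compare_digest(sha256_of_stream(stream), expected_hex.strip().lower())


# Self-contained demonstration on in-memory bytes.
demo_digest = sha256_of_stream(io.BytesIO(b"hello"))
print(demo_digest)

# For the real sdist (assuming it was downloaded to the current directory):
# with open("rag_bridge_kit-0.1.0.tar.gz", "rb") as handle:
#     ok = digest_matches(
#         handle, "a88d43752991cf9d4ec0eb50206b319824b767c9c7fa97d9fdf942a22e59b8e2"
#     )
```

The same helper works for the wheel by passing its own SHA256 digest instead.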

Provenance

The following attestation bundles were made for rag_bridge_kit-0.1.0.tar.gz:

Publisher: publish.yml on sohammmmm10/rag-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_bridge_kit-0.1.0.tar.gz
- Subject digest: a88d43752991cf9d4ec0eb50206b319824b767c9c7fa97d9fdf942a22e59b8e2
- Sigstore transparency entry: 1789249306
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: sohammmmm10/rag-kit@138f52ce69f931533fb2d22a89eaae4078a961be
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sohammmmm10
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@138f52ce69f931533fb2d22a89eaae4078a961be
- Trigger Event: release

File details

Details for the file rag_bridge_kit-0.1.0-py3-none-any.whl.

File metadata

Download URL: rag_bridge_kit-0.1.0-py3-none-any.whl
Upload date: Jun 11, 2026
Size: 33.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rag_bridge_kit-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `e258aa14f4f10cdd08f9ac37b976f22c5d301a3745d20a8b0b873b7855684e36` |
| MD5 | `a97541c1937e8a33bbe06a304defe715` |
| BLAKE2b-256 | `042d6bdc4a421f683e9c17631c6e5350aae9bfc1635278b9944dcd5d9b6bd89a` |

Provenance

The following attestation bundles were made for rag_bridge_kit-0.1.0-py3-none-any.whl:

Publisher: publish.yml on sohammmmm10/rag-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rag_bridge_kit-0.1.0-py3-none-any.whl
- Subject digest: e258aa14f4f10cdd08f9ac37b976f22c5d301a3745d20a8b0b873b7855684e36
- Sigstore transparency entry: 1789249335
- Sigstore integration time: Jun 11, 2026
Source repository:
- Permalink: sohammmmm10/rag-kit@138f52ce69f931533fb2d22a89eaae4078a961be
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sohammmmm10
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@138f52ce69f931533fb2d22a89eaae4078a961be
- Trigger Event: release
