RagBucket

Portable Executable RAG Artifacts for Python

Build once. Load anywhere. Query forever.

╔══════════════════════════════════════════════════════════════╗
║   RAG systems were never meant to be locked to infrastructure ║
║   RagBucket sets them free.                                   ║
╚══════════════════════════════════════════════════════════════╝

◈ The Problem

Every major ML format is portable by default:

model.pt   ·   model.onnx   ·   model.gguf   ·   model.h5

You save them, share them, deploy them anywhere.

RAG systems can't do any of that.

A typical RAG pipeline is a fragile web of moving parts:

❌  Vector databases tied to specific infrastructure
❌  Embedding pipelines that must be rebuilt from scratch
❌  Chunking configs scattered across codebases
❌  Provider-specific integrations with zero portability
❌  Metadata that lives nowhere and everywhere at once

Every time you switch environments — laptop to server, dev to prod, team to team — you rebuild the whole thing. That's broken.

RagBucket fixes this.

It packages your entire RAG pipeline — vectors, chunks, config, and runtime metadata — into a single portable .rag artifact. Like a model checkpoint, but for retrieval intelligence.

◈ Introducing `.rag`

A .rag artifact is a self-contained, executable unit of retrieval intelligence. It is not a config file. It is not a directory. It is a complete, ready-to-run retrieval system.

What it stores	How it stores it
Semantic embeddings	via Sentence Transformers
Vector index	via FAISS
Chunked knowledge	via LangChain splitters
Retrieval configuration	embedded in manifest
Runtime metadata	versioned artifact manifest

Build it once.
Drop it anywhere.
Query it with one line of code.

◈ Installation

# Using uv (recommended)
uv add ragbucket

# Using pip
pip install ragbucket

Lightweight by default. Local embedding dependencies are pulled in only when you set embedding_provider="local". Cloud embedding providers add no extra dependencies to the base install.

◈ Quickstart

Step 1 — Build a Portable `.rag` Artifact

from ragbucket import RagBuilder, RagConfig
import os
from dotenv import load_dotenv

load_dotenv()

config = RagConfig(

    # ── Embedding Provider ────────────────────────────────────
    embedding_provider = "cohere",
    embedding_model    = "embed-english-v3.0",
    embedding_api_key  = os.getenv("COHERE_API_KEY"),

    # ── Chunking ──────────────────────────────────────────────
    chunk_size    = 512,
    chunk_overlap = 50,

    # ── Retrieval ─────────────────────────────────────────────
    top_k = 3,
)

builder = RagBuilder(config=config)

builder.build(
    doc_path = "docs/",
    op_path  = "artifacts/demo.rag",
)

This generates a single portable artifact:

artifacts/
└── demo.rag          ← your entire RAG pipeline, packaged

Containing:

demo.rag
├── vectors.faiss     ← semantic vector index
├── chunks.json       ← chunked document memory
└── manifest.json     ← embedding config + metadata

Build once. Query anywhere.
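To make the chunk_size and chunk_overlap settings concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplified character-window stand-in, not RagBucket's implementation (the stack table lists LangChain text splitters, which also take separators into account), and the function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters, where
    consecutive windows share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Illustrative input: 1200 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(sample, chunk_size=512, chunk_overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of storing more vectors for the same documents.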

Step 2 — Load and Query the Artifact

from ragbucket import RagRuntime
import os
from dotenv import load_dotenv

load_dotenv()

rag = RagRuntime(

    # ── RAG Artifact ──────────────────────────────────────────
    rag_path = "artifacts/demo.rag",

    # ── Generation Provider ───────────────────────────────────
    provider = "groq",
    api_key  = os.getenv("GROQ_API_KEY"),
    model    = "llama-3.1-8b-instant",

    # ── Embedding Provider Key ────────────────────────────────
    embedding_api_key = os.getenv("COHERE_API_KEY"),

    # ── System Prompt ─────────────────────────────────────────
    system_prompt = "You are a helpful assistant. Keep answers short and crisp.",
)

response = rag.ask("What are Anik's AI/ML skills?")
print(response)

That's it. No vector DB to spin up. No pipeline to reconstruct. Just load and ask.
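Conceptually, a query like the one above embeds the question, finds the top_k most similar chunks, and hands them to the generation model as context. The sketch below illustrates that retrieval step with toy three-dimensional vectors and a brute-force cosine search standing in for the FAISS index. It is not RagBucket's code; top_k_chunks, the vectors, and the prompt layout are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks, top_k=3):
    """Return the top_k chunks whose vectors are most similar to query_vec."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:top_k]]


# Toy data: in a real artifact these would come from chunks.json and vectors.faiss.
chunks = ["FAISS stores vectors", "Pydantic validates config", "zipfile packs the artifact"]
chunk_vecs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.1, 0.0, 1.0]]
query_vec = [0.9, 0.1, 0.2]  # pretend embedding of the question

context = top_k_chunks(query_vec, chunk_vecs, chunks, top_k=2)
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: where are vectors stored?"
print(prompt)
```

The query has to be embedded with the same model that produced the stored vectors, which is presumably why the runtime still takes an embedding_api_key.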

◈ Multi-Provider Runtime

RagBucket cleanly separates retrieval from generation — meaning you can mix and match embedding providers with generation providers freely.

Generation Providers

Provider	Example Model
`groq`	`llama-3.1-8b-instant`
`openai`	`gpt-4o-mini`
`gemini`	`gemini-1.5-flash`
`anthropic`	`claude-3-haiku-20240307`

# Swap providers without touching anything else
import os
from ragbucket import RagRuntime

rag = RagRuntime(
    rag_path  = "artifacts/demo.rag",
    provider  = "anthropic",
    api_key   = os.getenv("ANTHROPIC_API_KEY"),
    model     = "claude-3-haiku-20240307",
    embedding_api_key = os.getenv("COHERE_API_KEY"),
)
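If you switch generation providers often, one option is to keep the provider-specific values in a small lookup table and build the keyword arguments from it. The sketch below is only an illustration: runtime_kwargs and GENERATION_PROVIDERS are hypothetical helpers, the models are the examples from the table above, and the GEMINI_API_KEY variable name is an assumption. It runs without RagBucket installed because it only builds a dictionary.

```python
import os

# Example model and API-key environment variable per provider.
# GEMINI_API_KEY is an assumed name; the others appear elsewhere in this README.
GENERATION_PROVIDERS = {
    "groq": ("llama-3.1-8b-instant", "GROQ_API_KEY"),
    "openai": ("gpt-4o-mini", "OPENAI_API_KEY"),
    "gemini": ("gemini-1.5-flash", "GEMINI_API_KEY"),
    "anthropic": ("claude-3-haiku-20240307", "ANTHROPIC_API_KEY"),
}


def runtime_kwargs(provider: str, rag_path: str = "artifacts/demo.rag") -> dict:
    """Build the keyword arguments shown in the RagRuntime examples."""
    if provider not in GENERATION_PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    model, key_var = GENERATION_PROVIDERS[provider]
    return {
        "rag_path": rag_path,
        "provider": provider,
        "api_key": os.getenv(key_var),
        "model": model,
        # The artifact in the Quickstart was embedded with Cohere.
        "embedding_api_key": os.getenv("COHERE_API_KEY"),
    }


kwargs = runtime_kwargs("anthropic")
# Print everything except the secrets.
print({k: v for k, v in kwargs.items() if not k.endswith("api_key")})
# With ragbucket installed, this would be passed as: RagRuntime(**kwargs)
```

Only the provider, model, and API key change between calls; the artifact path and embedding key stay the same.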

Embedding Providers

Provider	Example Model
`local`	`BAAI/bge-small-en-v1.5`
`cohere`	`embed-english-v3.0`
`openai`	`text-embedding-3-small`
`gemini`	`models/embedding-001`
`voyage`	`voyage-large-2`

# Use any embedding provider at build time
import os
from ragbucket import RagConfig

config = RagConfig(
    embedding_provider = "openai",
    embedding_model    = "text-embedding-3-small",
    embedding_api_key  = os.getenv("OPENAI_API_KEY"),
)

◈ Dynamic Retrieval Configuration

Every stage of the retrieval pipeline is configurable. Sane defaults are always applied automatically.

from ragbucket import RagConfig

config = RagConfig(

    # Embedding system
    embedding_provider = "local",
    embedding_model    = "sentence-transformers/all-MiniLM-L6-v2",

    # Chunking
    chunk_size    = 1024,
    chunk_overlap = 100,

    # Retrieval
    top_k = 5,
)

All missing values are filled using framework defaults. Nothing breaks if you leave something out.
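The default-filling behaviour can be pictured with a small stand-in. The sketch below uses a standard-library dataclass rather than the Pydantic model the stack table mentions, and SketchConfig is a hypothetical name. The default values are borrowed from the examples in this README for illustration and are not RagBucket's documented defaults.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SketchConfig:
    # Illustrative defaults only; the framework's real defaults may differ.
    embedding_provider: str = "local"
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    chunk_size: int = 512
    chunk_overlap: int = 50
    top_k: int = 3

    def __post_init__(self):
        # Basic sanity checks a config validator would plausibly perform.
        if self.chunk_overlap >= self.chunk_size:
            raise ValueError("chunk_overlap must be smaller than chunk_size")
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")


# Only the chunking values are supplied; everything else falls back to defaults.
config = SketchConfig(chunk_size=1024, chunk_overlap=100)
print(asdict(config))
```

Fields that are passed explicitly override the defaults, and the rest are filled in without any extra code at the call site.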

◈ What a `.rag` Artifact Contains

demo.rag
│
├── vectors.faiss     ← FAISS vector index (semantic search backbone)
├── chunks.json       ← document chunks with source metadata
└── manifest.json     ← embedding config, top_k, model info, version

The artifact is entirely self-describing. Anyone who receives a .rag file has everything needed to query it apart from their own provider API keys: no external config, no infrastructure dependencies, no guesswork.
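Because the Technology Stack table below lists Python `zipfile` for packaging, an artifact can presumably be inspected with the standard library alone. The sketch below assumes a .rag file is an ordinary zip archive containing the three members shown above. It builds a placeholder archive so it runs without RagBucket, and describe_artifact and the manifest keys are hypothetical.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def describe_artifact(path):
    """Return (sorted member names, parsed manifest) for a zip-packaged artifact."""
    with zipfile.ZipFile(path) as archive:
        names = sorted(archive.namelist())
        manifest = json.loads(archive.read("manifest.json"))
    return names, manifest


# Build a stand-in archive with placeholder contents and made-up manifest keys.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.rag"
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("vectors.faiss", b"")
        archive.writestr("chunks.json", json.dumps(["chunk one", "chunk two"]))
        archive.writestr("manifest.json", json.dumps({"top_k": 3, "version": "0.0.0"}))
    names, manifest = describe_artifact(demo)

print(names)
print(manifest)
```

Against a real artifact, only the describe_artifact call is needed; check the actual manifest for the keys it contains.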

◈ Technology Stack

Component	Technology
Embeddings	Sentence Transformers
Vector Search	FAISS
Chunking	LangChain Text Splitters
Artifact Packaging	Python `zipfile`
Config Validation	Pydantic
Runtime	Pure Python

◈ Philosophy

RAG systems should be as portable as model files. Not as fragile as microservice stacks.

RagBucket treats RAG systems as portable intelligence artifacts — not fragile infrastructure pipelines. This cleanly separates two concerns that have no business being coupled:

Retrieval memory    →  what you built     →  lives in the .rag file
Language generation →  how you query it   →  any provider, any environment

Your retrieval knowledge travels with your code. Swap generation providers without rebuilding anything. Share a .rag file like you'd share a model checkpoint.

The result: reusable semantic memory that is fully decoupled from infrastructure.

◈ Links

Resource	URL
Website	ragbucket.vercel.app
PyPI	pypi.org/project/ragbucket
GitHub	github.com/anikchand461/ragbucket

◈ License

MIT License — see LICENSE for details.

◈ RagBucket

The portable runtime layer for Retrieval-Augmented Generation systems.

Built by Anik Chand · ragbucket.vercel.app

Release history

Version	Release date
0.2.6 (this version)	May 20, 2026
0.2.5	May 20, 2026
0.2.4	May 20, 2026
0.2.3	May 20, 2026
0.2.2	May 20, 2026
0.2.1	May 20, 2026
0.2.0	May 20, 2026
0.1.3	May 19, 2026
0.1.2	May 19, 2026
0.1.1	May 19, 2026
0.1.0	May 19, 2026

Download files

Source Distribution

ragbucket-0.2.6.tar.gz (3.7 MB)

Uploaded May 20, 2026 Source

Built Distribution

ragbucket-0.2.6-py3-none-any.whl (18.2 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file ragbucket-0.2.6.tar.gz.

File metadata

Download URL: ragbucket-0.2.6.tar.gz
Upload date: May 20, 2026
Size: 3.7 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for ragbucket-0.2.6.tar.gz
Algorithm	Hash digest
SHA256	`fd151443b4a2d23bd166d3364507ab4b66f4dd79c34862181184998e9095a608`
MD5	`ef01d740df7c96394dc77f0094e7bdc0`
BLAKE2b-256	`f74f02dc80d0f5c5fb338eefb2c6c9a8a6285ec0338e5a8cf9a49aea8bf0031f`

File details

Details for the file ragbucket-0.2.6-py3-none-any.whl.

File metadata

Download URL: ragbucket-0.2.6-py3-none-any.whl
Upload date: May 20, 2026
Size: 18.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for ragbucket-0.2.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e0daddebc94ec7c0876f4821f92baa2c83581bcb3887b84bffc5ee03afb38172`
MD5	`1be901df7b523b71fe042cf832502a9e`
BLAKE2b-256	`4df131530aa41c7bac23ff78e40dc3b1def58da853881d63e280a9013a5a3cc1`
