RagBucket

Portable Executable RAG Artifacts for Python

Build Retrieval-Augmented Generation systems as reusable, shareable, and executable .rag artifacts.


What is RagBucket?

Traditional machine learning models are portable.

model.pt
model.onnx
model.gguf
model.h5

They can be:

  • saved
  • reused
  • shared
  • deployed anywhere

But modern Retrieval-Augmented Generation (RAG) systems are still fragmented.

A typical RAG pipeline depends on:

  • vector databases
  • embedding pipelines
  • chunking systems
  • retrievers
  • metadata stores
  • external infrastructure
  • provider-specific integrations

This makes RAG systems:

  • difficult to distribute
  • tightly coupled to infrastructure
  • hard to reproduce
  • non-portable

Introducing .rag

RagBucket introduces:

.rag

A portable executable artifact format for Retrieval-Augmented Generation systems.

A .rag artifact packages:

  • semantic embeddings
  • vector indexes
  • chunked knowledge
  • retrieval configuration
  • runtime metadata

into a single reusable file.

Build once. Load anywhere.


Core Idea

Documents
    ↓
RagBuilder
    ↓
model.rag
    ↓
RagRuntime
    ↓
Question Answering

The builder converts raw documents into a portable retrieval artifact.

The runtime loads the artifact and performs:

  • semantic retrieval
  • contextual augmentation
  • provider-based generation

using external LLM providers like:

  • Groq
  • OpenAI
  • Gemini
  • Anthropic

Installation

Using uv

uv add ragbucket

Using pip

pip install ragbucket

Quickstart

Step 1 — Build a .rag Artifact

from ragbucket import RagBuilder, RagConfig

# Retrieval settings used to build the artifact
config = RagConfig(
    embedding_model="BAAI/bge-small-en-v1.5",
    chunk_size=512,
    chunk_overlap=50,
    top_k=3,
)

builder = RagBuilder(config=config)

# Read documents from doc_path and write the artifact to op_path
builder.build(
    doc_path="docs",
    op_path="artifacts/demo.rag",
)

This generates:

artifacts/demo.rag
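
To make chunk_size and chunk_overlap concrete, here is a minimal sketch of overlapping chunking. It is an illustration only: it counts characters and uses a plain sliding window, whereas RagBucket delegates chunking to LangChain and may measure and split text differently. The function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Simplified illustration; not RagBucket's or LangChain's splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Each chunk repeats the last character of the previous one (overlap of 1).
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of storing more near-duplicate text in the artifact.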

Step 2 — Load and Use the Artifact

import os

from dotenv import load_dotenv
from ragbucket import RagRuntime

# Load GROQ_API_KEY from a local .env file, if one exists
load_dotenv()

system_prompt = """
You are Anik's personal chatbot.

Keep answers short and crisp.
"""

rag = RagRuntime(
    rag_path="artifacts/demo.rag",
    provider="groq",
    api_key=os.getenv("GROQ_API_KEY"),
    model="llama-3.1-8b-instant",
    system_prompt=system_prompt,
)

response = rag.ask("What are Anik's AIML skills?")
print(response)

Multi-Provider Runtime

RagBucket supports multiple LLM providers through a unified runtime abstraction.


Groq

provider="groq"
model="llama-3.1-8b-instant"

OpenAI

provider="openai"
model="gpt-4o-mini"

Gemini

provider="gemini"
model="gemini-1.5-flash"

Anthropic

provider="anthropic"
model="claude-3-haiku-20240307"
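
One way to switch providers without editing code is to keep the provider/model pairs above in a small lookup table. The sketch below is not part of RagBucket. The helper runtime_kwargs is hypothetical, and every environment variable name except GROQ_API_KEY is an assumed convention rather than something RagBucket reads on its own.

```python
import os

# Provider/model pairs are taken from the list above.
# Environment variable names other than GROQ_API_KEY are assumptions.
PROVIDERS = {
    "groq": {"model": "llama-3.1-8b-instant", "env": "GROQ_API_KEY"},
    "openai": {"model": "gpt-4o-mini", "env": "OPENAI_API_KEY"},
    "gemini": {"model": "gemini-1.5-flash", "env": "GEMINI_API_KEY"},
    "anthropic": {"model": "claude-3-haiku-20240307", "env": "ANTHROPIC_API_KEY"},
}


def runtime_kwargs(provider: str, rag_path: str, environ=None) -> dict:
    """Build keyword arguments shaped like the Quickstart's RagRuntime call."""
    environ = os.environ if environ is None else environ
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider!r}")

    entry = PROVIDERS[provider]
    api_key = environ.get(entry["env"])
    if not api_key:
        # Fail early instead of passing api_key=None to the runtime.
        raise RuntimeError(f"Set {entry['env']} before using provider {provider!r}")

    return {
        "rag_path": rag_path,
        "provider": provider,
        "api_key": api_key,
        "model": entry["model"],
    }
```

Assuming RagRuntime accepts the same parameters as in the Quickstart, the result could be passed as RagRuntime(**runtime_kwargs("openai", "artifacts/demo.rag")).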

Runtime Pipeline

User Query
    ↓
Query Embedding
    ↓
Semantic Vector Search
    ↓
Relevant Context Retrieval
    ↓
LLM Provider
    ↓
Generated Response
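
The stages above can be sketched in plain Python. This is a toy illustration of the data flow, not RagBucket's implementation. Word counts stand in for a Sentence Transformers embedding, a sorted list stands in for the FAISS index, and the LLM call is left out. All function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Query embedding followed by a brute-force similarity search."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Contextual augmentation: prepend retrieved chunks to the question."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "faiss builds vector indexes",
    "the cat sat on the mat",
    "groq serves llama models",
]
context = retrieve("which vector indexes", chunks, top_k=1)
print(build_prompt("which vector indexes", context))
```

The prompt returned by build_prompt is what would be sent to the chosen LLM provider in the final stage.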

Dynamic Configuration System

RagBucket supports configurable retrieval pipelines using RagConfig.

You can customize:

  • embedding model
  • chunk size
  • chunk overlap
  • retrieval top-k

Example:

from ragbucket import RagConfig

config = RagConfig(
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    chunk_size=1024,
    chunk_overlap=100,
    top_k=5,
)

Any missing configuration values are automatically filled using framework defaults.
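
The default-filling behaviour can be pictured with a dataclass. This is a sketch of the idea, not RagConfig itself. SketchConfig is a hypothetical class, and its default values are placeholders borrowed from the Quickstart rather than RagBucket's documented defaults.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SketchConfig:
    # Placeholder defaults copied from the Quickstart example;
    # RagBucket's real defaults may differ.
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    chunk_size: int = 512
    chunk_overlap: int = 50
    top_k: int = 3

    def __post_init__(self):
        # Reject settings that would make overlapping chunking impossible.
        if self.chunk_overlap >= self.chunk_size:
            raise ValueError("chunk_overlap must be smaller than chunk_size")


# Only chunk_size is supplied; the other fields fall back to their defaults.
partial = SketchConfig(chunk_size=1024)
print(asdict(partial))
```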


Supported Embedding Models

RagBucket works with any compatible Sentence Transformers model.

Examples:

"BAAI/bge-small-en-v1.5"
"sentence-transformers/all-MiniLM-L6-v2"
"sentence-transformers/all-mpnet-base-v2"
"BAAI/bge-base-en-v1.5"

What a .rag File Contains

A .rag artifact stores:

  • semantic embeddings
  • FAISS vector index
  • chunked document memory
  • retrieval metadata
  • runtime configuration
  • artifact manifest

The only requirement during inference is:

  • an LLM provider API key
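
Because the Technology Stack lists zipfile as the packaging layer, a .rag file can presumably be opened as an ordinary zip archive. The sketch below relies on that assumption. The helper list_artifact_members is hypothetical, and the names of the files inside an artifact are not documented here, so the function only lists whatever it finds.

```python
import zipfile


def list_artifact_members(rag_path: str) -> list[str]:
    """Return the member names stored inside a zip-based artifact.

    Assumes the .rag file is a standard zip archive.
    """
    if not zipfile.is_zipfile(rag_path):
        raise ValueError(f"{rag_path} is not a zip archive")
    with zipfile.ZipFile(rag_path) as archive:
        return archive.namelist()
```

Calling list_artifact_members("artifacts/demo.rag") after the Quickstart build step would show how the embeddings, index, chunks, and manifest are laid out on disk.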

Features

Portable RAG Artifacts

Serialize retrieval systems into reusable .rag files.


Built-in Semantic Search

Uses FAISS for efficient vector similarity retrieval.


Multi-Provider Runtime

Unified runtime interface for:

  • Groq
  • OpenAI
  • Gemini
  • Anthropic

Configurable Retrieval Pipeline

Customize chunking and embedding behavior using RagConfig.


Lightweight Runtime

Load and execute .rag artifacts anywhere using Python.


Self-Contained Retrieval Memory

The artifact itself contains the retrieval system.


Simple Developer API

Minimal abstractions for building and querying portable RAG systems.


Extensible Architecture

Designed for future support of:

  • reranking
  • metadata filtering
  • hybrid retrieval
  • distributed vector stores
  • remote artifact registries

Technology Stack

| Component       | Technology            |
| --------------- | --------------------- |
| Embeddings      | Sentence Transformers |
| Vector Search   | FAISS                 |
| Chunking        | LangChain             |
| Runtime         | Python                |
| Packaging       | zipfile               |
| Artifact Format | .rag                  |

Philosophy

RagBucket treats RAG systems as:

portable intelligence artifacts

instead of:

fragmented retrieval pipelines

This separates retrieval memory from language generation, allowing:

  • reusable semantic memory
  • infrastructure-independent retrieval
  • portable execution
  • simplified deployment

Current Scope

RagBucket currently supports:

  • local .rag artifact generation
  • semantic retrieval
  • configurable chunking
  • multi-provider inference
  • FAISS vector indexing
  • provider-based generation

The project is intentionally lightweight and focused on:

portable RAG execution


Future Roadmap

Planned features:

  • hybrid retrieval
  • metadata-aware search
  • reranking support
  • artifact versioning
  • remote artifact loading
  • distributed vector stores
  • multi-vector retrieval
  • .rag registries

Vision

RagBucket aims to become:

"The portable runtime layer for Retrieval-Augmented Generation systems."

The goal is a future where RAG systems can be:

  • built once
  • shared anywhere
  • executed everywhere

through standardized portable intelligence artifacts.


License

MIT License
