
ragpackai 📦

Portable Retrieval-Augmented Generation Library

ragpackai is a Python library for creating, saving, loading, and querying portable RAG (Retrieval-Augmented Generation) packs. It allows you to bundle documents, embeddings, vectorstores, and configuration into a single .rag file that can be easily shared and deployed across different environments.

✨ Features

  • 🚀 Portable RAG Packs: Bundle everything into a single .rag file
  • 🔄 Provider Flexibility: Support for OpenAI, Google, Groq, Cerebras, and HuggingFace
  • 🔒 Encryption Support: Optional AES-GCM encryption for sensitive data
  • 🎯 Runtime Overrides: Change embedding/LLM providers without rebuilding
  • 📚 Multiple Formats: Support for PDF, TXT, MD, and more
  • 🛠️ CLI Tools: Command-line interface for easy pack management
  • 🔧 Lazy Loading: Efficient dependency management with lazy imports
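The lazy-loading bullet means heavy provider packages are only imported the first time they are actually used. The exact mechanism is internal to ragpackai; a minimal sketch of the general pattern using the standard `importlib` module (the names here are illustrative, not ragpackai's API):

```python
import importlib

def lazy_module(name):
    """Return a zero-argument loader that imports `name` on first call
    and returns the cached module on every later call."""
    cache = {}
    def load():
        if "module" not in cache:
            cache["module"] = importlib.import_module(name)
        return cache["module"]
    return load

# The dependency is not imported until the loader is first invoked.
get_json = lazy_module("json")
print(get_json().dumps({"lazy": True}))  # → {"lazy": true}
```

This keeps `pip install ragpackai` lightweight: installing an extra such as `[groq]` only matters once the corresponding provider is actually requested.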

🚀 Quick Start

Installation

# Core installation
pip install ragpackai

# With optional providers
pip install ragpackai[google]     # Google Vertex AI
pip install ragpackai[groq]       # Groq
pip install ragpackai[cerebras]   # Cerebras
pip install ragpackai[all]        # All providers

Basic Usage

from ragpackai import ragpackai

# Create a pack from documents
pack = ragpackai.from_files([
    "docs/manual.pdf", 
    "notes.txt",
    "knowledge_base/"
])

# Save the pack
pack.save("my_knowledge.rag")

# Load and query
pack = ragpackai.load("my_knowledge.rag")

# Simple retrieval (no LLM)
results = pack.query("How do I install this?", top_k=3)
print(results)

# Question answering with LLM
answer = pack.ask("What are the main features?")
print(answer)

Provider Overrides

# Load with different providers
pack = ragpackai.load(
    "my_knowledge.rag",
    embedding_config={
        "provider": "google", 
        "model_name": "textembedding-gecko"
    },
    llm_config={
        "provider": "groq", 
        "model_name": "mixtral-8x7b-32768"
    }
)

answer = pack.ask("Explain the architecture")

๐Ÿ› ๏ธ Command Line Interface

Create a RAG Pack

# From files and directories
ragpackai create docs/ notes.txt --output knowledge.rag

# With custom settings
ragpackai create docs/ \
  --embedding-provider openai \
  --embedding-model text-embedding-3-large \
  --chunk-size 1024 \
  --encrypt-key mypassword

Query and Ask

# Simple retrieval
ragpackai query knowledge.rag "How to install?"

# Question answering
ragpackai ask knowledge.rag "What are the requirements?" \
  --llm-provider openai \
  --llm-model gpt-4o

# With provider overrides
ragpackai ask knowledge.rag "Explain the API" \
  --embedding-provider google \
  --embedding-model textembedding-gecko \
  --llm-provider groq \
  --llm-model mixtral-8x7b-32768

Pack Information

ragpackai info knowledge.rag

๐Ÿ—๏ธ Architecture

.rag File Structure

A .rag file is a structured zip archive:

mypack.rag
├── metadata.json          # Pack metadata
├── config.json            # Default configurations
├── documents/             # Original documents
│   ├── doc1.txt
│   └── doc2.pdf
└── vectorstore/           # Chroma vectorstore
    ├── chroma.sqlite3
    └── ...
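Because an unencrypted .rag pack is an ordinary zip archive, it can be inspected with Python's standard `zipfile` module. A sketch that builds a toy archive with this layout in memory and lists its members (this is plain stdlib usage, not part of ragpackai's API):

```python
import io
import json
import zipfile

buf = io.BytesIO()

# Build a toy pack mirroring the layout shown above (in memory).
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("metadata.json", json.dumps({"name": "mypack"}))
    zf.writestr("config.json", json.dumps({"chunk_size": 512}))
    zf.writestr("documents/doc1.txt", "hello")

# Any zip tool can list the members.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()

print(names)  # → ['metadata.json', 'config.json', 'documents/doc1.txt']
```

The same applies in reverse: `unzip -l mypack.rag` on the command line shows the pack's contents without loading it.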

Supported Providers

Embedding Providers:

  • openai: text-embedding-3-small, text-embedding-3-large
  • huggingface: all-MiniLM-L6-v2, all-mpnet-base-v2 (offline)
  • google: textembedding-gecko

LLM Providers:

  • openai: gpt-4o, gpt-4o-mini, gpt-3.5-turbo
  • google: gemini-pro, gemini-1.5-flash
  • groq: mixtral-8x7b-32768, llama2-70b-4096
  • cerebras: llama3.1-8b, llama3.1-70b

📖 API Reference

ragpackai Class

ragpackai.from_files(files, embed_model="openai:text-embedding-3-small", **kwargs)

Create a RAG pack from files.

Parameters:

  • files: List of file paths or directories
  • embed_model: Embedding model in format "provider:model"
  • chunk_size: Text chunk size (default: 512)
  • chunk_overlap: Chunk overlap (default: 50)
  • name: Pack name
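The chunk_size and chunk_overlap parameters control how documents are split before embedding: each chunk repeats the tail of the previous one so that context is not lost at chunk boundaries. A simplified character-based illustration of the idea (ragpackai's actual splitter may differ, e.g. by splitting on token or sentence boundaries):

```python
def chunk_text(text, chunk_size=512, chunk_overlap=50):
    """Split `text` into chunks of at most `chunk_size` characters,
    each starting `chunk_size - chunk_overlap` after the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Larger chunks give the LLM more context per retrieved passage; smaller chunks make retrieval more precise. The defaults (512/50) are a middle ground.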

ragpackai.load(path, embedding_config=None, llm_config=None, **kwargs)

Load a RAG pack from a .rag file.

Parameters:

  • path: Path to .rag file
  • embedding_config: Override embedding configuration
  • llm_config: Override LLM configuration
  • reindex_on_mismatch: Rebuild vectorstore if dimensions mismatch
  • decrypt_key: Decryption password

pack.save(path, encrypt_key=None)

Save pack to .rag file.

pack.query(question, top_k=3)

Retrieve relevant chunks (no LLM).

pack.ask(question, top_k=4, temperature=0.0)

Ask question with LLM.

Provider Wrappers

# Direct provider access
from ragpackai.embeddings import OpenAI, HuggingFace, Google
from ragpackai.llms import OpenAIChat, GoogleChat, GroqChat

# Create embedding provider
embeddings = OpenAI(model_name="text-embedding-3-large")
vectors = embeddings.embed_documents(["Hello world"])

# Create LLM provider
llm = OpenAIChat(model_name="gpt-4o", temperature=0.7)
response = llm.invoke("What is AI?")

🔧 Configuration

Environment Variables

# API Keys
export OPENAI_API_KEY="your-key"
export GOOGLE_CLOUD_PROJECT="your-project"
export GROQ_API_KEY="your-key"
export CEREBRAS_API_KEY="your-key"

# Optional
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
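In application code, the same keys are typically read from the environment rather than hard-coded. A small sketch using `os.environ` (the variable names match the ones above; the helper and its error message are illustrative, not ragpackai's API):

```python
import os

def require_env(name):
    """Fetch an API key from the environment, failing loudly if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return value

os.environ["OPENAI_API_KEY"] = "your-key"   # normally set in your shell
print(require_env("OPENAI_API_KEY"))        # → your-key
```

Failing early with a clear message beats the opaque authentication errors a provider SDK raises when a key is missing.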

Configuration Files

# Custom embedding config
embedding_config = {
    "provider": "huggingface",
    "model_name": "all-mpnet-base-v2",
    "device": "cuda"  # Use GPU
}

# Custom LLM config
llm_config = {
    "provider": "openai",
    "model_name": "gpt-4o",
    "temperature": 0.7,
    "max_tokens": 2000
}

🔒 Security

Encryption

ragpackai supports AES-GCM encryption for sensitive data:

# Save with encryption
pack.save("sensitive.rag", encrypt_key="strong-password")

# Load encrypted pack
pack = ragpackai.load("sensitive.rag", decrypt_key="strong-password")

Best Practices

  • Use strong passwords for encryption
  • Store API keys securely in environment variables
  • Validate .rag files before loading in production
  • Consider network security when sharing packs
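"Validate .rag files before loading" can start as simply as checking archive integrity and the expected top-level members before handing an untrusted file to ragpackai.load. A stdlib-only sketch based on the file layout shown earlier (the required-member list is an assumption drawn from that layout, not a guarantee of the format):

```python
import zipfile

# Assumed from the .rag layout shown in the Architecture section.
REQUIRED_MEMBERS = {"metadata.json", "config.json"}

def looks_like_rag_pack(path):
    """Cheap sanity check: a readable, uncorrupted zip whose members
    include the expected top-level files. Not full validation."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        if zf.testzip() is not None:  # name of first corrupt member, if any
            return False
        return REQUIRED_MEMBERS.issubset(zf.namelist())

# Build a minimal well-formed pack to exercise the check.
with zipfile.ZipFile("demo.rag", "w") as zf:
    zf.writestr("metadata.json", "{}")
    zf.writestr("config.json", "{}")

print(looks_like_rag_pack("demo.rag"))  # → True
```

For encrypted packs this check does not apply directly, since the archive contents are ciphertext; there, a failed decryption is itself the integrity check.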

🧪 Examples

See the examples/ directory for complete examples:

  • basic_usage.py - Simple pack creation and querying
  • provider_overrides.py - Using different providers
  • encryption_example.py - Working with encrypted packs
  • cli_examples.sh - Command-line usage examples

๐Ÿค Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

๐Ÿ™ Acknowledgments

Built with:

Download files

Download the file for your platform.

Source Distribution

ragpackai-0.1.2.tar.gz (41.3 kB)


Built Distribution


ragpackai-0.1.2-py3-none-any.whl (33.8 kB)


File details

Details for the file ragpackai-0.1.2.tar.gz.

File metadata

  • Download URL: ragpackai-0.1.2.tar.gz
  • Upload date:
  • Size: 41.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for ragpackai-0.1.2.tar.gz:

  Algorithm    Hash digest
  SHA256       0ebc6fe6af173fc5975095e1dec2173951fe3ccf2434178ab6f45de80f9d119b
  MD5          3bc9ee33b5c43184943d700b44486a3d
  BLAKE2b-256  2ec12f0b512cd493f569524467a435b6bc49b85d8a065fbd21faa538217096a2

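The digests above let you verify a downloaded file before installing it. A sketch using Python's standard `hashlib` module, shown on a small byte string so the expected digest is known; for a real check, read the downloaded archive in binary mode and compare against the published SHA256:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA256 digest, in the same form as the hash tables above."""
    return hashlib.sha256(data).hexdigest()

# e.g. sha256_hex(open("ragpackai-0.1.2.tar.gz", "rb").read())
# should equal the published SHA256 for that file.
print(sha256_hex(b"hello"))
# → 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

On the command line, `sha256sum ragpackai-0.1.2.tar.gz` performs the same comparison.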

File details

Details for the file ragpackai-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: ragpackai-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 33.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for ragpackai-0.1.2-py3-none-any.whl:

  Algorithm    Hash digest
  SHA256       9723a5162c85f99db6da49a354db2b1620f8bea7aa695a779312e9842fc9ff8e
  MD5          272eb35bc0fb4bdfec1f505557e27727
  BLAKE2b-256  4a4ff2d1ceb7ade354337aefc6da2ee1cb27647e716fc4ce7edb72973c80cff5

