A simple, clean Python library for Retrieval-Augmented Generation (RAG)

🧾 Ragify

Ragify is a simple, clean Python library that abstracts away the complexity of Retrieval-Augmented Generation (RAG). It lets developers chunk text, create and store embeddings, and retrieve the most relevant chunks with minimal setup.

🚀 Quick Start

Method 1: Initialize with Configuration Dictionary

from ragify import KaliRAG

# Initialize with your configuration
config = {
    "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
    "db_config": {
        "api_key": "your-quadrant-api-key",
        "host": "https://api.quadrant.io",
        "collection": "my_docs"
    }
}

rag = KaliRAG(**config)

# Create and store embeddings
text = "Your long document text here..."
result = rag.create_store_embedding(text)

# Retrieve relevant chunks
response = rag.retrieve_embedding("What is the main topic?", top_k=3)
print(response)
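
Hard-coded API keys are easy to leak into version control. As a minimal sketch, the same configuration dictionary could be built from environment variables. The helper `load_config` and the variable names `RAGIFY_API_KEY`, `RAGIFY_HOST`, `RAGIFY_COLLECTION`, and `RAGIFY_EMBEDDING_MODEL` are hypothetical and not part of Ragify; only the dictionary shape comes from the example above.

```python
import os


def load_config(env=None):
    """Build a config dict shaped like the Method 1 example.

    The environment variable names are illustrative assumptions,
    not settings that Ragify itself reads.
    """
    env = os.environ if env is None else env
    api_key = env.get("RAGIFY_API_KEY")
    if not api_key:
        raise ValueError("RAGIFY_API_KEY is not set")
    return {
        "embedding_model": env.get(
            "RAGIFY_EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        "db_config": {
            "api_key": api_key,
            "host": env.get("RAGIFY_HOST", "https://api.quadrant.io"),
            "collection": env.get("RAGIFY_COLLECTION", "my_docs"),
        },
    }
```

Because the result has the same keys as `config`, it could presumably be passed along in the same way, for example `KaliRAG(**load_config())`.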

Method 2: Configure Separately (Recommended)

from ragify import KaliRAG

# Initialize with defaults
rag = KaliRAG()

# Configure database with separate parameters
rag.configure_database(
    api_key="your-quadrant-api-key",
    host="https://api.quadrant.io",
    port=443,  # Optional
    collection="my_docs"
)

# Configure embedding model
rag.configure_embedding_model("sentence-transformers/all-MiniLM-L6-v2")

# Configure chunking parameters
rag.configure_chunking(chunk_size=512, chunk_overlap=50)

# Now use the configured RAG system
result = rag.create_store_embedding("Your text here...")

📦 Installation

pip install ragify-lib

Or install from source:

git clone https://github.com/ragify/ragify.git
cd ragify
pip install -e .

🎯 Features

✅ Core Features

Simple API: Two main methods, create_store_embedding() and retrieve_embedding()
Smart Chunking: Automatic text chunking with configurable size and overlap
Multiple Embedding Models: Support for HuggingFace SentenceTransformers
Vector Database Integration: Native support for Quadrant (with mock mode for testing)
Configurable: Easy customization of chunking, embedding, and retrieval parameters

🔧 Advanced Features

Recursive Chunking: Automatically handles very long documents
Similarity Thresholds: Filter results by similarity score
Comprehensive Logging: Built-in logging for debugging and monitoring
Error Handling: Robust error handling with detailed error messages
Mock Mode: Works without external dependencies for testing
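
To illustrate the idea behind recursive chunking, here is a minimal sketch, not Ragify's actual implementation. It assumes sizes are measured in characters, tries progressively finer separators, and falls back to fixed-width slices once the depth limit is reached. The function name `recursive_chunks` and the separator list are hypothetical. The sketch drops the separators it splits on and does not merge small pieces back together.

```python
def recursive_chunks(text, chunk_size=512, separators=("\n\n", "\n", ". "),
                     depth=0, max_depth=3):
    """Split text into pieces of at most chunk_size characters.

    Illustrative only: sizes are counted in characters, which is an
    assumption rather than documented Ragify behavior.
    """
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if depth >= max_depth or depth >= len(separators):
        # No finer separator left: fall back to fixed-width slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks = []
    for piece in text.split(separators[depth]):
        # Pieces that are still too long are split again one level deeper.
        chunks.extend(
            recursive_chunks(piece, chunk_size, separators, depth + 1, max_depth)
        )
    return chunks
```

Short paragraphs pass through untouched, while an oversized paragraph is broken down by line, then by sentence, and finally by raw character count.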

📚 Usage Examples

Basic Usage

from ragify import KaliRAG

# Initialize with defaults
rag = KaliRAG()

# Add your documents
text = """
RAG stands for Retrieval-Augmented Generation. It's a technique that combines 
large language models with external knowledge retrieval to provide more accurate 
and contextually relevant responses.
"""

# Create embeddings and store them
result = rag.create_store_embedding(text)
print(f"Created {result['chunks_created']} chunks")

# Query your knowledge base
response = rag.retrieve_embedding("What is RAG?", top_k=3)
for hit in response['results']:  # use a new name so `result` above is not overwritten
    print(f"Score: {hit['similarity_score']:.3f}")
    print(f"Text: {hit['text']}")
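
Retrieval is only half of RAG; the retrieved chunks still have to reach a language model. The sketch below shows one way to turn entries shaped like `response['results']` (with `text` and `similarity_score` keys, as in the example above) into a prompt string. The helper `build_prompt`, its `max_chars` budget, and the prompt wording are hypothetical and not part of Ragify.

```python
def build_prompt(question, results, max_chars=2000):
    """Combine retrieved chunks and a question into one prompt string.

    `results` is a list of dicts with "text" and "similarity_score" keys.
    """
    context_parts = []
    used = 0
    # Add the highest-scoring chunks first until the character budget is spent.
    for hit in sorted(results, key=lambda r: r["similarity_score"], reverse=True):
        text = hit["text"].strip()
        if used + len(text) > max_chars:
            break
        context_parts.append(f"- {text}")
        used += len(text)
    context = "\n".join(context_parts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

The returned string can then be sent to whichever language model the application uses.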

Advanced Configuration

# Custom configuration
config = {
    "embedding_model": "sentence-transformers/all-mpnet-base-v2",
    "chunk_size": 256,
    "chunk_overlap": 25,
    "db_config": {
        "api_key": "your-api-key",
        "collection": "custom_collection"
    }
}

rag = KaliRAG(**config)

# Use recursive chunking for very long documents
long_text = "..." * 1000  # Very long text
result = rag.create_store_embedding(
    long_text,
    use_recursive_chunking=True,
    max_recursion_depth=3
)

# Retrieve with custom parameters
response = rag.retrieve_embedding(
    "Your query",
    top_k=5,
    similarity_threshold=0.8
)

System Information

# Get system configuration
info = rag.get_info()
print(f"Embedding model: {info['embedding_model']}")
print(f"Embedding dimension: {info['embedding_dimension']}")
print(f"Chunk size: {info['chunk_size']}")

🏗️ Architecture

Ragify is built with a modular architecture:

ragify/
├── core/
│   ├── ragify.py          # Main KaliRAG class
│   ├── chunker.py         # Text chunking logic
│   ├── embedder.py        # Embedding model management
│   └── db_quadrant.py     # Vector database integration
├── utils/
│   └── logger.py          # Logging utilities
├── config/
│   └── defaults.py        # Default configurations
└── examples/
    └── basic_usage.py     # Usage examples

🔧 Configuration

Embedding Models

Supported models (via HuggingFace SentenceTransformers):

sentence-transformers/all-MiniLM-L6-v2 (default)
sentence-transformers/all-mpnet-base-v2
sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

Chunking Parameters

chunk_size: Maximum size of each chunk (default: 512)
chunk_overlap: Overlap between consecutive chunks (default: 50)
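
To make the interaction between the two parameters concrete, here is a minimal sliding-window sketch. It assumes both values are counted in characters, which the documentation above does not specify, and the function name `chunk_text` is hypothetical rather than a Ragify API.

```python
def chunk_text(text, chunk_size=512, chunk_overlap=50):
    """Split text into windows of chunk_size that overlap by chunk_overlap.

    Sizes are in characters here; that unit is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window advances by 512 - 50 = 462 characters, so a 1,000-character text yields three chunks starting at offsets 0, 462, and 924.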

Retrieval Parameters

top_k: Number of top results to return (default: 3)
similarity_threshold: Minimum similarity score (default: 0.7)
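
The sketch below shows how the two parameters could combine: score every stored vector against the query, drop scores below the threshold, then keep the best `top_k`. It assumes cosine similarity as the metric, which the documentation above does not state, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_matches(query_vec, stored, top_k=3, similarity_threshold=0.7):
    """Return up to top_k entries scoring at or above the threshold.

    `stored` is a list of (text, vector) pairs.
    """
    scored = [
        {"text": text, "similarity_score": cosine_similarity(query_vec, vec)}
        for text, vec in stored
    ]
    kept = [s for s in scored if s["similarity_score"] >= similarity_threshold]
    kept.sort(key=lambda s: s["similarity_score"], reverse=True)
    return kept[:top_k]
```

Note that the threshold is applied before the cut to `top_k`, so a query can return fewer than `top_k` results, or none at all.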

🧪 Testing

Run the test suite:

python -m pytest tests/

Run with coverage:

python -m pytest tests/ --cov=ragify

🖥️ Command Line Interface

Ragify includes a CLI for easy configuration and usage:

Configure Settings

# Configure database
ragify config --api-key "your-key" --host "https://api.quadrant.io" --collection "my_docs"

# Configure with port
ragify config --api-key "your-key" --host "https://api.quadrant.io" --port 443 --collection "my_docs"

# Configure embedding model
ragify config --model "sentence-transformers/all-mpnet-base-v2"

# Configure chunking
ragify config --chunk-size 256 --chunk-overlap 25

# Configure everything at once
ragify config --api-key "your-key" --model "all-MiniLM-L6-v2" --chunk-size 512

Get System Information

ragify info

Create Embeddings from File

ragify create --input document.txt --output results.json

Query the Knowledge Base

ragify query "What is RAG?" --top-k 5 --threshold 0.8

Reset Database

ragify reset

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

git clone https://github.com/ragify/ragify.git
cd ragify
pip install -e ".[dev]"

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

HuggingFace for the excellent SentenceTransformers library
Quadrant for the vector database integration
The open-source AI community for inspiration and feedback

📞 Support

📧 Email: team@ragify.com
🐛 Issues: GitHub Issues
📖 Documentation: Read the Docs

🚀 Roadmap

OpenAI embedding support
Additional vector databases (Chroma, FAISS, Pinecone)
File loaders (PDF, CSV, DOCX)
CLI interface
Web UI
FastAPI wrapper
Batch processing
Advanced chunking strategies

Made with ❤️ by the Ragify Team

Release history

0.1.5 (Jul 11, 2025)
0.1.4 (Jul 11, 2025)
0.1.3 (Jul 11, 2025)
0.1.2 (Jul 11, 2025)
0.1.0 (Jul 11, 2025), this version

Download files

Source Distribution

ragify_lib-0.1.0.tar.gz (21.8 kB)

Uploaded Jul 11, 2025 Source

Built Distribution

ragify_lib-0.1.0-py3-none-any.whl (19.8 kB)

Uploaded Jul 11, 2025 Python 3

File details

Details for the file ragify_lib-0.1.0.tar.gz.

File metadata

Download URL: ragify_lib-0.1.0.tar.gz
Upload date: Jul 11, 2025
Size: 21.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.0

File hashes

Hashes for ragify_lib-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`3eab6ed507b39eff362668176269a1e79f730d4858e54a1e23e52395969d1e86`
MD5	`942d948d1ccb7f3ec663e64b1ff8a72f`
BLAKE2b-256	`8f2c279cba1a66452e80f2200ff464d71454237ceef77d34bdbe6e5f437afb3a`
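
A downloaded file can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are illustrative.

```python
import hashlib


def sha256_of_file(path, block_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("ragify_lib-0.1.0.tar.gz", "<SHA256 from the table>")` should return `True` for an intact download, assuming the file is in the current directory.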

File details

Details for the file ragify_lib-0.1.0-py3-none-any.whl.

File metadata

Download URL: ragify_lib-0.1.0-py3-none-any.whl
Upload date: Jul 11, 2025
Size: 19.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.0

File hashes

Hashes for ragify_lib-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2df6b5dac7d28d5a54b70594757e472531473f7105c490b2e20eff3bc8a748d9`
MD5	`430250e4a94151ef6ce70a910d8deaa7`
BLAKE2b-256	`dc6b6217e16f2ad872fcf1a2d453999cba629f674fe27658fb57cc398fc83bd0`

ragify-lib 0.1.0
