
The Ultimate Hybrid RAG Framework: Local/Remote LLMs, Live Watcher, Deep Profiling & Security.

🧠 RostaingChain

The Ultimate Hybrid RAG Framework.
Local & Remote LLMs | Real-time Watcher | Deep Data Profiling | DLP Security | Multi-Modal

RostaingChain is a production-ready framework designed to build autonomous RAG (Retrieval-Augmented Generation) systems. It bridges the gap between local privacy (Ollama, Local Docs) and cloud power (OpenAI, Groq, Datastores), featuring a unique Live Watcher that updates your AI's knowledge in real-time.


🚀 Key Features

  • Hybrid Intelligence: Switch instantly between Local LLMs (Ollama, Llama.cpp) and Remote giants (OpenAI, Groq, Claude, Gemini, DeepSeek, Grok).
  • Live Watcher (Auto-Sync): Drop a file in a folder, modify a SQL row, or update a website -> The AI learns it instantly.
  • Deep Profiling (Anti-Hallucination): Automatically computes statistics (max, min, mean) for CSV/SQL data and passes them to the LLM, so numeric answers rest on computed values rather than guesses.
  • DLP Security: Built-in Redaction system to mask sensitive data (IBAN, BIC, PHONE, EMAIL, CREDIT_CARD, ID_NUM, MONEY, IP_ADDR) before display.
  • Multi-Modal Native: Understands Text, PDFs (OCR included), Images, Audio (Whisper), and YouTube videos.
  • Universal Sources: Connects to Local Files, PostgreSQL, MySQL, Oracle, SQLite, MongoDB, Neo4j, and the Web.

📦 Installation

pip install rostaingchain
# Optional: Install OCR capabilities
pip install rostaing-ocr

🔑 Managing API Keys (Remote LLMs)

To use remote LLMs (like GPT-4, Groq, Claude) without hardcoding your credentials in the code, RostaingChain supports environment variables.

  1. Create a file named .env in your project root.
  2. Add your API keys following this format:
# Standard Providers
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=AIzaSy...

# Fast Inference Providers
GROQ_API_KEY=gsk_...
MISTRAL_API_KEY=...

# OpenAI-Compatible Providers
DEEPSEEK_API_KEY=sk-...
XAI_API_KEY=...
  3. Install python-dotenv and call load_dotenv() at the start of your script to load the keys:
pip install python-dotenv
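
Step 3 might look like the sketch below. `load_dotenv()` is the python-dotenv call used throughout the later examples; `missing_api_keys` is a hypothetical helper (not part of RostaingChain) that reports which expected keys are still unset, so a script can fail early instead of at the first remote call.

```python
import os

try:
    # python-dotenv reads KEY=value pairs from a .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    # python-dotenv is not installed; rely on variables already exported.
    pass


def missing_api_keys(required, env=None):
    """Return the names in `required` that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys(["OPENAI_API_KEY", "GROQ_API_KEY"])
    if missing:
        print("Missing keys:", ", ".join(missing))
```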

⚡ Quick Start

1. The "Chat with Anything" Mode

Simply point data_source to a file, a folder, a database, or a URL.

from rostaingchain import RostaingBrain

# Initialize the Brain
agent = RostaingBrain(
    llm_model="llama3.2",          # Use local Ollama
    data_source="./my_documents", # Watches this folder
    auto_update=True             # Real-time ingestion
)

# Chat
response = agent.chat("What are the main topics in these documents?")
print(response)
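
To illustrate what watching a folder involves, here is a minimal polling sketch based on file modification times. It is an assumption about the general technique, not RostaingChain's actual watcher (which may rely on OS file events); `snapshot` and `diff_snapshots` are hypothetical names.

```python
from pathlib import Path


def snapshot(folder):
    """Map every file under `folder` to its last-modified timestamp."""
    return {str(p): p.stat().st_mtime for p in Path(folder).rglob("*") if p.is_file()}


def diff_snapshots(old, new):
    """Compare two snapshots and report added, modified and removed paths."""
    return {
        "added": sorted(p for p in new if p not in old),
        "modified": sorted(p for p in new if p in old and new[p] != old[p]),
        "removed": sorted(p for p in old if p not in new),
    }
```

A watcher loop would take a snapshot, sleep, take another, and re-index only the paths listed under "added" and "modified".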

🛠️ Advanced Usage

1. YouTube Video Analysis

Extract transcripts and metadata automatically.

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

agent = RostaingBrain(
    llm_model="openai/gpt-oss-120b",
    llm_provider="groq",
    data_source="https://www.youtube.com/watch?v=3mTK0vYYXA4",
    vector_db="faiss"
)

# Streaming response for better UX
generator = agent.chat("Summarize this video in 3 bullet points.", stream=True)

for token in generator:
    print(token, end="", flush=True)
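
When the full text is also needed after streaming (for logging or saving), the generator can be consumed once while accumulating tokens. The sketch below assumes each streamed item is a plain string; `fake_stream` is a stand-in so it runs without a model, and `stream_and_collect` is a hypothetical helper.

```python
def stream_and_collect(tokens):
    """Print tokens as they arrive and return the full text once the stream ends."""
    parts = []
    for token in tokens:
        print(token, end="", flush=True)
        parts.append(token)
    print()  # finish the line after the last token
    return "".join(parts)


def fake_stream():
    """Stand-in generator so the sketch runs without a model."""
    yield from ["RAG ", "combines ", "retrieval ", "and ", "generation."]


full_text = stream_and_collect(fake_stream())
```

A generator can only be iterated once, which is why the tokens are collected during the display loop rather than afterwards.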

2. Data Security (DLP)

Protect sensitive information from being displayed.

from rostaingchain import RostaingBrain

agent = RostaingBrain(
    llm_model="llama3.2",
    data_source="bank_statements.pdf",
    # Enable Security
    security_filters=["IBAN", "BIC", "PHONE", "EMAIL", "MONEY", "CREDIT_CARD"] # A list of filter names, or True to enable all filters
)

response = agent.chat("Give me the IBAN of the supplier.")
# Output: "The IBAN is [Protected IBAN bank details]."
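
Conceptually, this kind of redaction can be done with pattern matching on the response text. The sketch below is only an illustration of the idea: the regular expressions, the placeholder wording, and the `redact` function are assumptions, not the library's implementation, and cover just two of the filter names.

```python
import re

# Simplified, illustrative patterns; real-world validation is stricter.
PATTERNS = {
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}


def redact(text, filters):
    """Replace every match of the selected filters with a labelled placeholder."""
    for name in filters:
        # Raises KeyError for filter names this sketch does not define.
        text = PATTERNS[name].sub(f"[{name} redacted]", text)
    return text


print(redact("Pay FR7630006000011234567890189 or mail bob@example.com", ["IBAN", "EMAIL"]))
```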

3. Working with DataFrames (Pandas)

import pandas as pd
from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

df = pd.read_csv("titanic.csv") # supports: Polars

# Direct Memory Ingestion
agent = RostaingBrain(
    llm_model="gpt-4o",
    data_source=df 
)

print(agent.chat("What is the average age of passengers?"))


4. Audio Analysis with Streaming & Markdown Output

RostaingChain natively handles audio files (such as .m4a or .mp3) using OpenAI Whisper locally. This example demonstrates how to process an audio file, enforce security filters, and stream the result in Markdown format.

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
import os
# Load environment variables from .env file
load_dotenv()

# Assuming your API key is set
llm_api_key = os.getenv("GROQ_API_KEY")

agent = RostaingBrain(
    llm_model="openai/gpt-oss-120b",
    llm_provider="groq",
    llm_api_key=llm_api_key,
    data_source="C:/Users/Rostaing/Desktop/data/audio.m4a", # Supports: .m4a, .mp3, .wav, .ogg, .flac, .webm
    poll_interval=3600, # Check for file updates every hour
    vector_db="faiss",  # Options: 'faiss' or 'chroma'
    reset_db=True,      # Re-index the file on startup
    memory=True,        # Enable conversation history
    security_filters=["PHONE", "BIC", "IBAN", "DATE"] # Active Data Loss Prevention
)

# Request a summary in Markdown format with streaming enabled
response = agent.chat("Give me a summary.", stream=True, output_format="markdown") # output_format supports: "text" (default), "json", "markdown", "toon"

# Real-time display loop
for token in response:
    # Prints every token as soon as it arrives (ChatGPT-like effect)
    print(token, end="", flush=True)

5. Chat with a Website (Web RAG)

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

# Index the web page
agent = RostaingBrain(
    llm_model="gpt-4o",
    llm_provider="openai",
    data_source="https://en.wikipedia.org/wiki/Artificial_intelligence",
    vector_db="chroma",  # Options: 'faiss' or 'chroma'
)

response = agent.chat("Give me a summary.")
print(response)

6. Chat with an image (RAG)

from rostaingchain import RostaingBrain

# Index the image
agent = RostaingBrain(
    llm_model="llama3.2", # Ensure you ran 'ollama pull llama3.2' in your terminal
    llm_provider="ollama", # Runs 100% locally on your machine for privacy
    embedding_model="nomic-embed-text", # Ensure you ran 'ollama pull nomic-embed-text' in your terminal
    data_source="invoice.jpg", # Also supports: .png, .jpeg, .bmp, .tiff, .webp
    memory=True, # Enable conversation history
    vector_db="chroma",  # Options: 'faiss' or 'chroma'
)

response = agent.chat("Give me a summary.")
print(response)

7. Video Analysis with Streaming & Markdown Output

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

# Index the video file
agent = RostaingBrain(
    llm_model="gpt-4o",
    llm_provider="openai",
    data_source="my_video.mp4", # Also supports: .avi, .mov, .mkv
    vector_db="chroma",  # Options: 'faiss' or 'chroma'
)

response = agent.chat("Give me a summary.", stream=True, output_format="markdown") # output_format supports: "text" (default), "json", "markdown", "toon"

# Real-time display loop
for token in response:
    # Prints every token as soon as it arrives (ChatGPT-like effect)
    print(token, end="", flush=True)

8. Chat with a file (Streaming RAG)

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

# Index the file
agent = RostaingBrain(
    llm_model="gpt-4o",
    llm_provider="openai",
    data_source="my_file.txt", # Also supports: .pdf, .docx, .doc, .xlsx, .xls, .pptx, .ppt, .html, .htm, .xml, .epub, .md, .json, .log, .py, .js, .sql, .yaml, .ini
    vector_db="chroma",  # Options: 'faiss' or 'chroma'
)

response = agent.chat("Give me a summary.", stream=True, output_format="markdown")

# Real-time display loop
for token in response:
    # Prints every token as soon as it arrives (ChatGPT-like effect)
    print(token, end="", flush=True)

9. Connecting to Databases (SQL / NoSQL)

RostaingChain uses a Polling Watcher to monitor database changes.

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

# PostgreSQL Configuration
db_config = {
    "type": "sql",
    "connection_string": "postgresql+psycopg2://user:pass@localhost:5432/finance_db",
    "query": "SELECT * FROM sales_2024"
}

agent = RostaingBrain(
    llm_model="gpt-4o",
    llm_provider="openai",
    data_source=db_config,
    poll_interval=30, # Check for DB changes every 30 seconds
    reset_db=True     # Start with a fresh index
)

print(agent.chat("What is the total revenue for Q1?"))
# Thanks to Deep Profiling, the AI will know the exact sum/mean/max.
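
The idea behind Deep Profiling can be sketched in a few lines: compute the statistics in code, then hand them to the model as text. This is a simplified illustration under that assumption; `profile_rows` and `profile_as_text` are hypothetical helpers, not RostaingChain functions.

```python
from statistics import mean


def profile_rows(rows):
    """Compute count, min, max and mean for every numeric column in a list of dict rows."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                columns.setdefault(name, []).append(value)
    return {
        name: {"count": len(v), "min": min(v), "max": max(v), "mean": mean(v)}
        for name, v in columns.items()
    }


def profile_as_text(profile):
    """Render the profile as plain text that could be added to the LLM context."""
    return "\n".join(
        f"{name}: count={s['count']}, min={s['min']}, max={s['max']}, mean={s['mean']:.2f}"
        for name, s in profile.items()
    )


rows = [{"region": "EU", "revenue": 100.0}, {"region": "US", "revenue": 300.0}]
print(profile_as_text(profile_rows(rows)))
```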

10. 🗄️ Database Configuration Examples

To connect RostaingChain to a database, create a dictionary db_config and pass it to the data_source parameter.

1. SQL Databases (via SQLAlchemy)

PostgreSQL

pg_config = {
    "type": "sql",
    "connection_string": "postgresql+psycopg2://user:pass@localhost:5432/finance_db",
    "query": "SELECT * FROM sales_2026"
}

MySQL

mysql_config = {
    "type": "sql",
    "connection_string": "mysql+pymysql://username:password@localhost:3306/my_database",
    "query": "SELECT * FROM orders WHERE status = 'shipped'"
}

Oracle

# Requires Oracle Instant Client installed
oracle_config = {
    "type": "sql",
    "connection_string": "oracle+cx_oracle://username:password@localhost:1521/?service_name=ORCL",
    "query": "SELECT * FROM employees"
}

SQLite

sqlite_config = {
    "type": "sql",
    "connection_string": "sqlite:///C:/path/to/my_data.db",
    "query": "SELECT * FROM invoices"
}

Microsoft SQL Server

mssql_config = {
    "type": "sql",
    "connection_string": "mssql+pymssql://username:password@localhost:1433/my_database",
    "query": "SELECT top 100 * FROM customers"
}

2. NoSQL Databases

MongoDB

mongo_config = {
    "type": "mongodb",
    "uri": "mongodb://localhost:27017/",
    "db": "ecommerce_db",
    "collection": "products",
    "limit": 50 # Optional: Limit the number of documents to ingest
}

Neo4j (Graph)

neo4j_config = {
    "type": "neo4j",
    "uri": "bolt://localhost:7687",
    "user": "neo4j",
    "password": "your_password",
    "query": "MATCH (p:Person)-[:WROTE]->(a:Article) RETURN p.name, a.title LIMIT 20"
}

Usage Example

agent = RostaingBrain(
    llm_model="gpt-4o",
    data_source=mysql_config, # Pass the dictionary here.
    poll_interval=60,         # Watch for changes every minute
    reset_db=True,
)
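
One common way to detect database changes by polling is to fingerprint the query result and compare it between checks. The sketch below shows that technique with an in-memory SQLite table; it is an assumption about how such a watcher can work, not a description of RostaingChain's internals, and `fingerprint` and `has_changed` are hypothetical names.

```python
import hashlib
import sqlite3


def fingerprint(conn, query):
    """Hash the full result set of `query` so two polls can be compared cheaply."""
    digest = hashlib.sha256()
    for row in conn.execute(query):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def has_changed(conn, query, last_fingerprint):
    """Return (changed, current_fingerprint) for one polling step."""
    current = fingerprint(conn, query)
    return current != last_fingerprint, current


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders (status) VALUES ('shipped')")

QUERY = "SELECT * FROM orders ORDER BY id"  # a stable order keeps the hash stable
baseline = fingerprint(conn, QUERY)

conn.execute("UPDATE orders SET status = 'returned' WHERE id = 1")
changed, latest = has_changed(conn, QUERY, baseline)
print("Re-index needed:", changed)
```

In a real loop, each check would be separated by a sleep of poll_interval seconds, and a change would trigger re-ingestion of the query result.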

11. Use a custom LLM (e.g., vLLM on another server)

from rostaingchain import RostaingBrain
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()

# Connect to an OpenAI-compatible endpoint
agent = RostaingBrain(
    llm_model="my-finetuned-model",
    llm_provider="custom",
    llm_base_url="http://192.168.1.50:8000/v1", # Your vLLM server
    llm_api_key="token-if-needed",
    memory=True,
    vector_db="chroma",  # Options: 'faiss' or 'chroma'
    data_source="my_file.pdf", # Supports: .txt, .docx, .doc, .xlsx, .xls, .pptx, .ppt, .html, .htm, .xml, .epub, .md, .json, .log, .py, .js, .sql, .yaml, .ini, .jpg, .png, .jpeg, .bmp, .tiff, .webp, SQL/NoSQL Databases, Audio/Video/Web(link)
    reset_db=True, # Start with a fresh index
    temperature=0,
    top_k=0.1,
    top_p=1,
    max_tokens=1500
)

response = agent.chat("Give me a summary.", stream=True)

# Real-time display loop
for token in response:
    # Prints every token as soon as it arrives (ChatGPT-like effect)
    print(token, end="", flush=True)

12. Universal Intelligence: Switching LLM Providers

A. Use DeepSeek (the cheaper GPT-4 alternative)

agent = RostaingBrain(
    llm_model="deepseek-chat", # Auto-detection
    llm_provider="deepseek",
    # If the key is not in the .env:
    llm_api_key="sk-your-deepseek-key" 
)

B. Use Groq (Lightning speed – 500 tokens/s)

agent = RostaingBrain(
    llm_model="openai/gpt-oss-120b",
    llm_provider="groq" # Set the provider explicitly
)

C. Use Claude Sonnet (Best for coding)

agent = RostaingBrain(
    llm_model="claude-4.5-sonnet",
    llm_provider="anthropic" # Set the provider explicitly
)

D. Use Gemini 3 Pro (Google)

agent = RostaingBrain(
    llm_model="gemini-3-pro-preview",
    llm_provider="google" # Set the provider explicitly
)

E. Use Mistral Large (Mistral AI)

agent = RostaingBrain(
    llm_model="mistral-large-2512",
    llm_provider="mistral" # Set the provider explicitly
)

F. Use Grok (xAI)

agent = RostaingBrain(
    llm_model="grok-4.1",
    llm_provider="grok" # Automatically configures the xAI API base_url
)

G. Use OpenAI (GPT-4o)

agent = RostaingBrain(
    llm_model="gpt-4o",
    llm_provider="openai" # Automatically uses OPENAI_API_KEY from your .env file
)

H. Use Local LLMs (Ollama)

agent = RostaingBrain(
    llm_model="llama3.2",  # Ensure you ran 'ollama pull llama3.2' in your terminal
    llm_provider="ollama", # Runs 100% locally on your machine for privacy
    # llm_base_url="http://localhost:11434" # Optional: Default URL
)
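
The examples above either pass llm_api_key directly or rely on the .env file. A small helper can make that precedence explicit in your own code. The sketch below is hypothetical: `resolve_api_key` and the provider-to-variable mapping are assumptions inferred from the .env example earlier in this README, not RostaingChain's internal logic.

```python
import os

# Assumed mapping, based on the variable names in the .env example.
ENV_VAR_BY_PROVIDER = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "grok": "XAI_API_KEY",
}


def resolve_api_key(provider, explicit_key=None, env=None):
    """Prefer an explicit key, then the provider's environment variable."""
    env = os.environ if env is None else env
    if explicit_key:
        return explicit_key
    var_name = ENV_VAR_BY_PROVIDER.get(provider)
    if var_name is None:
        return None  # e.g. "ollama" runs locally and needs no key
    key = env.get(var_name)
    if not key:
        raise KeyError(f"{var_name} is not set for provider '{provider}'")
    return key
```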

📝 Key Parameters Explained

  • stream=True: This is essential for User Experience (UX). Instead of waiting for the entire response to be generated (which can take time for long summaries), the method returns a Python Generator. You must iterate over it (using a for loop) to display tokens in real-time, exactly like ChatGPT.

  • output_format: This parameter enforces the structure or style of the LLM's response. It accepts four values:

    • "text" (Default): A standard, conversational plain text response.
    • "json": Forces the LLM to output a valid JSON object. Extremely useful if you are building an API or need to parse the result programmatically.
    • "markdown": Returns the response formatted as Markdown.
    • "toon": Changes the persona to a funny, cartoon-like character.
  • vector_db: Defines the local vector storage engine. RostaingChain currently supports two robust, file-based options:

    • "chroma": Uses ChromaDB.
    • "faiss": Uses Facebook AI Similarity Search (highly efficient for CPU).
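
When output_format="json" is used, the reply still arrives as a string and has to be parsed. The defensive sketch below assumes the reply may contain extra text around the JSON object; `parse_json_reply` is a hypothetical helper, not a RostaingChain function.

```python
import json


def parse_json_reply(reply):
    """Extract and parse the outermost JSON object in `reply`, or return None."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None


data = parse_json_reply('Sure: {"summary": "ok", "score": 1}')
print(data)
```

Returning None instead of raising lets the caller decide whether to retry the request or fall back to the raw text.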

⚙️ Configuration Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| llm_model | str | "llama3" | Name of the model (e.g., "gpt-4o", "llama3"). |
| llm_provider | str | "auto" | "openai", "groq", "ollama", "anthropic"... |
| data_source | str/dict/df | "./data" | File path, folder, URL, SQL config, or DataFrame. |
| vector_db | str | "chroma" | "chroma", "faiss". |
| auto_update | bool | True | Activates Watcher (File system or Polling). |
| poll_interval | int | 60 | Seconds between DB/Web checks. |
| reset_db | bool | False | Wipes vector DB on startup. |
| memory | bool | False | Enables conversational history. |
| cache | bool | True | Enables In-Memory caching for speed. |
| security_filters | list/bool | None | List of filters (["IBAN", "PHONE"]) or True for all. |
| temperature | float | 0.1 | Creativity of the model. |

💡 Pro Tip: VSCode Autocomplete

Don't memorize the parameters! If you are using VSCode, you can view the complete list of available options for RostaingBrain instantly.

Just place your cursor inside the parentheses and press:

Ctrl + Space

This will trigger IntelliSense and display all configuration arguments (like memory, security_filters, temperature, cache, etc.) with their descriptions.

🏗️ Architecture

graph TD
    A[Data Source] -->|Watcher/Polling| B(Universal Loader)
    B -->|Deep Profiling & OCR| C{"Content Type?"}
    C -->|Text/Code| D[Text Splitter]
    C -->|Table/SQL| E[Statistical Summary]
    C -->|Audio/Video| F[Whisper Model]
    
    D & E & F --> G[Embeddings Manager]
    G --> H[("Vector Database")]
    
    User -->|Query| I[Core Engine]
    I -->|"Retrieval (MMR)"| H
    I -->|Context + History| J[LLM Engine]
    J -->|Raw Response| K[Security Layer]
    K -->|Clean Response| User
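
The diagram mentions MMR (Maximal Marginal Relevance) at the retrieval step. As a rough illustration of the general algorithm, here is a dependency-free sketch; it is not RostaingChain's retrieval code, and the function names, the toy 2-D vectors, and the `lambda_` weight are assumptions. MMR picks documents that are relevant to the query while penalizing similarity to documents already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr(query, docs, k=2, lambda_=0.5):
    """Return the indices of k documents chosen by Maximal Marginal Relevance."""
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query, docs[i])
            # Similarity to the closest already-selected document.
            redundancy = max((cosine(docs[i], docs[j]) for j in selected), default=0.0)
            return lambda_ * relevance - (1 - lambda_) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 1.0]
docs = [[1.0, 0.9], [1.0, 0.8], [0.0, 1.0]]  # the first two are near-duplicates
print(mmr(query, docs, k=2, lambda_=0.5))
print(mmr(query, docs, k=2, lambda_=1.0))
```

With `lambda_=1.0` the ranking reduces to pure relevance; lower values trade relevance for diversity, which helps when the index contains near-duplicate chunks.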

📄 License

MIT License. See LICENSE for more information.
