GuardRAG

A privacy-first, fully offline AI document assistant — secured by a tiered safety guardrails system




Upload any document. Ask anything. Get answers — entirely on your machine.
No cloud. No API keys. No data leaves your device.



Architecture

The GuardRAG package provides a command-line interface for building and querying RAG (Retrieval-Augmented Generation) chatbots with local LLMs.


Why GuardRAG?

Most RAG chatbots rely on cloud APIs, which creates privacy risks for sensitive documents — contracts, medical records, internal reports. GuardRAG solves that by:

  • Running the LLM locally via Ollama (no data transmitted)
  • Embedding documents offline using HuggingFace sentence-transformers
  • Enforcing tiered safety policies with 4 sensitivity levels
  • Providing a simple command-line interface

Feature Highlights

Core

  • 100% Offline: zero external network calls at runtime
  • Multi-format ingestion: PDF, TXT, DOCX
  • Persistent FAISS cache: re-uploading the same file skips re-embedding
  • Multi-turn conversation: full history-aware retrieval
  • Any Ollama model: Gemma, Llama3, Mistral, Phi, and more
  • CLI: easy command-line usage

Safety

  • 4-Tier Data Sensitivity System — Public → Internal → Confidential → Restricted
  • Jailbreak / prompt injection detection — always active
  • Credential & API key protection — Internal+
  • PII protection — SSN, email, phone, DOB, credit card (Confidential+)
  • Regulated data guards — HIPAA / GDPR / financial categories (Restricted)

Data Sensitivity Levels

Level          What is protected
Public         Jailbreak & prompt injection only
Internal       + API keys, credentials, passwords, tokens
Confidential   + SSN, email, phone number, DOB, credit card
Restricted     + Medical records, diagnoses, financials, HIPAA/GDPR
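
The levels above are cumulative: each level adds its own checks on top of those of every lower level. The following is a minimal sketch of that idea, not GuardRAG's actual implementation. The names `CHECKS`, `active_checks`, and `violations` are hypothetical, and the regular expressions are deliberately simplified for illustration.

```python
import re

# Ordered from least to most protective; each level inherits the checks below it.
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]

# Hypothetical, deliberately simplified patterns (illustration only).
CHECKS = {
    "Public": {
        "prompt_injection": re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    },
    "Internal": {
        "credential": re.compile(r"\b(api[_-]?key|password|token)\b\s*[:=]", re.I),
    },
    "Confidential": {
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    },
    "Restricted": {
        "medical": re.compile(r"\b(diagnosis|medical record)\b", re.I),
    },
}


def active_checks(level):
    """Return the checks enabled at `level`, including all lower tiers."""
    if level not in LEVELS:
        raise ValueError(f"unknown sensitivity level: {level!r}")
    enabled = {}
    for name in LEVELS[: LEVELS.index(level) + 1]:
        enabled.update(CHECKS[name])
    return enabled


def violations(text, level):
    """Return the sorted names of the checks that `text` trips at `level`."""
    return sorted(
        name for name, pattern in active_checks(level).items() if pattern.search(text)
    )


print(violations("My SSN is 123-45-6789", "Public"))        # []
print(violations("My SSN is 123-45-6789", "Confidential"))  # ['ssn']
```

The same text passes at Public but is flagged at Confidential, which is the behaviour the table describes. A real guardrail would need far more robust detection than these patterns.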

Tech Stack

Layer          Technology
CLI            Python argparse + rich console output
LLM Engine     Ollama (local model inference)
Embeddings     HuggingFace sentence-transformers/all-MiniLM-L6-v2
Vector Store   FAISS (disk-persisted)
RAG Pipeline   LangChain (retrieval chains + chat history)
Safety Rails   Custom tiered guardrails system (input + output)

Prerequisites

  • Python 3.9+
  • Ollama installed and running locally
  • At least one model pulled via Ollama:
ollama pull gemma3:1b
# or any other model: llama3.1, phi3, mistral, etc.
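
To confirm that a model has been pulled, you can query the Ollama server's `/api/tags` endpoint, which lists the locally available models. The sketch below is not part of GuardRAG. The helper names `model_names` and `fetch_model_names` are hypothetical, and the sample payload is trimmed to the single field the code reads.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return [model["name"] for model in tags_payload.get("models", [])]


def fetch_model_names(host="http://localhost:11434", timeout=5):
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as response:
        return model_names(json.load(response))


# Example payload shaped like an /api/tags response (trimmed to the field used here).
sample = {"models": [{"name": "gemma3:1b"}, {"name": "llama3.1:latest"}]}
print("gemma3:1b" in model_names(sample))  # True
```

With Ollama running, `fetch_model_names()` returns the live list. It raises `urllib.error.URLError` if the server is not reachable.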

Installation

Install GuardRAG from PyPI:

pip install guard-rag

Or install from source:

git clone https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL.git
cd GUADRAILS-RAG-CHAT-TOOL
pip install .

Quick Start

After installation, run the CLI:

guard-rag --pdf path/to/your/document.pdf

This will start an interactive chat session with your document.

CLI Options

guard-rag --pdf <file>             Load and chat with a PDF document
          --model <model>          Ollama model to use (default: gemma3:1b)
          --ollama-host <url>      Ollama server URL (default: http://localhost:11434)
          --chunk-size <int>       Document chunk size (default: 1000)
          --chunk-overlap <int>    Chunk overlap (default: 200)
          --sensitivity <level>    Data sensitivity: Public | Internal | Confidential | Restricted
          --no-guardrails          Disable safety guardrails
          --help                   Show this help message
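
For readers who want to see how these options and defaults fit together, here is a hypothetical `argparse` parser that mirrors the list above. It is a sketch under assumptions, not GuardRAG's own parser. The function name `build_parser` is invented, and because no default sensitivity is documented, none is set.

```python
import argparse

LEVELS = ["Public", "Internal", "Confidential", "Restricted"]


def build_parser():
    """Hypothetical parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(prog="guard-rag")
    parser.add_argument("--pdf", metavar="<file>", help="Load and chat with a PDF document")
    parser.add_argument("--model", default="gemma3:1b", help="Ollama model to use")
    parser.add_argument("--ollama-host", default="http://localhost:11434", help="Ollama server URL")
    parser.add_argument("--chunk-size", type=int, default=1000, help="Document chunk size")
    parser.add_argument("--chunk-overlap", type=int, default=200, help="Chunk overlap")
    # The default sensitivity is not documented above, so none is assumed here.
    parser.add_argument("--sensitivity", choices=LEVELS, help="Data sensitivity level")
    parser.add_argument("--no-guardrails", action="store_true", help="Disable safety guardrails")
    return parser


args = build_parser().parse_args(
    ["--pdf", "report.pdf", "--model", "llama3.1", "--sensitivity", "Confidential"]
)
print(args.model, args.chunk_size, args.sensitivity, args.no_guardrails)
# llama3.1 1000 Confidential False
```

Options that are not passed fall back to their documented defaults, so only `--pdf` is needed for a basic session.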

Example Session

# Start with a PDF using Llama 3.1
guard-rag --pdf report.pdf --model llama3.1 --sensitivity Confidential

# You: What are the key findings?
# Chatbot: Based on the document, the key findings are...

Project Structure

GUADRAILS-RAG-CHAT-TOOL/
│
├── guardrag/                 # Main installable package
│   ├── api/                  # FastAPI local server
│   ├── cli/                  # Command-line interface
│   ├── rag/                  # RAG pipeline logic
│   └── utils/                # General utilities
│
├── docs/                     # Documentation (INSTALL, QUICK_REFERENCE)
├── tests/                    # Unit and integration tests
├── scripts/                  # Development and maintenance scripts
├── extras/                   # Experimental / legacy components
│
├── pyproject.toml             # Modern build configuration
├── setup.py                   # Legacy support configuration
├── README.md                  # Project overview
├── CONTRIBUTING.md            # Contribution guidelines
├── CODE_OF_CONDUCT.md         # Community standards
└── LICENSE                    # MIT License

.guardrag_storage/ is auto-generated on first document load (FAISS cache).


Configuration

Environment Variables

Copy .env.example to .env and adjust as needed:

cp .env.example .env

Variable      Default                  Description
OLLAMA_HOST   http://localhost:11434   Ollama API endpoint
NO_PROXY      huggingface.co,...       Bypass proxy for local + HF calls
PORT          8000                     Server port (auto-set by PaaS)
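
As an illustration of how these variables and their defaults might be consumed, the sketch below reads `OLLAMA_HOST` and `PORT` from the process environment. The function `load_settings` is hypothetical and not part of GuardRAG. It does not parse the `.env` file itself, so it assumes the variables have already been exported into the environment.

```python
import os


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Strip a trailing slash so paths can be appended consistently.
        "ollama_host": env.get("OLLAMA_HOST", "http://localhost:11434").rstrip("/"),
        "port": int(env.get("PORT", "8000")),
    }


print(load_settings({}))
# {'ollama_host': 'http://localhost:11434', 'port': 8000}
print(load_settings({"OLLAMA_HOST": "http://192.168.1.20:11434/", "PORT": "9000"}))
# {'ollama_host': 'http://192.168.1.20:11434', 'port': 9000}
```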

Chunking Parameters

Adjustable per session via the --chunk-size and --chunk-overlap CLI options, or via the sidebar in the UI:

  • Chunk Size (default 1000 chars)
  • Chunk Overlap (default 200 chars)
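
To make the two parameters concrete, here is a simplified sliding-window splitter. It is a sketch only: the function `chunk_text` is hypothetical, and it is a plain character window rather than the LangChain splitter that GuardRAG's pipeline may use.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split `text` into windows of `chunk_size` chars that share `chunk_overlap` chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("x" * 2500)
print([len(c) for c in chunks])  # [1000, 1000, 900]
```

With the defaults, each window advances by 800 characters, so consecutive chunks share 200 characters. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed.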

Different chunk settings for the same file produce a separate FAISS index automatically.
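
One way to get this behaviour is to derive the index name from both the file contents and the chunk settings, so that changing either one yields a new cache entry. The README does not describe GuardRAG's actual scheme. The function `index_key` below is a hypothetical sketch of the idea.

```python
import hashlib


def index_key(file_bytes, chunk_size=1000, chunk_overlap=200):
    """Derive a cache key from the file contents and the chunk settings."""
    digest = hashlib.sha256()
    digest.update(file_bytes)
    digest.update(f"|{chunk_size}|{chunk_overlap}".encode("utf-8"))
    return digest.hexdigest()[:16]


doc = b"example document contents"
same = index_key(doc) == index_key(doc)
different = index_key(doc) != index_key(doc, chunk_size=500)
print(same, different)  # True True
```

Loading the same file with the same settings maps to the same key, so an existing index can be reused. Changing the chunk size or overlap produces a different key and therefore a separate index.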


Deployment

From PyPI (recommended)

pip install guard-rag

From Source

git clone https://github.com/sowmiyan-s/GUADRAILS-RAG-CHAT-TOOL.git
cd GUADRAILS-RAG-CHAT-TOOL
pip install .

In a virtual environment (best practice)

python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS / Linux:
source .venv/bin/activate

pip install guard-rag

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.

Bug reports and feature requests are welcome via GitHub Issues.


License

This project is licensed under the MIT License — see LICENSE for details.


Built with ❤️ by Sowmiyan S

FastAPI · LangChain · Ollama · HuggingFace · FAISS · Vanilla JS
