
Multimodal RAG pipeline for low-compute, local, real-world deployment

Project description

RAG-LLM-API-Pipeline

A fully local, multimodal Retrieval-Augmented Generation (RAG) system powered by open-source LLMs and designed to run on modest ("GPU-poor") hardware. The pipeline targets operational technology (OT) environments, providing AI-assisted access to technical knowledge, manuals, and historical data securely, offline, and at minimal cost.


✅ Key Features

🔍 Retrieval-Augmented Generation (RAG) using FAISS + SentenceTransformers

🧠 Flexible LLM Integration with support for:
- Open-source HuggingFace models (Qwen, Mistral, etc.)
- Mixed precision support: fp32, fp16, bfloat16
- Dynamic model/device/precision switching via YAML

🔧 Single YAML configuration file to control:
- System-specific documents
- Embedding & generation models
- GPU/CPU inference toggle
- Index rebuilding, token limits, chunking
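
As an illustration of the knobs above, a per-system entry in `system.yaml` might look like the following. The exact field names here are assumptions for illustration, not the package's verified schema:

```yaml
systems:
  - name: TestSystem
    docs_path: ./docs/test_system     # PDFs, text, images, audio, video
    embedding_model: sentence-transformers/all-MiniLM-L6-v2
    llm_model: Qwen/Qwen2-1.5B-Instruct
    device: cpu                       # or cuda
    precision: fp16                   # fp32 | fp16 | bfloat16
    rebuild_index: false
    max_tokens: 512
    chunk_size: 500
    chunk_overlap: 50
```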
	
📂 Multimodal Input Support:
- PDFs
- Plain text
- Images (OCR via Tesseract)
- Audio (.wav)
- Video (.mp4)
	
💻 Multiple Interfaces:
  • CLI (rag-cli) for single-line querying
  • FastAPI-powered REST API for local serving
  • Lightweight HTML Web UI for interactive search

🛠️ Per-system configuration via system.yaml for flexible deployments

🔐 Fully local operation: no cloud dependencies required

✅ One-line install via pip install rag-llm-api-pipeline

✅ Quickstart guide and prebuilt example included

✅ Runs on CPU or GPU with smart memory management

✅ Web UI + CLI + API, all in one package


📦 Installation

pip install rag-llm-api-pipeline

🛠️ Setup Instructions (Windows + Anaconda)

1. Create Python Environment

conda create -n rag_env python=3.10
conda activate rag_env

2. Install Dependencies

Via Conda (system-level tools):

conda install -c conda-forge ffmpeg pytesseract pyaudio

Via Pip (Python packages):

pip install -r requirements.txt

Ensure Tesseract is installed and in your system PATH. You can get it from https://github.com/tesseract-ocr/tesseract.
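
To verify the PATH requirement before running the pipeline, a quick stdlib-only check like the following can help (this is a convenience sketch, not part of the package):

```python
import shutil
import subprocess

def tesseract_available() -> bool:
    """Return True if the tesseract binary is discoverable on PATH."""
    return shutil.which("tesseract") is not None

if __name__ == "__main__":
    if tesseract_available():
        # Print the installed version as a sanity check.
        subprocess.run(["tesseract", "--version"], check=False)
    else:
        print("tesseract not found; add its install directory to PATH")
```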


🚀 Usage

Please review the quickstart guide.


🐧 Setup Instructions (Linux)

1. Create Python Environment

python3 -m venv rag_env
source rag_env/bin/activate

Or with conda:

conda create -n rag_env python=3.10
conda activate rag_env

2. Install System Dependencies

sudo apt update
sudo apt install -y ffmpeg tesseract-ocr libpulse-dev portaudio19-dev

Optional: install language packs for OCR (e.g., tesseract-ocr-eng).

3. Install Python Packages

pip install -r requirements.txt

🔁 Running the Application on Linux

CLI

python cli/main.py --system TestSystem --question "What is the restart sequence for this machine?"

API Server

uvicorn rag_llm_api_pipeline.api.server:app --host 0.0.0.0 --port 8000

cURL Query

curl -X POST http://localhost:8000/query \
     -H "Content-Type: application/json" \
     -d '{"system": "TestSystem", "question": "What does error E204 indicate?"}'
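
The same query can be issued from Python with only the standard library. This is a minimal client sketch against the `/query` endpoint shown above; the shape of the JSON response is not specified here, so the code simply returns the parsed body:

```python
import json
import urllib.request

def build_query(system: str, question: str) -> bytes:
    """Serialize the request body the /query endpoint expects."""
    return json.dumps({"system": system, "question": question}).encode("utf-8")

def ask(base_url: str, system: str, question: str) -> dict:
    """POST a question to the locally running API server and parse the reply."""
    req = urllib.request.Request(
        f"{base_url}/query",
        data=build_query(system, question),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

if __name__ == "__main__":
    print(ask("http://localhost:8000", "TestSystem", "What does error E204 indicate?"))
```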

📚 How it Works

  1. Index Building:

    • Files are parsed using loader.py.
    • Text chunks are embedded with MiniLM.
    • FAISS index stores embeddings for fast similarity search.
  2. Query Execution:

    • User provides a natural language question.
    • Relevant text chunks are retrieved from the index.
    • LLM generates an answer based on retrieved context.
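
The two stages above can be sketched in miniature. This toy version stands in a bag-of-words similarity for the real MiniLM embeddings and FAISS index, so it only illustrates the chunk-embed-retrieve flow, not the actual implementation:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (stand-in for loader.py chunking)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; the real pipeline uses MiniLM vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by similarity to the question (FAISS does this at scale)."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = chunk(
    "Press the reset button, wait ten seconds, then restart the pump. "
    "Error E204 indicates a blocked intake filter on the pump.",
    size=12, overlap=4,
)
context = retrieve("What does error E204 indicate?", docs, k=1)
```

In the real pipeline, the retrieved `context` is then handed to the LLM as grounding for answer generation.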

🧠 Model Info

  • All models are open-source and run offline.

You can swap in any Hugging Face model that runs locally.


🔐 Security & Offline Use

  • No cloud or external dependencies required after initial setup.
  • Ideal for OT environments.
  • All processing is local: embeddings, LLM inference, and data storage.

📜 License

MIT License


📧 Contact

For issues, improvements, or contributions, please open an issue or PR.

Download files

Download the file for your platform.

Source Distribution

rag_llm_api_pipeline-0.7.1.tar.gz (22.3 kB)

Uploaded Source

Built Distribution


rag_llm_api_pipeline-0.7.1-py3-none-any.whl (18.5 kB)

Uploaded Python 3

File details

Details for the file rag_llm_api_pipeline-0.7.1.tar.gz.

File metadata

  • Download URL: rag_llm_api_pipeline-0.7.1.tar.gz
  • Upload date:
  • Size: 22.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.18

File hashes

Hashes for rag_llm_api_pipeline-0.7.1.tar.gz
Algorithm Hash digest
SHA256 b518ded0ea9af85b96c0129aacb3cee2ceba0a16ac240a6a956e0135fa10517b
MD5 6b14ebb7077a8985e4f1a81abbd3ec78
BLAKE2b-256 948e137a0f3759162ab5f109c8f9ffd1464e9636dceab563eb52f3da6fd8bebd


File details

Details for the file rag_llm_api_pipeline-0.7.1-py3-none-any.whl.

File hashes

Hashes for rag_llm_api_pipeline-0.7.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a7c83be5de138a4da55811d1595b7c93c7b07564e6cd2f6e72a098d364e0642a
MD5 685a025015af69f3d3a0c9cadcc9dda6
BLAKE2b-256 5c7f1970e858a90e69887432f108972b850fd4ccb99c0bb3c6065a2513124826

