
Private RAG System for Documents & Structured Data


Keovil

Private Query Interface for Documents & Structured Data

License: GPL v3 · Python: 3.12+ · Platform: Linux · Hardware: Flexible

Ask questions in plain English. Keovil queries your files.

Upload PDFs, text files, code, and more, or query CSV, Excel, and SQLite databases via the web app. Keovil generates the queries and returns the results. Document processing runs locally on your GPU, with flexible LLM options (cloud or local).

Installation

Option 1: Install from GitHub (Recommended)

# Install the SDK and web app
pip install git+https://github.com/kaiserkonok/Keovil.git

Option 2: Local Development Install

# Clone the repository
git clone https://github.com/kaiserkonok/Keovil.git
cd Keovil

# Install in development mode
pip install -e .

Option 3: Web App Only (Quick Start - No Docker!)

# 1. Install Keovil
pip install git+https://github.com/kaiserkonok/Keovil.git

# 2. Run the web app (Qdrant runs automatically!)
python -m keovil_web

That's it! Qdrant runs in embedded mode automatically.


Optional: Run Qdrant Manually (For Better Performance)

If you want to run Qdrant externally (e.g., via Docker) for better performance:

# macOS
brew install qdrant && brew services start qdrant

# Linux (Docker)
docker run -d -p 6333:6333 -v qdrant_storage:/qdrant_storage qdrant/qdrant

Then set QDRANT_HOST=your-server if not on localhost.


Quick Start

Web App

# 1. Install Keovil
pip install git+https://github.com/kaiserkonok/Keovil.git

# 2. Run the web app (Qdrant runs automatically!)
python -m keovil_web

Open http://localhost:5000

That's it! Qdrant runs in embedded mode automatically. No Docker needed.

For local LLM, also install Ollama: curl -fsSL https://ollama.com/install.sh | sh

SDK (For Developers)

from keovil import KeovilRAG
from keovil.utils.llm_config import LLMConfig

# Initialize with default Ollama
rag = KeovilRAG(data_dir="/path/to/your/files")

# Or use a specific LLM provider
config = LLMConfig(provider="openai", model="gpt-4o", openai_api_key="sk-...")
rag = KeovilRAG(data_dir="/path/to/files", llm_config=config)

# Index your files (PDF, text, code, etc.)
rag.ingest(["document1.pdf", "notes.txt"])

# Ask questions in natural language
answer = rag.query("What is the recommended dosage for adults over 65?")
print(answer)

# For chat-like conversations
history = []
answer = rag.query("What are the API rate limits for the Pro plan?", history)
history.extend([("You", "What are the API rate limits for the Pro plan?"), ("AI", answer)])
answer = rag.query("Does the Enterprise plan include SSO?", history)
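The history-passing pattern above can be wrapped in a small helper so each turn is recorded automatically. This is a sketch, not part of the Keovil API: `ChatSession` is a hypothetical class that only assumes an object with a `query(question, history)` method and the `("You", ...)` / `("AI", ...)` tuple format shown in the example.

```python
# Hypothetical convenience wrapper around the history-passing pattern above.
# Works with any object exposing query(question, history), e.g. a KeovilRAG.
class ChatSession:
    def __init__(self, rag):
        self.rag = rag          # e.g. KeovilRAG(data_dir="/path/to/files")
        self.history = []       # list of ("You"/"AI", text) tuples

    def ask(self, question):
        # Query with the accumulated history, then record both turns.
        answer = self.rag.query(question, self.history)
        self.history.extend([("You", question), ("AI", answer)])
        return answer
```

With this, the two-question example above becomes `session.ask(...)` calls with no manual bookkeeping.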

Linux Installation

# Option 1: Just Keovil (Qdrant auto-runs in embedded mode)
pip install git+https://github.com/kaiserkonok/Keovil.git
python -m keovil_web

# Option 2: With local Ollama (if using local LLM)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5-coder:7b-instruct
pip install git+https://github.com/kaiserkonok/Keovil.git
python -m keovil_web

# Option 3: With external Qdrant (for better performance)
pip install git+https://github.com/kaiserkonok/Keovil.git
curl -L https://github.com/qdrant/qdrant/releases/download/v1.7.4/qdrant-linux-amd64.tar.gz -o qdrant.tar.gz
tar -xzf qdrant.tar.gz
./qdrant &
python -m keovil_web

Features

🗄️ Structured Data Analysis (Web App): Query CSV, Excel, SQLite, and Parquet in natural language. Keovil generates and executes SQL via DuckDB.
📄 Document Q&A: Ask questions about PDFs, text files, and code. Built on ColBERT retrieval with Qdrant; Docling parses documents locally.
🔄 Automatic Indexing: Drop files in a folder and Keovil syncs and indexes them automatically.
🔒 Total Privacy: Document processing runs locally. Cloud LLMs are optional; use local Ollama for full privacy.
🌐 Multi-LLM Support: Use Ollama (local), OpenAI, Anthropic, OpenRouter, or Gemini. Switch anytime without a restart.
⚡ Flexible Hardware: A full GPU is not required; pair cloud LLMs with local document processing.

Supported LLM Providers

Web App

Change providers anytime via the Settings page - no restart needed!

Provider     Description                            API Key Required
Ollama       Local models running on your machine   No
OpenAI       GPT-4o, GPT-4o-mini, etc.              Yes
Anthropic    Claude 3.5 Sonnet, Haiku, etc.         Yes
OpenRouter   Access 100+ models via a single API    Yes
Gemini       Google Gemini 2.0, 1.5 Pro, etc.       Yes

SDK (For Developers)

from keovil import KeovilRAG
from keovil.utils.llm_config import LLMConfig

# Ollama (default)
rag = KeovilRAG(data_dir="/path", llm_config=LLMConfig(provider="ollama", model="qwen2.5-coder:7b"))

# OpenAI
config = LLMConfig(provider="openai", model="gpt-4o", openai_api_key="sk-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)

# Anthropic
config = LLMConfig(provider="anthropic", model="claude-3-5-sonnet-20241022", anthropic_api_key="sk-ant-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)

# OpenRouter
config = LLMConfig(provider="openrouter", model="openai/gpt-4o-mini", openrouter_api_key="sk-or-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)

# Gemini
config = LLMConfig(provider="gemini", model="gemini-2.0-flash", gemini_api_key="AIza...")
rag = KeovilRAG(data_dir="/path", llm_config=config)
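The per-provider examples above can also be driven by the environment variables listed under Configuration (KEOVIL_PROVIDER, KEOVIL_MODEL, and the per-provider *_API_KEY variables). A minimal sketch, assuming the LLMConfig keyword names shown in the examples above; the mapping table itself is not part of Keovil:

```python
import os

# Map each cloud provider to (environment variable, LLMConfig kwarg name).
# The kwarg names are taken from the SDK examples above; this helper itself
# is an illustration, not part of the Keovil API.
API_KEY_ENV = {
    "openai": ("OPENAI_API_KEY", "openai_api_key"),
    "anthropic": ("ANTHROPIC_API_KEY", "anthropic_api_key"),
    "openrouter": ("OPENROUTER_API_KEY", "openrouter_api_key"),
    "gemini": ("GEMINI_API_KEY", "gemini_api_key"),
}

def llm_config_kwargs(env=os.environ):
    """Build LLMConfig keyword arguments from environment variables."""
    provider = env.get("KEOVIL_PROVIDER", "ollama")  # documented default
    kwargs = {"provider": provider}
    if "KEOVIL_MODEL" in env:
        kwargs["model"] = env["KEOVIL_MODEL"]
    if provider in API_KEY_ENV:
        env_name, kwarg = API_KEY_ENV[provider]
        kwargs[kwarg] = env[env_name]  # KeyError if the key is unset
    return kwargs

# Usage (assuming the imports from the examples above):
#   rag = KeovilRAG(data_dir="/path", llm_config=LLMConfig(**llm_config_kwargs()))
```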

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        KEOVIL INTERFACE                          │
│                 Flask Web UI (localhost:5000)                    │
│                 └── Settings: Switch LLM providers               │
└────────────────────────────────────┬─────────────────────────────┘
                                     │
             ┌───────────────────────┼───────────────────────┐
             ▼                       ▼                       ▼
  ┌─────────────────────┐ ┌─────────────────────┐ ┌─────────────────────┐
  │    KNOWLEDGE LAB    │ │   STRUCTURED DATA   │ │    CMS EXPLORER     │
  │   (Document Q&A)    │ │         LAB         │ │   (File Manager)    │
  │                     │ │    (SQL Queries)    │ │                     │
  └──────────┬──────────┘ └──────────┬──────────┘ └──────────┬──────────┘
             │                       │                       │
             ▼                       ▼                       │
  ┌─────────────────────┐ ┌─────────────────────┐            │
  │   COLBERT ENGINE    │ │   SQL QUERY AGENT   │            │
  │  (Qdrant + Torch)   │ │      (DuckDB)       │            │
  └──────────┬──────────┘ └──────────┬──────────┘            │
             │                       │                       │
             ▼                       ▼                       │
  ┌─────────────────────┐ ┌─────────────────────┐            │
  │       QDRANT        │ │       DUCKDB        │            │
  │   Vector Database   │ │  SQL Engine (GPU)   │            │
  └──────────┬──────────┘ └──────────┬──────────┘            │
             │                       │                       │
             ▼                       ▼                       ▼
┌──────────────────────────────────────────────────────────────────┐
│                          LLM PROVIDERS                           │
│  ┌────────┐ ┌────────┐ ┌───────────┐ ┌────────────┐ ┌────────┐   │
│  │ Ollama │ │ OpenAI │ │ Anthropic │ │ OpenRouter │ │ Gemini │   │
│  │ (Local)│ │ GPT-4  │ │ Claude-3  │ │  100+ LMs  │ │  2.0   │   │
│  └────────┘ └────────┘ └───────────┘ └────────────┘ └────────┘   │
│              Dynamic switching - no restart needed!              │
└──────────────────────────────────────────────────────────────────┘

Data Flow

DOCUMENTS                          STRUCTURED DATA (Web App Only)
─────────                          ──────────────────────────────

User drops files                   User drops files
     │                                    │
     ▼                                    ▼
DocumentProcessor                  FileWatcher
(Extract text)                     (Monitor folder)
     │                                    │
     ▼                                    ▼
IntelligentChunker                 Auto-Sync to
(Smart splitting)                  DuckDB Views
     │                                    │
     ▼                                    ▼
ColBERT Embeddings                 SQL Tables
(Torch GPU)                        (CSV/XLSX/etc)
     │                                    │
     ▼                                    ▼
Qdrant VectorDB                    DuckDB
(Index & store)                    (Execute SQL)
     │                                    │
     └──────────────┬─────────────────────┘
                    │
                    ▼
              ┌───────────┐
              │    LLM    │
              │(Any Prov) │
              └─────┬─────┘
                    │
                    ▼
               User Answer

Requirements

Hardware

With Cloud LLM (OpenAI, Anthropic, Gemini, OpenRouter)

Component   Minimum          Recommended
GPU VRAM    6GB (RTX 3060)   8GB+ (RTX 4060/4070)
RAM         16GB             32GB

Cloud LLMs don't use the local GPU; the GPU is only needed for Docling (document parsing) and ColBERT (embeddings). 6GB minimum is tested and working; 8GB is recommended.

With Local LLM (Ollama)

Component   Minimum   Recommended
GPU VRAM    10GB+     16GB+
RAM         16GB      32GB

Local Ollama, ColBERT, and Docling all need the GPU: 10GB minimum for 7B models plus embeddings; 16GB recommended for smooth operation.

No GPU (CPU Only)

Component   Minimum
RAM         16GB

Works, but extremely slowly: document ingestion takes minutes instead of seconds. Not recommended for production use.

Software

Dependency   Version                   Purpose
Python       3.12+                     Runtime
CUDA         12.4+ (12.8 for RTX 50)   GPU acceleration (if using GPU)
Ollama       Latest                    Local LLM (only if running models locally)
Qdrant       Optional                  Vector database (auto-embedded if not running)

No Docker needed! Qdrant runs automatically in embedded mode if no external server is available. Everything works out of the box.

No GPU? Works with cloud LLMs only. Document processing will be slower but functional.


Configuration

Environment Variables

Variable             Default             Description
KEOVIL_PROVIDER      ollama              LLM provider (ollama, openai, anthropic, openrouter, gemini)
KEOVIL_MODEL         provider-specific   Model name
OLLAMA_HOST          127.0.0.1:11434     Ollama server address
OPENAI_API_KEY       -                   OpenAI API key
ANTHROPIC_API_KEY    -                   Anthropic API key
OPENROUTER_API_KEY   -                   OpenRouter API key
GEMINI_API_KEY       -                   Google Gemini API key
QDRANT_HOST          localhost           Qdrant server address
STORAGE_BASE         ~/.keovil           Custom storage path
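The STORAGE_BASE behavior in the table above (override the default of ~/.keovil when set) can be sketched as a small resolution function; this is an illustration of the documented rule, not Keovil's actual internal code:

```python
import os
from pathlib import Path

# Sketch: resolve the storage root per the table above.
# STORAGE_BASE, when set, overrides the default of ~/.keovil.
def storage_base(env=os.environ):
    return Path(env.get("STORAGE_BASE", Path.home() / ".keovil"))
```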

Storage Locations

~/.keovil/                 ← Default storage for both SDK and web app
├── data/                  # Source files
├── database/              # SQLite manifest + chat history
├── config.json            # LLM provider settings
└── qdrant/                # Vector embeddings

Usage Modes

Web Application

# After pip install, run:
python -m keovil_web
  • Uses ~/.keovil for storage
  • Collection: keovil_app
  • Visit http://localhost:5000
  • Go to Settings to change LLM provider anytime

SDK (For Developers)

from keovil import KeovilRAG

rag = KeovilRAG(data_dir="/path/to/files")
rag.ingest(["file.pdf"])
answer = rag.query("Your question?")
  • Uses ~/.keovil for storage
  • Collection: keovil
  • Pass llm_config to use different providers

Custom Storage

# Override storage location
export STORAGE_BASE=/path/to/custom/storage
python -m keovil_web

Troubleshooting

Ollama Issues

# Check installation
ollama --version

# Check running
ollama serve

# Pull model
ollama pull qwen2.5-coder:7b-instruct

# List models
ollama list

Qdrant Issues

# Check health
curl http://localhost:6333/healthz

# Response should be: {"status":"ok"}
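The curl check above can also be done programmatically, e.g. in a startup script. A sketch, assuming the default Qdrant port 6333 and treating any HTTP 200 from /healthz as healthy; `qdrant_healthy` is a hypothetical helper, not part of Keovil:

```python
from urllib import request, error

# Programmatic version of the curl health check above (illustrative only).
def qdrant_healthy(host="localhost", port=6333, timeout=2.0):
    try:
        with request.urlopen(f"http://{host}:{port}/healthz",
                             timeout=timeout) as resp:
            return resp.status == 200  # healthy server answers 200
    except (error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, etc.
```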

GPU Issues

# Check NVIDIA driver
nvidia-smi

# Check PyTorch CUDA
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"

# Check VRAM
nvidia-smi --query-gpu=memory.used,memory.total --format=csv

License

GNU General Public License v3.0 โ€” see LICENSE for details.



Download files

Download the file for your platform.

Source Distribution

keovil-0.1.0.tar.gz (253.6 kB)

Uploaded Source

Built Distribution


keovil-0.1.0-py3-none-any.whl (259.9 kB)

Uploaded Python 3

File details

Details for the file keovil-0.1.0.tar.gz.

File metadata

  • Download URL: keovil-0.1.0.tar.gz
  • Upload date:
  • Size: 253.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for keovil-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        de04f680fbc0e69ac5312a9e1d32cf087c3faf5399f8833fd077fd8544c917fa
MD5           8e4814532bf412b717124d938535ee9f
BLAKE2b-256   45eded8075803c01f00c2248a6c653b480324d635dbd290f4658a6a56a8f409d


File details

Details for the file keovil-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: keovil-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 259.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.11

File hashes

Hashes for keovil-0.1.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        976fdfab9bb6ce236d537e21659c4ece6e0dcff1c976e00c0cf46c05c17c0a03
MD5           3383c6132499bea1473edaf7b075d2eb
BLAKE2b-256   f1f5a869ede9fa6675172807173fbfa59635e4a72a4bced2d685773a501f78de

