Keovil
Private Query Interface for Documents & Structured Data
Ask questions in plain English. Keovil queries your files.
Upload PDFs, text files, code, and more, or query CSV, Excel, and SQLite databases via the web app. Keovil generates the queries and returns the results. Document processing runs locally on your GPU, with flexible LLM options (cloud or local).
Installation
Option 1: Install from PyPI (Recommended)
pip install keovil
Option 2: Install from GitHub (Latest Development)
pip install git+https://github.com/kaiserkonok/Keovil.git
Option 3: Local Development Install
# Clone the repository
git clone https://github.com/kaiserkonok/Keovil.git
cd Keovil
# Install in development mode
pip install -e .
Quick Start
# 1. Install Keovil
pip install keovil
# 2. Run the web app (Qdrant runs automatically!)
python -m keovil_web
That's it! Qdrant runs in embedded mode automatically.
Optional: Run Qdrant Manually (For Better Performance)
If you want to run Qdrant externally (e.g., via Docker) for better performance:
# macOS
brew install qdrant && brew services start qdrant
# Linux (Docker)
docker run -d -p 6333:6333 -v qdrant_storage:/qdrant_storage qdrant/qdrant
Then set QDRANT_HOST=your-server if not on localhost.
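Before starting Keovil against an external server, you can verify it is reachable from Python using the same /healthz endpoint the Troubleshooting section probes with curl. This is a stdlib-only sketch; the `qdrant_healthy` helper is illustrative, not part of Keovil:

```python
import os
import urllib.request

def qdrant_healthy(host=None, port=6333, timeout=2.0):
    """Return True if Qdrant's /healthz endpoint answers with HTTP 200."""
    host = host or os.environ.get("QDRANT_HOST", "localhost")
    url = f"http://{host}:{port}/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, DNS failure, timeout, etc.
        return False
```

If this returns False, Keovil simply falls back to embedded mode, so the check is informational rather than required.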
For local LLM, also install Ollama:
curl -fsSL https://ollama.com/install.sh | sh
SDK (For Developers)
from keovil import KeovilRAG
from keovil.utils.llm_config import LLMConfig
# Initialize with default Ollama
rag = KeovilRAG(data_dir="/path/to/your/files")
# Or use a specific LLM provider
config = LLMConfig(provider="openai", model="gpt-4o", openai_api_key="sk-...")
rag = KeovilRAG(data_dir="/path/to/files", llm_config=config)
# Index your files (PDF, text, code, etc.)
rag.ingest(["document1.pdf", "notes.txt"])
# Ask questions in natural language
answer = rag.query("What is the recommended dosage for adults over 65?")
print(answer)
# For chat-like conversations
history = []
answer = rag.query("What are the API rate limits for the Pro plan?", history)
history.extend([("You", "What are the API rate limits for the Pro plan?"), ("AI", answer)])
answer = rag.query("Does the Enterprise plan include SSO?", history)
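The history pattern above generalizes to a small helper that threads each exchange back into the next call. This is a sketch assuming the same `rag.query(question, history)` signature shown above; the `chat_turns` helper itself is not part of the SDK:

```python
def chat_turns(rag, questions):
    """Run questions through rag.query in order, threading history.

    Assumes rag.query(question, history) as in the snippet above.
    Returns the list of answers. Illustrative helper only.
    """
    history = []
    answers = []
    for question in questions:
        answer = rag.query(question, history)
        # Record the exchange so follow-up questions have context.
        history.extend([("You", question), ("AI", answer)])
        answers.append(answer)
    return answers
```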
Linux Installation
# Option 1: Just Keovil (Qdrant auto-runs in embedded mode)
pip install keovil
python -m keovil_web
# Option 2: With local Ollama (if using local LLM)
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5-coder:7b-instruct
pip install keovil
python -m keovil_web
# Option 3: With external Qdrant (for better performance)
pip install keovil
curl -L https://github.com/qdrant/qdrant/releases/download/v1.7.4/qdrant-linux-amd64.tar.gz -o qdrant.tar.gz
tar -xzf qdrant.tar.gz
./qdrant &
python -m keovil_web
Features
| Feature | Description |
|---|---|
| Structured Data Analysis (Web App) | Query CSV, Excel, SQLite, and Parquet via natural language. Keovil generates and executes SQL via DuckDB. |
| Document Q&A | Ask questions about PDFs, text files, and code. Built on ColBERT retrieval with Qdrant. Docling parses documents locally. |
| Automatic Indexing | Drop files in a folder → Keovil syncs and indexes them automatically. |
| Total Privacy | Document processing runs locally. Cloud LLMs are optional; use local Ollama for full privacy. |
| Multi-LLM Support | Use Ollama (local), OpenAI, Anthropic, OpenRouter, or Gemini. Change anytime without restart. |
| Flexible Hardware | A full GPU is not required; cloud LLMs can be paired with local document processing. |
Supported LLM Providers
Web App
Change providers anytime via the Settings page; no restart needed!
| Provider | Description | API Key Required |
|---|---|---|
| Ollama | Local models running on your machine | No |
| OpenAI | GPT-4o, GPT-4o-mini, etc. | Yes |
| Anthropic | Claude 3.5 Sonnet, Haiku, etc. | Yes |
| OpenRouter | Access 100+ models via single API | Yes |
| Gemini | Google Gemini 2.0, 1.5 Pro, etc. | Yes |
SDK (For Developers)
from keovil import KeovilRAG
from keovil.utils.llm_config import LLMConfig
# Ollama (default)
rag = KeovilRAG(data_dir="/path", llm_config=LLMConfig(provider="ollama", model="qwen2.5-coder:7b"))
# OpenAI
config = LLMConfig(provider="openai", model="gpt-4o", openai_api_key="sk-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)
# Anthropic
config = LLMConfig(provider="anthropic", model="claude-3-5-sonnet-20241022", anthropic_api_key="sk-ant-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)
# OpenRouter
config = LLMConfig(provider="openrouter", model="openai/gpt-4o-mini", openrouter_api_key="sk-or-...")
rag = KeovilRAG(data_dir="/path", llm_config=config)
# Gemini
config = LLMConfig(provider="gemini", model="gemini-2.0-flash", gemini_api_key="AIza...")
rag = KeovilRAG(data_dir="/path", llm_config=config)
Architecture
+---------------------------------------------------------------+
|                       KEOVIL INTERFACE                        |
|                Flask Web UI (localhost:5000)                  |
|            Settings: Switch LLM providers                     |
+---------------------------------------------------------------+
          |                    |                    |
          v                    v                    v
+----------------+   +------------------+   +----------------+
| KNOWLEDGE LAB  |   | STRUCTURED DATA  |   | CMS EXPLORER   |
| (Document Q&A) |   | LAB (SQL Queries)|   | (File Manager) |
+----------------+   +------------------+   +----------------+
          |                    |
          v                    v
+----------------+   +------------------+
| COLBERT ENGINE |   | SQL QUERY AGENT  |
| (Qdrant+Torch) |   | (DuckDB)         |
+----------------+   +------------------+
          |                    |
          v                    v
+----------------+   +------------------+
| QDRANT         |   | DUCKDB           |
| Vector Database|   | SQL Engine       |
+----------------+   +------------------+
          |                    |
          +---------+----------+
                    v
+---------------------------------------------------------------+
|                        LLM PROVIDERS                          |
|   Ollama  |  OpenAI  |  Anthropic  |  OpenRouter  |  Gemini   |
|   (Local)    GPT-4      Claude-3      100+ LMs       Gemini   |
|           Dynamic switching - no restart needed!              |
+---------------------------------------------------------------+
Data Flow
DOCUMENTS                            STRUCTURED DATA (Web App Only)
---------                            ------------------------------
User drops files                     User drops files
        |                                    |
        v                                    v
DocumentProcessor                    FileWatcher
(Extract text)                       (Monitor folder)
        |                                    |
        v                                    v
IntelligentChunker                   Auto-Sync to DuckDB Views
(Smart splitting)                            |
        |                                    v
        v                            SQL Tables (CSV/XLSX/etc)
ColBERT Embeddings                           |
(Torch GPU)                                  v
        |                            DuckDB (Execute SQL)
        v                                    |
Qdrant VectorDB                              |
(Index & store)                              |
        |                                    |
        +----------------+-------------------+
                         v
                   +-----------+
                   |    LLM    |
                   |(Any Prov) |
                   +-----------+
                         |
                         v
                    User Answer
Requirements
Hardware
With Cloud LLM (OpenAI, Anthropic, Gemini, OpenRouter)
| Component | Minimum | Recommended |
|---|---|---|
| GPU VRAM | 6GB (RTX 3060) | 8GB+ (RTX 4060/4070) |
| RAM | 16GB | 32GB |
Cloud LLMs don't use the local GPU; the GPU is only needed for Docling (document parsing) and ColBERT (embeddings). 6GB minimum is tested and working; 8GB is recommended.
With Local LLM (Ollama)
| Component | Minimum | Recommended |
|---|---|---|
| GPU VRAM | 10GB+ | 16GB+ |
| RAM | 16GB | 32GB |
Local Ollama, ColBERT, and Docling all need the GPU: 10GB minimum for 7B models plus embeddings; 16GB recommended for smooth operation.
No GPU (CPU Only)
| Component | Minimum |
|---|---|
| RAM | 16GB |
Works, but extremely slowly: document ingestion takes minutes instead of seconds. Not recommended for production use.
Software
| Dependency | Version | Purpose |
|---|---|---|
| Python | 3.12+ | Runtime |
| CUDA | 12.4+ (12.8 for RTX 50) | GPU acceleration (if using GPU) |
| Ollama | Latest | Local LLM (only if using local) |
| Qdrant | Optional | Vector database (auto-embedded if not running) |
No Docker needed! Qdrant runs automatically in embedded mode if no external server is available. Everything works out of the box.
No GPU? Works with cloud LLMs only. Document processing will be slower but functional.
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| KEOVIL_PROVIDER | ollama | LLM provider (ollama, openai, anthropic, openrouter, gemini) |
| KEOVIL_MODEL | provider-specific | Model name |
| OLLAMA_HOST | 127.0.0.1:11434 | Ollama server address |
| OPENAI_API_KEY | - | OpenAI API key |
| ANTHROPIC_API_KEY | - | Anthropic API key |
| OPENROUTER_API_KEY | - | OpenRouter API key |
| GEMINI_API_KEY | - | Google Gemini API key |
| QDRANT_HOST | localhost | Qdrant server address |
| STORAGE_BASE | ~/.keovil | Custom storage path |
Storage Locations
~/.keovil/            # Default storage for both SDK and web app
├── data/             # Source files
├── database/         # SQLite manifest + chat history
├── config.json       # LLM provider settings
└── qdrant/           # Vector embeddings
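Given the layout above, a quick sanity check of what Keovil has on disk can be scripted with the documented paths only. The `storage_report` helper below is illustrative, not a Keovil function:

```python
import os
from pathlib import Path

def storage_root(env=os.environ):
    """STORAGE_BASE if set, else the documented default ~/.keovil."""
    base = env.get("STORAGE_BASE")
    return Path(base) if base else Path.home() / ".keovil"

def storage_report(root=None):
    """Map each documented entry to whether it exists under the root."""
    root = Path(root) if root else storage_root()
    expected = ["data", "database", "config.json", "qdrant"]
    return {name: (root / name).exists() for name in expected}
```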
Usage Modes
Web Application
# After pip install, run:
python -m keovil_web
- Uses ~/.keovil for storage
- Collection: keovil_app
- Visit http://localhost:5000
- Go to Settings to change LLM provider anytime
SDK (For Developers)
from keovil import KeovilRAG
rag = KeovilRAG(data_dir="/path/to/files")
rag.ingest(["file.pdf"])
answer = rag.query("Your question?")
- Uses ~/.keovil for storage
- Collection: keovil
- Pass llm_config to use different providers
Custom Storage
# Override storage location
export STORAGE_BASE=/path/to/custom/storage
python -m keovil_web
Troubleshooting
Ollama Issues
# Check installation
ollama --version
# Check running
ollama serve
# Pull model
ollama pull qwen2.5-coder:7b-instruct
# List models
ollama list
Qdrant Issues
# Check health
curl http://localhost:6333/healthz
# Response should be: {"status":"ok"}
GPU Issues
# Check NVIDIA driver
nvidia-smi
# Check PyTorch CUDA
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
# Check VRAM
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
License
GNU General Public License v3.0; see LICENSE for details.