RAG Box
A powerful command-line tool for querying documents using RAG (Retrieval Augmented Generation). Built with LlamaIndex, with support for both OpenAI and Ollama models.
Features
- Dual Provider Support: Use OpenAI (cloud) or Ollama (local) for LLM and embeddings
- Smart Document Indexing: Automatically indexes documents with configurable embeddings
- Interactive & Direct Modes: Single questions or continuous conversations
- Streaming Responses: Real-time token streaming for faster perceived responses
- Persistent Vector Store: Reuses embeddings for instant subsequent queries
- Auto-Rebuild Detection: Automatically rebuilds the index when the embedding config changes
- Source Attribution: Shows which documents contributed to each answer
- Flexible Configuration: JSON config files, environment variables, or CLI args
- Beautiful Output: Formatted boxes with proper text wrapping
Installation
From PyPI
pip install ragbox
From Source
git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e .
Quick Start
Using OpenAI (Recommended for Best Quality)
# Set your API key
export OPENAI_API_KEY='sk-...'
# Ask a question
ragbox "What is this project about?"
Using Ollama (Local, No API Key Needed)
Prerequisites: Ollama server must be running with models pulled
# 1. Install and start Ollama server
# Download from https://ollama.ai
ollama serve
# 2. Pull models (required before using)
# Embedding model (choose one):
ollama pull embeddinggemma # Recommended for embeddings
# LLM model (choose any model you prefer):
ollama pull granite4:micro # Fast, 500MB
ollama pull llama3.2:3b # More capable, 2GB
ollama pull qwen2.5:7b # High quality, 4.7GB
ollama pull mistral:latest # Or any other model you prefer
# 3. Create and configure
ragbox --init
# 4. Edit .rag_config.json to use ollama provider
# Set "llm_model" to any model you've pulled (e.g., "llama3.2:3b")
# 5. Ask questions
ragbox "What is this project about?"
Note: Replace granite4:micro with any Ollama model you've pulled. See available models at https://ollama.ai/library
Usage
Basic Commands
# Ask a single question
ragbox "What is the main purpose of this codebase?"
# Interactive mode
ragbox
# Specify documents directory
ragbox -d /path/to/docs "Summarize the key features"
# Force rebuild index
ragbox --rebuild
# List indexed files
ragbox --list-files
# Verbose output (show timing and config)
ragbox --verbose "question"
# Plain text output (easy to copy)
ragbox --format copy "question"
Command-Line Options
usage: ragbox [-h] [-d DOCS_DIR] [-s STORAGE_DIR] [-m MODEL] [--rebuild]
[--list-files] [-v] [--format {box,copy}] [--init] [question]
positional arguments:
question Question to ask (if not provided, enters interactive mode)
options:
-h, --help show this help message and exit
-d DOCS_DIR, --docs-dir DOCS_DIR
Directory containing documents (default: current directory)
-s STORAGE_DIR, --storage-dir STORAGE_DIR
Directory for storing index (default: .storage)
-m MODEL, --model MODEL
LLM model to use
--rebuild Force rebuild of index
--list-files List indexed files and exit
-v, --verbose Show detailed initialization logs
--format {box,copy} Output format: 'box' (default) or 'copy' (plain text)
--init Create a default .rag_config.json file
Configuration
Configuration File
Create a .rag_config.json file in your project directory:
ragbox --init
Example configuration:
{
"embedding_model": "text-embedding-3-small",
"embedding_provider": "openai",
"embedding_base_url": "http://localhost:11434",
"embedding_dimensions": 1536,
"llm_model": "gpt-4o-mini",
"llm_provider": "openai",
"request_timeout": 360,
"context_window": 32000,
"chat_mode": "context",
"streaming": true,
"system_prompt": "You are a helpful assistant that analyzes documents and answers questions based on the provided context. Always cite relevant information from the documents."
}
Provider Options
OpenAI (Recommended)
{
"embedding_provider": "openai",
"embedding_model": "text-embedding-3-small",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini"
}
Ollama (Local)
{
"embedding_provider": "ollama",
"embedding_model": "embeddinggemma",
"llm_provider": "ollama",
"llm_model": "granite4:micro"
}
Note: You can use any Ollama model for llm_model - just replace granite4:micro with any model you've pulled (e.g., llama3.2:3b, mistral:latest, qwen2.5:7b, etc.). See all available models at https://ollama.ai/library
Mix and Match Providers
You can use different providers for embeddings and LLM:
{
"embedding_provider": "ollama",
"embedding_model": "embeddinggemma",
"llm_provider": "openai",
"llm_model": "gpt-4o-mini"
}
Switching Between Providers
To switch from Ollama to OpenAI (or vice versa):
1. Edit .rag_config.json and update the provider settings:

   {
     "embedding_provider": "openai",
     "embedding_model": "text-embedding-3-small",
     "llm_provider": "openai",
     "llm_model": "gpt-4o-mini"
   }

2. Set your API key (for OpenAI):

   export OPENAI_API_KEY='sk-...'

3. Rebuild the index (required when changing the embedding provider):

   ragbox --rebuild
Note: When you change the embedding provider or model, the index will automatically rebuild on the next run.
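One way such change detection can work is to store a fingerprint of the embedding settings next to the index and compare it on startup. This is a sketch of the general technique, not necessarily how ragbox implements it; the helper names are hypothetical:

```python
import hashlib
import json

def embedding_fingerprint(cfg: dict) -> str:
    """Hash only the settings that affect stored vectors.

    Changing the LLM model should NOT force a rebuild;
    changing the embedding provider/model should.
    """
    relevant = {k: cfg.get(k) for k in
                ("embedding_provider", "embedding_model", "embedding_dimensions")}
    payload = json.dumps(relevant, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def needs_rebuild(cfg: dict, stored_fingerprint: "str | None") -> bool:
    """True when no fingerprint is stored yet or the embedding config changed."""
    return stored_fingerprint != embedding_fingerprint(cfg)
```

Keying the hash only on embedding-related fields is what lets you swap LLM models freely without re-embedding your documents.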
Environment Variables
# Required for OpenAI
export OPENAI_API_KEY='sk-...'
# Optional overrides
export RAG_EMBEDDING_MODEL="text-embedding-3-small"
export RAG_EMBEDDING_PROVIDER="openai"
export RAG_LLM_MODEL="gpt-4o-mini"
export RAG_LLM_PROVIDER="openai"
export RAG_CHAT_MODE="context"
export RAG_STREAMING="true"
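A minimal sketch of how these RAG_* environment variables can override a JSON config file, assuming env vars take precedence over the file (the `load_config` helper is hypothetical; ragbox's actual merge logic may differ):

```python
import json
import os

# Illustrative defaults; not ragbox's authoritative default set
DEFAULTS = {
    "embedding_provider": "openai",
    "llm_model": "gpt-4o-mini",
    "streaming": True,
}

def load_config(path=".rag_config.json"):
    """Merge: defaults <- config file <- RAG_* environment variables."""
    cfg = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as f:
            cfg.update(json.load(f))
    for key in list(cfg):
        env_val = os.environ.get("RAG_" + key.upper())
        if env_val is not None:
            # Booleans arrive as strings like "true"/"false"
            cfg[key] = (env_val.lower() == "true") if isinstance(cfg[key], bool) else env_val
    return cfg
```

For example, with `RAG_LLM_MODEL="llama3.2:3b"` exported, the loaded config would report that model even if the file says otherwise.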
How It Works
- First Run: Loads documents → Creates embeddings → Saves vector index
- Subsequent Runs: Loads existing index (instant startup)
- Config Change Detection: Automatically rebuilds if the embedding config changes
- Query Processing:
  - Embeds your question
  - Retrieves relevant document chunks
  - Sends context + question to LLM
  - Streams back the answer with sources
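The retrieval step above boils down to nearest-neighbor search over embedding vectors, which LlamaIndex handles internally. As a toy illustration with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "index": chunk text -> pretend embedding vector
INDEX = {
    "auth flow uses JWT tokens":  [0.9, 0.1, 0.2],
    "database is PostgreSQL 15":  [0.1, 0.8, 0.3],
    "CI runs on GitHub Actions":  [0.2, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the k chunks whose vectors point closest to the query vector."""
    ranked = sorted(INDEX, key=lambda chunk: cosine_similarity(query_vec, INDEX[chunk]),
                    reverse=True)
    return ranked[:k]
```

A question about authentication would embed to a vector near the first chunk's, so that chunk (not the database or CI chunks) gets sent to the LLM as context.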
Supported File Types
- Text files: .txt, .md, .rst
- Code files: .py, .js, .java, .cpp, .go, .ts, .html, .css, .sh, etc.
- Documents: .pdf, .docx, .epub, .ppt, .pptx, .pptm
- Data files: .csv, .json, .yaml, .xml
- Notebooks: .ipynb (Jupyter Notebooks)
- Images: .png, .jpg, .jpeg (with OCR/vision capabilities)
- Media: .mp3, .mp4 (audio/video transcription)
- Email: .mbox (email archives)
- Other: .hwp (Hangul Word Processor)
All files are processed via LlamaIndex's SimpleDirectoryReader, which automatically detects file types and uses appropriate parsers.
Auto-Excluded
.storage, .git, .venv, venv, node_modules, __pycache__, .pytest_cache, .mypy_cache, *.log, *.pyc
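A sketch of how such an exclusion filter might be applied while walking the docs directory; the patterns mirror the list above, though ragbox's actual traversal (via SimpleDirectoryReader) may differ:

```python
import fnmatch
import os

EXCLUDED_DIRS = {".storage", ".git", ".venv", "venv", "node_modules",
                 "__pycache__", ".pytest_cache", ".mypy_cache"}
EXCLUDED_PATTERNS = ["*.log", "*.pyc"]

def iter_documents(root):
    """Yield file paths under root, skipping excluded dirs and file patterns."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            if not any(fnmatch.fnmatch(name, p) for p in EXCLUDED_PATTERNS):
                yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place is the idiomatic way to stop `os.walk` from ever entering a directory, which matters when `node_modules` or `.venv` contain tens of thousands of files.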
Examples
Analyze a Codebase
cd my-project
ragbox "Explain the authentication flow"
ragbox "Are there any security issues?"
ragbox "How is the database configured?"
Query CCTV/Security Logs
Perfect for analyzing large log files quickly:
# Point to your logs directory
cd /var/log/security
ragbox "Show me all failed login attempts from yesterday"
ragbox "Were there any suspicious access patterns?"
ragbox "Summarize the security events from IP 192.168.1.100"
# Or specify the directory
ragbox -d /var/log/cctv "When did motion detection trigger last night?"
ragbox -d /var/log/cctv "List all events between 10pm and 6am"
How it works: ragbox indexes all log files, allowing you to ask natural language questions instead of manually searching through thousands of lines. The AI retrieves relevant log entries and provides contextual answers.
Research Papers
ragbox -d ~/papers "Compare the methodologies"
ragbox -d ~/papers "What are the main findings?"
Documentation
ragbox -d ./docs "How do I set up the project?"
ragbox -d ./docs --rebuild # After updating docs
Interactive Session
$ ragbox
────────────────────────────────────────────────────────────
RAG Box - Interactive Mode
────────────────────────────────────────────────────────────
Documents: /home/user/docs
Indexed files: 42
Model: gpt-4o-mini
Commands:
  /exit, /quit - Exit the program
  /files - List indexed files
────────────────────────────────────────────────────────────
❯ What are the main topics covered?
[Answer with sources...]
❯ Tell me more about topic X
[Continued conversation with context...]
❯ /exit
Goodbye!
Troubleshooting
OpenAI API Key Issues
# Verify key is set
echo $OPENAI_API_KEY
# Set the key
export OPENAI_API_KEY='sk-...'
Ollama Connection Issues
Important: Make sure Ollama server is running and required models are pulled!
# Start Ollama server (required!)
ollama serve
# Check what's running
ollama ps
# Pull required models if not already pulled
ollama pull embeddinggemma # For embeddings
ollama pull granite4:micro # For LLM (or any other model)
# Verify models are available
ollama list
# Keep models loaded in memory for faster response
ollama run granite4:micro
# Press Ctrl+D to exit but keep loaded
Common issues:
- "Connection refused": Ollama server not running → Run ollama serve
- "Model not found": Models not pulled → Run ollama pull <model-name>
- Slow responses: Models loading from disk → Keep them loaded with ollama run <model>
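For scripting, you can check whether anything is listening on Ollama's default port (11434) with a plain TCP probe. This is a generic connectivity check, not a ragbox feature:

```python
import socket

def ollama_reachable(host="localhost", port=11434, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result here corresponds to the "Connection refused" case above: start the server with `ollama serve` and re-check.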
Index/Embedding Mismatch
Don't worry! The tool automatically detects config changes and rebuilds:
# Or force rebuild manually
ragbox --rebuild
No Documents Found
# Check current directory
ls -la
# Specify directory explicitly
ragbox -d /path/to/docs "question"
Performance Tips
- OpenAI: Faster embeddings, better quality, requires API key
- Ollama: Free, local, but slower (keep models loaded:
ollama run model) - Index Reuse: First run is slow (embedding creation), subsequent runs are instant
- Streaming: Enabled by default for faster perceived response
- Model Choice: Smaller models (granite4:micro) faster, larger models (gpt-4o) better quality
Development
git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e ".[dev]"
Roadmap
- OpenAI and Ollama support
- Streaming responses
- Auto-rebuild on config change
- Source attribution
- Interactive mode
- Configurable output format
- Image document support with OCR (planned for next update)
- Web search integration
- Conversation history export
- Query caching
License
MIT License
Acknowledgments
- LlamaIndex - RAG framework
- OpenAI - Cloud embeddings and LLMs
- Ollama - Local LLM runtime
Author: Pray Apostel Simanjuntak
If you find this project helpful, please consider giving it a ⭐ on GitHub!