A command-line RAG tool for querying documents using LlamaIndex with OpenAI and Ollama support

Project description

RAG Box

A powerful command-line tool for querying documents using RAG (Retrieval Augmented Generation). Built with LlamaIndex and supports both OpenAI and Ollama models.

Features

  • 🚀 Dual Provider Support: Use OpenAI (cloud) or Ollama (local) for LLM and embeddings
  • 📚 Smart Document Indexing: Automatically indexes documents with configurable embeddings
  • 💬 Interactive & Direct Modes: Single questions or continuous conversations
  • 🔄 Streaming Responses: Real-time token streaming for faster perceived responses
  • 💾 Persistent Vector Store: Reuses embeddings for instant subsequent queries
  • 🎯 Auto-Rebuild Detection: Automatically rebuilds index when embedding config changes
  • 📊 Source Attribution: Shows which documents contributed to each answer
  • ⚙️ Flexible Configuration: JSON config files, environment variables, or CLI args
  • 🎨 Beautiful Output: Formatted boxes with proper text wrapping
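
The boxed output can be approximated with nothing but the standard library. This is a hypothetical sketch, not ragbox's actual renderer (`render_box` and its width handling are invented for illustration):

```python
import textwrap

def render_box(text: str, width: int = 60) -> str:
    """Wrap text and frame it in a simple ASCII box (illustrative only)."""
    inner = width - 4  # room for "| " on the left and " |" on the right
    lines = textwrap.wrap(text, inner)
    border = "+" + "-" * (width - 2) + "+"
    body = ["| " + line.ljust(inner) + " |" for line in lines]
    return "\n".join([border, *body, border])

print(render_box("RAG Box wraps long answers so they stay readable in a terminal."))
```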

Installation

From PyPI

pip install ragbox

From Source

git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e .

Quick Start

Using OpenAI (Recommended for Best Quality)

# Set your API key
export OPENAI_API_KEY='sk-...'

# Ask a question
ragbox "What is this project about?"

Using Ollama (Local, No API Key Needed)

Prerequisites: the Ollama server must be running and the required models pulled

# 1. Install and start Ollama server
# Download from https://ollama.ai
ollama serve

# 2. Pull models (required before using)
# Embedding model (choose one):
ollama pull embeddinggemma          # Recommended for embeddings

# LLM model (choose any model you prefer):
ollama pull granite4:micro          # Fast, 500MB
ollama pull llama3.2:3b            # More capable, 2GB
ollama pull qwen2.5:7b             # High quality, 4.7GB
ollama pull mistral:latest         # Or any other model you prefer

# 3. Create and configure
ragbox --init

# 4. Edit .rag_config.json to use ollama provider
# Set "llm_model" to any model you've pulled (e.g., "llama3.2:3b")
# 5. Ask questions
ragbox "What is this project about?"

Note: Replace granite4:micro with any Ollama model you've pulled. See available models at https://ollama.ai/library

Usage

Basic Commands

# Ask a single question
ragbox "What is the main purpose of this codebase?"

# Interactive mode
ragbox

# Specify documents directory
ragbox -d /path/to/docs "Summarize the key features"

# Force rebuild index
ragbox --rebuild

# List indexed files
ragbox --list-files

# Verbose output (show timing and config)
ragbox --verbose "question"

# Plain text output (easy to copy)
ragbox --format copy "question"

Command-Line Options

usage: ragbox [-h] [-d DOCS_DIR] [-s STORAGE_DIR] [-m MODEL] [--rebuild]
              [--list-files] [-v] [--format {box,copy}] [--init] [question]

positional arguments:
  question              Question to ask (if not provided, enters interactive mode)

options:
  -h, --help            show this help message and exit
  -d DOCS_DIR, --docs-dir DOCS_DIR
                        Directory containing documents (default: current directory)
  -s STORAGE_DIR, --storage-dir STORAGE_DIR
                        Directory for storing index (default: .storage)
  -m MODEL, --model MODEL
                        LLM model to use
  --rebuild             Force rebuild of index
  --list-files          List indexed files and exit
  -v, --verbose         Show detailed initialization logs
  --format {box,copy}   Output format: 'box' (default) or 'copy' (plain text)
  --init                Create a default .rag_config.json file

Configuration

Configuration File

Create a .rag_config.json file in your project directory:

ragbox --init

Example configuration:

{
  "embedding_model": "text-embedding-3-small",
  "embedding_provider": "openai",
  "embedding_base_url": "http://localhost:11434",
  "embedding_dimensions": 1536,
  "llm_model": "gpt-4o-mini",
  "llm_provider": "openai",
  "request_timeout": 360,
  "context_window": 32000,
  "chat_mode": "context",
  "streaming": true,
  "system_prompt": "You are a helpful assistant that analyzes documents and answers questions based on the provided context. Always cite relevant information from the documents."
}
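
Loading such a file is straightforward; here is a minimal sketch of a loader that merges the JSON over built-in defaults. The `DEFAULTS` values are guesses based on the example above, and `load_config` is not the tool's actual function:

```python
import json
from pathlib import Path

# Assumed defaults, inferred from the example config; the real tool may differ.
DEFAULTS = {
    "embedding_provider": "openai",
    "embedding_model": "text-embedding-3-small",
    "llm_provider": "openai",
    "llm_model": "gpt-4o-mini",
    "streaming": True,
}

def load_config(path: str = ".rag_config.json") -> dict:
    """Merge an optional JSON config file over built-in defaults."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg
```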

Provider Options

OpenAI (Recommended)

{
  "embedding_provider": "openai",
  "embedding_model": "text-embedding-3-small",
  "llm_provider": "openai",
  "llm_model": "gpt-4o-mini"
}

Ollama (Local)

{
  "embedding_provider": "ollama",
  "embedding_model": "embeddinggemma",
  "llm_provider": "ollama",
  "llm_model": "granite4:micro"
}

Note: You can use any Ollama model for llm_model - just replace granite4:micro with any model you've pulled (e.g., llama3.2:3b, mistral:latest, qwen2.5:7b, etc.). See all available models at https://ollama.ai/library

Mix and Match Providers

You can use different providers for embeddings and LLM:

{
  "embedding_provider": "ollama",
  "embedding_model": "embeddinggemma",
  "llm_provider": "openai",
  "llm_model": "gpt-4o-mini"
}

Switching Between Providers

To switch from Ollama to OpenAI (or vice versa):

  1. Edit .rag_config.json and update the provider settings:

    {
      "embedding_provider": "openai",
      "embedding_model": "text-embedding-3-small",
      "llm_provider": "openai",
      "llm_model": "gpt-4o-mini"
    }
    
  2. Set your API key (for OpenAI):

    export OPENAI_API_KEY='sk-...'
    
  3. Rebuild the index (required when changing embedding provider):

    ragbox --rebuild
    

Note: When you change the embedding provider or model, the index will automatically rebuild on the next run.
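
One plausible way to implement this detection (a sketch, not ragbox's actual code) is to fingerprint the embedding-related settings and compare against the fingerprint stored alongside the index:

```python
import hashlib
import json

# Hypothetical: the set of keys whose change invalidates stored embeddings.
EMBEDDING_KEYS = ("embedding_provider", "embedding_model", "embedding_dimensions")

def embedding_fingerprint(config: dict) -> str:
    """Hash only the settings that would invalidate stored embeddings."""
    relevant = {k: config.get(k) for k in EMBEDDING_KEYS}
    return hashlib.sha256(json.dumps(relevant, sort_keys=True).encode()).hexdigest()

def needs_rebuild(config, stored_fingerprint):
    """Rebuild if no fingerprint was saved or the embedding config changed."""
    return stored_fingerprint != embedding_fingerprint(config)
```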

Environment Variables

# Required for OpenAI
export OPENAI_API_KEY='sk-...'

# Optional overrides
export RAG_EMBEDDING_MODEL="text-embedding-3-small"
export RAG_EMBEDDING_PROVIDER="openai"
export RAG_LLM_MODEL="gpt-4o-mini"
export RAG_LLM_PROVIDER="openai"
export RAG_CHAT_MODE="context"
export RAG_STREAMING="true"
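
A minimal sketch of how such layered configuration is typically resolved, assuming a precedence of CLI argument > `RAG_*` environment variable > config file > default (the `resolve` helper is hypothetical, not part of ragbox):

```python
import os

def resolve(key: str, cli_value=None, config=None, default=None):
    """Resolve one setting: CLI arg > RAG_* env var > config file > default."""
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get("RAG_" + key.upper())
    if env_value is not None:
        return env_value
    if config and key in config:
        return config[key]
    return default

# e.g. resolve("llm_model", config={"llm_model": "gpt-4o-mini"})
```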

How It Works

  1. First Run: Loads documents → Creates embeddings → Saves vector index
  2. Subsequent Runs: Loads existing index (instant startup)
  3. Config Change Detection: Automatically rebuilds if embedding config changes
  4. Query Processing:
    • Embeds your question
    • Retrieves relevant document chunks
    • Sends context + question to LLM
    • Streams back the answer with sources
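
The retrieval step above can be illustrated with a toy, dependency-free example. ragbox itself uses LlamaIndex and a real embedding model; the 3-d vectors below are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(question_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the question."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(question_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-d "embeddings"; a real system embeds text with the configured model.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(top_k([1.0, 0.05, 0.0], chunks, k=2))  # → [0, 1]
```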

Supported File Types

  • Text files: .txt, .md, .rst
  • Code files: .py, .js, .java, .cpp, .go, .ts, .html, .css, .sh, etc.
  • Documents: .pdf, .docx, .epub, .ppt, .pptx, .pptm
  • Data files: .csv, .json, .yaml, .xml
  • Notebooks: .ipynb (Jupyter Notebooks)
  • Images: .png, .jpg, .jpeg (with OCR/vision capabilities)
  • Media: .mp3, .mp4 (audio/video transcription)
  • Email: .mbox (email archives)
  • Other: .hwp (Hangul Word Processor)

All files are processed via LlamaIndex's SimpleDirectoryReader, which automatically detects file types and uses appropriate parsers.

Auto-Excluded

  • .storage, .git, .venv, venv, node_modules
  • __pycache__, .pytest_cache, .mypy_cache
  • *.log, *.pyc
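
A sketch of how such an exclusion filter might look, using the lists above (hypothetical; ragbox's actual matching logic may differ):

```python
from fnmatch import fnmatch
from pathlib import PurePath

EXCLUDED_DIRS = {".storage", ".git", ".venv", "venv", "node_modules",
                 "__pycache__", ".pytest_cache", ".mypy_cache"}
EXCLUDED_GLOBS = ("*.log", "*.pyc")

def is_excluded(path: str) -> bool:
    """Skip a file if any parent directory or its name matches the lists."""
    parts = PurePath(path).parts
    if any(part in EXCLUDED_DIRS for part in parts):
        return True
    return any(fnmatch(parts[-1], pat) for pat in EXCLUDED_GLOBS)
```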

Examples

Analyze a Codebase

cd my-project
ragbox "Explain the authentication flow"
ragbox "Are there any security issues?"
ragbox "How is the database configured?"

Query CCTV/Security Logs

Perfect for analyzing large log files quickly:

# Point to your logs directory
cd /var/log/security
ragbox "Show me all failed login attempts from yesterday"
ragbox "Were there any suspicious access patterns?"
ragbox "Summarize the security events from IP 192.168.1.100"

# Or specify the directory
ragbox -d /var/log/cctv "When did motion detection trigger last night?"
ragbox -d /var/log/cctv "List all events between 10pm and 6am"

How it works: ragbox indexes all log files, allowing you to ask natural language questions instead of manually searching through thousands of lines. The AI retrieves relevant log entries and provides contextual answers.

Research Papers

ragbox -d ~/papers "Compare the methodologies"
ragbox -d ~/papers "What are the main findings?"

Documentation

ragbox -d ./docs "How do I set up the project?"
ragbox -d ./docs --rebuild  # After updating docs

Interactive Session

$ ragbox

โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
๐Ÿ’ฌ RAG Box - Interactive Mode
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
๐Ÿ“ Documents: /home/user/docs
๐Ÿ“š Indexed files: 42
๐Ÿค– Model: gpt-4o-mini

Commands:
  /exit, /quit - Exit the program
  /files - List indexed files
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”

โฏ What are the main topics covered?
[Answer with sources...]

โฏ Tell me more about topic X
[Continued conversation with context...]

โฏ /exit
๐Ÿ‘‹ Goodbye!
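
The command handling in this loop can be sketched as a small dispatcher (illustrative only; not ragbox's implementation):

```python
def handle_command(line: str) -> str:
    """Map an interactive input line to an action (illustrative only)."""
    cmd = line.strip().lower()
    if cmd in ("/exit", "/quit"):
        return "exit"
    if cmd == "/files":
        return "list_files"
    return "query"  # anything else is sent to the RAG pipeline
```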

Publishing to PyPI

Prerequisites

pip install build twine

Build and Upload

# 1. Update version in pyproject.toml
# 2. Build the package
python -m build

# 3. Upload to TestPyPI (optional, for testing)
twine upload --repository testpypi dist/*

# 4. Upload to PyPI
twine upload dist/*

Test Installation

# From TestPyPI
pip install --index-url https://test.pypi.org/simple/ ragbox

# From PyPI
pip install ragbox

Troubleshooting

OpenAI API Key Issues

# Verify key is set
echo $OPENAI_API_KEY

# Set the key
export OPENAI_API_KEY='sk-...'

Ollama Connection Issues

Important: Make sure Ollama server is running and required models are pulled!

# Start Ollama server (required!)
ollama serve

# Check what's running
ollama ps

# Pull required models if not already pulled
ollama pull embeddinggemma    # For embeddings
ollama pull granite4:micro    # For LLM (or any other model)

# Verify models are available
ollama list

# Keep models loaded in memory for faster response
ollama run granite4:micro
# Press Ctrl+D to exit while keeping the model loaded

Common issues:

  • "Connection refused": Ollama server not running → Run ollama serve
  • "Model not found": Models not pulled → Run ollama pull <model-name>
  • Slow responses: Models loading from disk → Keep them loaded with ollama run <model>

Index/Embedding Mismatch

Don't worry! The tool automatically detects config changes and rebuilds:

# Or force rebuild manually
ragbox --rebuild

No Documents Found

# Check current directory
ls -la

# Specify directory explicitly
ragbox -d /path/to/docs "question"

Performance Tips

  1. OpenAI: Faster embeddings, better quality, requires API key
  2. Ollama: Free, local, but slower (keep models loaded: ollama run model)
  3. Index Reuse: First run is slow (embedding creation), subsequent runs are instant
  4. Streaming: Enabled by default for faster perceived response
  5. Model Choice: Smaller models (e.g., granite4:micro) are faster; larger models (e.g., gpt-4o) give better quality

Development

git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e ".[dev]"

Roadmap

  • OpenAI and Ollama support
  • Streaming responses
  • Auto-rebuild on config change
  • Source attribution
  • Interactive mode
  • Configurable output format
  • Image document support with OCR (planned for next update)
  • Web search integration
  • Conversation history export
  • Query caching

License

MIT License

Acknowledgments


Author: Pray Apostel Simanjuntak

If you find this project helpful, please consider giving it a ⭐ on GitHub!


Download files

Download the file for your platform.

Source Distribution

ragbox-0.1.0.tar.gz (20.9 kB)

Uploaded Source

Built Distribution


ragbox-0.1.0-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file ragbox-0.1.0.tar.gz.

File metadata

  • Download URL: ragbox-0.1.0.tar.gz
  • Upload date:
  • Size: 20.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for ragbox-0.1.0.tar.gz:

  • SHA256: 598a8abb7624e3b54875d17d82dd9f4dc28e3e3bd57140bfc5fd11d9010f1fa0
  • MD5: dedea24a69a2e27d249d8ecbd809115d
  • BLAKE2b-256: 7ce1f8f9eca2510351a533cb85a6bf7bc571e3c933be6f07de21d967945d24fa


File details

Details for the file ragbox-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ragbox-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for ragbox-0.1.0-py3-none-any.whl:

  • SHA256: ca125485dae30bf44826e83e630ac34db034819639341f83d6370851f5cd5fe4
  • MD5: 43bd310ade82617c03dcf3bb1257def1
  • BLAKE2b-256: 141846ec9281bf03b3920147d2fb8ebff2172b83001c2378dcaaaeffb6d7f200

