A command-line RAG tool for querying documents using LlamaIndex with OpenAI and Ollama support

Project description

RAG Box

A powerful command-line tool for querying documents using RAG (Retrieval Augmented Generation). Built with LlamaIndex and supports both OpenAI and Ollama models.

Features

  • 🚀 Dual Provider Support: Use OpenAI (cloud) or Ollama (local) for LLM and embeddings
  • 📚 Smart Document Indexing: Automatically indexes documents with configurable embeddings
  • 💬 Interactive & Direct Modes: Single questions or continuous conversations
  • 🔄 Streaming Responses: Real-time token streaming for faster perceived responses
  • 💾 Persistent Vector Store: Reuses embeddings for instant subsequent queries
  • 🎯 Auto-Rebuild Detection: Automatically rebuilds index when embedding config changes
  • 📊 Source Attribution: Shows which documents contributed to each answer
  • ⚙️ Flexible Configuration: JSON config files, environment variables, or CLI args
  • 🎨 Beautiful Output: Formatted boxes with proper text wrapping
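
The boxed output can be approximated with nothing but the standard library. This is a hypothetical sketch, not ragbox's actual renderer (`render_box` and its width handling are invented for illustration):

```python
import textwrap

def render_box(text: str, width: int = 60) -> str:
    """Wrap text and frame it in a simple ASCII box (illustrative only)."""
    inner = width - 4  # room for "| " on the left and " |" on the right
    lines = textwrap.wrap(text, inner)
    border = "+" + "-" * (width - 2) + "+"
    body = ["| " + line.ljust(inner) + " |" for line in lines]
    return "\n".join([border, *body, border])

print(render_box("RAG Box wraps long answers so they stay readable in a terminal."))
```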

Installation

From PyPI

pip install ragbox

From Source

git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e .

Quick Start

Using OpenAI (Recommended for Best Quality)

# Set your API key
export OPENAI_API_KEY='sk-...'

# Ask a question
ragbox "What is this project about?"

Using Ollama (Local, No API Key Needed)

Prerequisites: the Ollama server must be running and the required models pulled

# 1. Install and start Ollama server
# Download from https://ollama.ai
ollama serve

# 2. Pull models (required before using)
# Embedding model (choose one):
ollama pull embeddinggemma          # Recommended for embeddings

# LLM model (choose any model you prefer):
ollama pull granite4:micro          # Fast, 500MB
ollama pull llama3.2:3b            # More capable, 2GB
ollama pull qwen2.5:7b             # High quality, 4.7GB
ollama pull mistral:latest         # Or any other model you prefer

# 3. Create and configure
ragbox --init

# 4. Edit .rag_config.json to use ollama provider
# Set "llm_model" to any model you've pulled (e.g., "llama3.2:3b")
# 5. Ask questions
ragbox "What is this project about?"

Note: Replace granite4:micro with any Ollama model you've pulled. See available models at https://ollama.ai/library

Usage

Basic Commands

# Ask a single question
ragbox "What is the main purpose of this codebase?"

# Interactive mode
ragbox

# Specify documents directory
ragbox -d /path/to/docs "Summarize the key features"

# Force rebuild index
ragbox --rebuild

# List indexed files
ragbox --list-files

# Verbose output (show timing and config)
ragbox --verbose "question"

# Plain text output (easy to copy)
ragbox --format copy "question"

Command-Line Options

usage: ragbox [-h] [-d DOCS_DIR] [-s STORAGE_DIR] [-m MODEL] [--rebuild]
              [--list-files] [-v] [--format {box,copy}] [--init] [question]

positional arguments:
  question              Question to ask (if not provided, enters interactive mode)

options:
  -h, --help            show this help message and exit
  -d DOCS_DIR, --docs-dir DOCS_DIR
                        Directory containing documents (default: current directory)
  -s STORAGE_DIR, --storage-dir STORAGE_DIR
                        Directory for storing index (default: .storage)
  -m MODEL, --model MODEL
                        LLM model to use
  --rebuild             Force rebuild of index
  --list-files          List indexed files and exit
  -v, --verbose         Show detailed initialization logs
  --format {box,copy}   Output format: 'box' (default) or 'copy' (plain text)
  --init                Create a default .rag_config.json file

Configuration

Configuration File

Create a .rag_config.json file in your project directory:

ragbox --init

Example configuration:

{
  "embedding_model": "text-embedding-3-small",
  "embedding_provider": "openai",
  "embedding_base_url": "http://localhost:11434",
  "embedding_dimensions": 1536,
  "llm_model": "gpt-4o-mini",
  "llm_provider": "openai",
  "request_timeout": 360,
  "context_window": 32000,
  "chat_mode": "context",
  "streaming": true,
  "system_prompt": "You are a helpful assistant that analyzes documents and answers questions based on the provided context. Always cite relevant information from the documents."
}
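
Loading such a file is straightforward; here is a minimal sketch of a loader that merges the JSON over built-in defaults. The `DEFAULTS` values are guesses based on the example above, and `load_config` is not the tool's actual function:

```python
import json
from pathlib import Path

# Assumed defaults, inferred from the example config; the real tool may differ.
DEFAULTS = {
    "embedding_provider": "openai",
    "embedding_model": "text-embedding-3-small",
    "llm_provider": "openai",
    "llm_model": "gpt-4o-mini",
    "streaming": True,
}

def load_config(path: str = ".rag_config.json") -> dict:
    """Merge an optional JSON config file over built-in defaults."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.exists():
        cfg.update(json.loads(p.read_text()))
    return cfg
```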

Provider Options

OpenAI (Recommended)

{
  "embedding_provider": "openai",
  "embedding_model": "text-embedding-3-small",
  "llm_provider": "openai",
  "llm_model": "gpt-4o-mini"
}

Ollama (Local)

{
  "embedding_provider": "ollama",
  "embedding_model": "embeddinggemma",
  "llm_provider": "ollama",
  "llm_model": "granite4:micro"
}

Note: You can use any Ollama model for llm_model - just replace granite4:micro with any model you've pulled (e.g., llama3.2:3b, mistral:latest, qwen2.5:7b, etc.). See all available models at https://ollama.ai/library

Mix and Match Providers

You can use different providers for embeddings and LLM:

{
  "embedding_provider": "ollama",
  "embedding_model": "embeddinggemma",
  "llm_provider": "openai",
  "llm_model": "gpt-4o-mini"
}

Switching Between Providers

To switch from Ollama to OpenAI (or vice versa):

  1. Edit .rag_config.json and update the provider settings:

    {
      "embedding_provider": "openai",
      "embedding_model": "text-embedding-3-small",
      "llm_provider": "openai",
      "llm_model": "gpt-4o-mini"
    }
    
  2. Set your API key (for OpenAI):

    export OPENAI_API_KEY='sk-...'
    
  3. Rebuild the index (required when changing embedding provider):

    ragbox --rebuild
    

Note: When you change the embedding provider or model, the index will automatically rebuild on the next run.
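
One plausible way to implement this detection (a sketch, not ragbox's actual code) is to fingerprint the embedding-related settings and compare against the fingerprint stored alongside the index:

```python
import hashlib
import json

# Hypothetical: the set of keys whose change invalidates stored embeddings.
EMBEDDING_KEYS = ("embedding_provider", "embedding_model", "embedding_dimensions")

def embedding_fingerprint(config: dict) -> str:
    """Hash only the settings that would invalidate stored embeddings."""
    relevant = {k: config.get(k) for k in EMBEDDING_KEYS}
    return hashlib.sha256(json.dumps(relevant, sort_keys=True).encode()).hexdigest()

def needs_rebuild(config, stored_fingerprint):
    """Rebuild if no fingerprint was saved or the embedding config changed."""
    return stored_fingerprint != embedding_fingerprint(config)
```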

Environment Variables

# Required for OpenAI
export OPENAI_API_KEY='sk-...'

# Optional overrides
export RAG_EMBEDDING_MODEL="text-embedding-3-small"
export RAG_EMBEDDING_PROVIDER="openai"
export RAG_LLM_MODEL="gpt-4o-mini"
export RAG_LLM_PROVIDER="openai"
export RAG_CHAT_MODE="context"
export RAG_STREAMING="true"
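
A minimal sketch of how such layered configuration is typically resolved, assuming a precedence of CLI argument > `RAG_*` environment variable > config file > default (the `resolve` helper is hypothetical, not part of ragbox):

```python
import os

def resolve(key: str, cli_value=None, config=None, default=None):
    """Resolve one setting: CLI arg > RAG_* env var > config file > default."""
    if cli_value is not None:
        return cli_value
    env_value = os.environ.get("RAG_" + key.upper())
    if env_value is not None:
        return env_value
    if config and key in config:
        return config[key]
    return default

# e.g. resolve("llm_model", config={"llm_model": "gpt-4o-mini"})
```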

How It Works

  1. First Run: Loads documents → Creates embeddings → Saves vector index
  2. Subsequent Runs: Loads existing index (instant startup)
  3. Config Change Detection: Automatically rebuilds if embedding config changes
  4. Query Processing:
    • Embeds your question
    • Retrieves relevant document chunks
    • Sends context + question to LLM
    • Streams back the answer with sources
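
The retrieval step above can be illustrated with a toy, dependency-free example. ragbox itself uses LlamaIndex and a real embedding model; the 3-d vectors below are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(question_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the question."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(question_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 3-d "embeddings"; a real system embeds text with the configured model.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(top_k([1.0, 0.05, 0.0], chunks, k=2))  # → [0, 1]
```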

Supported File Types

  • Text files: .txt, .md, .rst
  • Code files: .py, .js, .java, .cpp, .go, .ts, .html, .css, .sh, etc.
  • Documents: .pdf, .docx, .epub, .ppt, .pptx, .pptm
  • Data files: .csv, .json, .yaml, .xml
  • Notebooks: .ipynb (Jupyter Notebooks)
  • Images: .png, .jpg, .jpeg (with OCR/vision capabilities)
  • Media: .mp3, .mp4 (audio/video transcription)
  • Email: .mbox (email archives)
  • Other: .hwp (Hangul Word Processor)

All files are processed via LlamaIndex's SimpleDirectoryReader, which automatically detects file types and uses appropriate parsers.

Auto-Excluded

  • .storage, .git, .venv, venv, node_modules
  • __pycache__, .pytest_cache, .mypy_cache
  • *.log, *.pyc
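
A sketch of how such an exclusion filter might look, using the lists above (hypothetical; ragbox's actual matching logic may differ):

```python
from fnmatch import fnmatch
from pathlib import PurePath

EXCLUDED_DIRS = {".storage", ".git", ".venv", "venv", "node_modules",
                 "__pycache__", ".pytest_cache", ".mypy_cache"}
EXCLUDED_GLOBS = ("*.log", "*.pyc")

def is_excluded(path: str) -> bool:
    """Skip a file if any parent directory or its name matches the lists."""
    parts = PurePath(path).parts
    if any(part in EXCLUDED_DIRS for part in parts):
        return True
    return any(fnmatch(parts[-1], pat) for pat in EXCLUDED_GLOBS)
```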

Examples

Analyze a Codebase

cd my-project
ragbox "Explain the authentication flow"
ragbox "Are there any security issues?"
ragbox "How is the database configured?"

Query CCTV/Security Logs

Perfect for analyzing large log files quickly:

# Point to your logs directory
cd /var/log/security
ragbox "Show me all failed login attempts from yesterday"
ragbox "Were there any suspicious access patterns?"
ragbox "Summarize the security events from IP 192.168.1.100"

# Or specify the directory
ragbox -d /var/log/cctv "When did motion detection trigger last night?"
ragbox -d /var/log/cctv "List all events between 10pm and 6am"

How it works: ragbox indexes all log files, allowing you to ask natural language questions instead of manually searching through thousands of lines. The AI retrieves relevant log entries and provides contextual answers.

Research Papers

ragbox -d ~/papers "Compare the methodologies"
ragbox -d ~/papers "What are the main findings?"

Documentation

ragbox -d ./docs "How do I set up the project?"
ragbox -d ./docs --rebuild  # After updating docs

Interactive Session

$ ragbox

โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
๐Ÿ’ฌ RAG Box - Interactive Mode
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
๐Ÿ“ Documents: /home/user/docs
๐Ÿ“š Indexed files: 42
๐Ÿค– Model: gpt-4o-mini

Commands:
  /exit, /quit - Exit the program
  /files - List indexed files
โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”

โฏ What are the main topics covered?
[Answer with sources...]

โฏ Tell me more about topic X
[Continued conversation with context...]

โฏ /exit
๐Ÿ‘‹ Goodbye!
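
The command handling in this loop can be sketched as a small dispatcher (illustrative only; not ragbox's implementation):

```python
def handle_command(line: str) -> str:
    """Map an interactive input line to an action (illustrative only)."""
    cmd = line.strip().lower()
    if cmd in ("/exit", "/quit"):
        return "exit"
    if cmd == "/files":
        return "list_files"
    return "query"  # anything else is sent to the RAG pipeline
```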

Publishing to PyPI

Prerequisites

pip install build twine

Build and Upload

# 1. Update version in pyproject.toml
# 2. Build the package
python -m build

# 3. Upload to TestPyPI (optional, for testing)
twine upload --repository testpypi dist/*

# 4. Upload to PyPI
twine upload dist/*

Test Installation

# From TestPyPI
pip install --index-url https://test.pypi.org/simple/ ragbox

# From PyPI
pip install ragbox

Troubleshooting

OpenAI API Key Issues

# Verify key is set
echo $OPENAI_API_KEY

# Set the key
export OPENAI_API_KEY='sk-...'

Ollama Connection Issues

Important: Make sure Ollama server is running and required models are pulled!

# Start Ollama server (required!)
ollama serve

# Check what's running
ollama ps

# Pull required models if not already pulled
ollama pull embeddinggemma    # For embeddings
ollama pull granite4:micro    # For LLM (or any other model)

# Verify models are available
ollama list

# Keep models loaded in memory for faster response
ollama run granite4:micro
# Press Ctrl+D to exit while keeping the model loaded

Common issues:

  • "Connection refused": Ollama server not running → Run ollama serve
  • "Model not found": Models not pulled → Run ollama pull <model-name>
  • Slow responses: Models loading from disk → Keep them loaded with ollama run <model>

Index/Embedding Mismatch

Don't worry! The tool automatically detects config changes and rebuilds:

# Or force rebuild manually
ragbox --rebuild

No Documents Found

# Check current directory
ls -la

# Specify directory explicitly
ragbox -d /path/to/docs "question"

Performance Tips

  1. OpenAI: Faster embeddings, better quality, requires API key
  2. Ollama: Free, local, but slower (keep models loaded: ollama run model)
  3. Index Reuse: First run is slow (embedding creation), subsequent runs are instant
  4. Streaming: Enabled by default for faster perceived response
  5. Model Choice: Smaller models (e.g., granite4:micro) are faster; larger models (e.g., gpt-4o) give better quality

Development

git clone https://github.com/praysimanjuntak/ragbox.git
cd ragbox
pip install -e ".[dev]"

Roadmap

  • OpenAI and Ollama support
  • Streaming responses
  • Auto-rebuild on config change
  • Source attribution
  • Interactive mode
  • Configurable output format
  • Image document support with OCR (planned for next update)
  • Web search integration
  • Conversation history export
  • Query caching

License

MIT License

Acknowledgments


Author: Pray Apostel Simanjuntak

If you find this project helpful, please consider giving it a ⭐ on GitHub!


Download files

Download the file for your platform.

Source Distribution

ragbox-0.1.0.tar.gz (20.9 kB)

Uploaded Source

Built Distribution


ragbox-0.1.0-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file ragbox-0.1.0.tar.gz.

File metadata

  • Download URL: ragbox-0.1.0.tar.gz
  • Upload date:
  • Size: 20.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for ragbox-0.1.0.tar.gz:

  • SHA256: 598a8abb7624e3b54875d17d82dd9f4dc28e3e3bd57140bfc5fd11d9010f1fa0
  • MD5: dedea24a69a2e27d249d8ecbd809115d
  • BLAKE2b-256: 7ce1f8f9eca2510351a533cb85a6bf7bc571e3c933be6f07de21d967945d24fa


File details

Details for the file ragbox-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ragbox-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for ragbox-0.1.0-py3-none-any.whl:

  • SHA256: ca125485dae30bf44826e83e630ac34db034819639341f83d6370851f5cd5fe4
  • MD5: 43bd310ade82617c03dcf3bb1257def1
  • BLAKE2b-256: 141846ec9281bf03b3920147d2fb8ebff2172b83001c2378dcaaaeffb6d7f200

