Vector Bot: A fully local RAG pipeline using LlamaIndex with Ollama
Vector Bot is a fully local Retrieval-Augmented Generation (RAG) pipeline using LlamaIndex with Ollama. Ask natural language questions about your documents, with everything running offline on your computer.
📖 Documentation
Complete documentation is available in the docs/ directory:
- Getting Started - Documentation portal and recommended reading paths
- User Guide - Usage guide with examples and troubleshooting
- Quick Reference - Command cheat sheet
- Configuration - Configuration reference
- Deployment Guide - Multi-environment deployment
- Contributing - Development guidelines
- Security - Security policy and vulnerability reporting
Features
- 100% Local: No cloud APIs, no telemetry, fully offline after installation
- Multi-Environment: Development, production, and Docker configurations
- Executable Distribution: Single-file deployment with no Python required
- Document Support: PDF, Markdown, text, JSON, CSV files
- Persistent Storage: Indexes are saved to disk for fast subsequent queries
- Clean CLI: Simple command-line interface with doctor, ingest, and query commands
- CI/CD Pipeline: Automated testing, building, and PyPI publishing
- Comprehensive Testing: 324 passing tests with 99% code coverage
- Security Scanning: Automated vulnerability detection and CodeQL analysis
Prerequisites
- Python 3.10+ installed
- Ollama installed and running:

  # Check Ollama version
  ollama --version
  # Start Ollama server (if not running)
  ollama serve
  # List installed models
  ollama list

- At least one chat model installed in Ollama:

  # If you don't have any models, install one:
  ollama pull llama3.1
Quick Start
Choose one of the following install methods, then follow the same steps to verify, index, and query your documents:
Node.js Package (npm) — Recommended for most users
# Install globally
npm install -g @joshuaramirez/vector-bot
# Run commands
vector-bot doctor
vector-bot ingest
vector-bot query "What is this document about?"
# Or run without installing
npx @joshuaramirez/vector-bot --help
Python Package (pip)
# Install from PyPI
pip install vector-bot
Download Executable (manual)
Download the latest release for your platform:

- Windows: vector-bot.exe
- macOS/Linux: vector-bot (then run chmod +x vector-bot)
After installing with any method above:

1. Install Ollama and a chat model:

   # Install Ollama from https://ollama.ai
   ollama pull llama3.1
   ollama pull nomic-embed-text

2. Verify setup:

   vector-bot doctor

3. Add documents and index:

   mkdir docs
   cp your-files.pdf docs/
   vector-bot ingest

4. Ask questions:

   vector-bot query "What is this document about?"
See User Guide for complete instructions.
From Source
# Clone and enter the project directory
cd vector-bot
# Copy environment template
cp .env.example .env
# Edit .env and set OLLAMA_CHAT_MODEL to one of your installed models
# For example: OLLAMA_CHAT_MODEL=llama3.1
# Install the package
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"
Example Workflow
# 1. Check system status
vector-bot doctor
# 2. Add your documents
mkdir docs
cp *.pdf docs/
cp *.md docs/
# 3. Index documents
vector-bot ingest
# 4. Ask questions
vector-bot query "What are the main topics covered?"
vector-bot query "Summarize the key findings" --show-sources
vector-bot query "What does the document say about security?" --k 8
# 5. Use different environments
vector-bot --env production ingest
vector-bot --env development query "How do I deploy this?"
Windows Instructions
For Windows users without make, use these Python commands directly:
# Install
pip install -e .
# Check health
python -m rag.cli doctor
# Ingest documents
python -m rag.cli ingest
# Query
python -m rag.cli query "Your question here"
# Run smoke test
python scripts/rag_smoke.py
# Run unit tests
pytest tests/ -v
Configuration
Edit .env or set environment variables:
- DOCS_DIR: Directory containing documents (default: ./docs)
- INDEX_DIR: Directory for storing the index (default: ./index_storage)
- OLLAMA_BASE_URL: Ollama server URL (default: http://localhost:11434)
- OLLAMA_CHAT_MODEL: Chat model to use (no default; auto-detected from installed models)
- OLLAMA_EMBED_MODEL: Embedding model (default: nomic-embed-text)
- SIMILARITY_TOP_K: Number of similar chunks to retrieve (default: 4)
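As an illustration, these variables and their documented defaults could be read with a small settings helper. This is a sketch only; the `Settings` and `load_settings` names are hypothetical, not the project's actual code in config.py:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass
class Settings:
    docs_dir: str
    index_dir: str
    ollama_base_url: str
    chat_model: Optional[str]
    embed_model: str
    similarity_top_k: int

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables, using the documented defaults."""
    return Settings(
        docs_dir=env.get("DOCS_DIR", "./docs"),
        index_dir=env.get("INDEX_DIR", "./index_storage"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        chat_model=env.get("OLLAMA_CHAT_MODEL"),  # no default: auto-detection applies
        embed_model=env.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "4")),
    )
```

Passing the environment mapping as a parameter keeps the loader easy to test without mutating the real process environment.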
Project Structure
.
├── src/rag/ # Main package (vector-bot)
│ ├── cli.py # CLI interface
│ ├── config.py # Configuration management
│ ├── ingest.py # Document ingestion
│ ├── query.py # Query engine
│ └── ollama_check.py # Ollama health checks
├── scripts/ # Utility scripts
│ └── rag_smoke.py # Smoke test
├── tests/ # Unit tests
├── docs/ # Your documents go here
└── index_storage/ # Persisted vector index
Troubleshooting
Ollama Server Not Running
# Start Ollama
ollama serve
# Check if it's running (should show version info)
curl http://localhost:11434/api/version
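The same health check can be done from Python with only the standard library. A minimal sketch, assuming the /api/version endpoint shown above; the `ollama_version` helper name is hypothetical (the project's real check lives in ollama_check.py):

```python
import json
from urllib.request import urlopen

def ollama_version(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the Ollama server's version string, or None if it is unreachable."""
    try:
        with urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8")).get("version")
    except (OSError, ValueError):
        # connection refused, timeout, or a non-JSON response
        return None
```

Returning None instead of raising lets a doctor-style command report "server not running" as a diagnostic rather than a crash.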
No Models Found
# List available models
ollama list
# Pull a model (but NOT required if you already have one)
ollama pull llama3.1
Port Conflicts
If Ollama is running on a different port, update .env:
OLLAMA_BASE_URL=http://localhost:YOUR_PORT
Large Files Skipped
Files over 20MB are automatically skipped during ingestion. Split large documents or adjust the limit in the code if needed.
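The size cutoff described above amounts to a simple partition over candidate files. A sketch with a hypothetical `partition_by_size` helper, not the project's actual ingestion code:

```python
from pathlib import Path

MAX_FILE_BYTES = 20 * 1024 * 1024  # the 20 MB cutoff described above

def partition_by_size(paths, max_bytes: int = MAX_FILE_BYTES):
    """Split candidate files into (to_ingest, skipped) based on on-disk size."""
    to_ingest, skipped = [], []
    for p in map(Path, paths):
        (to_ingest if p.stat().st_size <= max_bytes else skipped).append(p)
    return to_ingest, skipped
```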
Missing Embedding Model
The default embedding model is nomic-embed-text. If not installed:
ollama pull nomic-embed-text
# Or use an alternative like mxbai-embed-large
ollama pull mxbai-embed-large
# Then update .env: OLLAMA_EMBED_MODEL=mxbai-embed-large
Important Notes
- No Auto-Pull for Chat Models: This tool will never automatically download chat models. It only uses models you've already installed.
- Idempotent Ingestion: Re-running ingest is safe and won't duplicate data.
- Fully Offline: After pip install, no internet connection is required.
- Local Only: All operations use localhost; no external API calls are made.
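One common way to make ingestion idempotent, as the note above describes, is to fingerprint file contents and skip anything already indexed. A sketch with hypothetical helper names; the project's real logic is in ingest.py:

```python
import hashlib
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Content hash used to recognize files that were already indexed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_index(paths, seen: set) -> list:
    """Return only files whose content has not been ingested yet."""
    fresh = []
    for p in map(Path, paths):
        digest = fingerprint(p)
        if digest not in seen:
            seen.add(digest)
            fresh.append(p)
    return fresh
```

Because the hash covers content rather than file names, re-running over the same documents (or renamed copies) adds nothing new.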
Multi-Environment Support
The application supports different deployment environments:
# Development (default)
vector-bot doctor
# Production deployment
vector-bot --env production doctor
# Docker deployment
vector-bot --env docker doctor
# Show current configuration
vector-bot --config-info --env production
See Deployment Guide for detailed multi-environment setup.
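Conceptually, the --env flag selects which configuration file is loaded. The sketch below illustrates one possible mapping; the .env.production / .env.docker naming is an assumption for illustration, not the project's confirmed scheme:

```python
ENV_FILES = {
    "development": ".env",            # the default environment
    "production": ".env.production",
    "docker": ".env.docker",
}

def env_file_for(environment: str = "development") -> str:
    """Map an --env name to the dotenv file it would load."""
    try:
        return ENV_FILES[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None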
Building Executable
To create a standalone executable:
# Build executable (all platforms)
make build-exe
# Or manually
python build_executable.py
This creates a single executable in dist/vector-bot (dist/vector-bot.exe on Windows) that bundles all dependencies and configuration files. The target system only needs Ollama installed and running; no Python installation is required.
Development
# Run all tests
make test
pytest tests/ -v
# Run unit tests only
pytest tests/unit/ -v
# Run with coverage
pytest tests/ --cov=src/rag --cov-report=html
# Type checking
mypy src/
# Linting
ruff check src/
# Run security checks
safety check
bandit -r src/
# Use the test runner
python run_tests.py
# Clean generated files (including build artifacts)
make clean
Testing
The project includes comprehensive testing:
- 114 unit tests (99.1% pass rate)
- 99% code coverage across all modules
- 100% mocked external dependencies - tests run offline
- Professional test structure following best practices
- CI-ready - all tests run in under 20 seconds
See the Testing Documentation for details.
License
MIT