Vector Bot: A fully local RAG pipeline using LlamaIndex with Ollama
Vector Bot is a fully local Retrieval-Augmented Generation (RAG) pipeline using LlamaIndex with Ollama. Ask natural language questions about your documents, with everything running offline on your computer.
📖 Documentation
Complete documentation is available in the docs/ directory:
- Getting Started - Documentation portal and recommended reading paths
- User Guide - Usage guide with examples and troubleshooting
- Quick Reference - Command cheat sheet
- Configuration - Configuration reference
- Deployment Guide - Multi-environment deployment
- Contributing - Development guidelines
- Security - Security policy and vulnerability reporting
Features
- 100% Local: No cloud APIs, no telemetry, fully offline after installation
- Multi-Environment: Development, production, and Docker configurations
- Executable Distribution: Single-file deployment with no Python required
- Document Support: PDF, Markdown, text, JSON, CSV files
- Persistent Storage: Indexes are saved to disk for fast subsequent queries
- Clean CLI: Simple command-line interface with doctor, ingest, and query commands
- CI/CD Pipeline: Automated testing, building, and PyPI publishing
- Comprehensive Testing: 324 passing tests with 99% code coverage
- Security Scanning: Automated vulnerability detection and CodeQL analysis
Prerequisites
- Python 3.10+ installed
- Ollama installed and running:

  # Check Ollama version
  ollama --version
  # Start Ollama server (if not running)
  ollama serve
  # List installed models
  ollama list

- At least one chat model installed in Ollama:

  # If you don't have any models, install one:
  ollama pull llama3.1
Quick Start
Choose one of the following install methods, then follow the same steps to verify, index, and query your documents:
Node.js Package (npm) — Recommended for most users
# Install globally
npm install -g @joshuaramirez/vector-bot
# Run commands
vector-bot doctor
vector-bot ingest
vector-bot query "What is this document about?"
# Or run without installing
npx @joshuaramirez/vector-bot --help
Python Package (pip)
# Install from PyPI
pip install vector-bot
Download Executable (manual)
Download the latest release for your platform:

- Windows: vector-bot.exe
- macOS/Linux: vector-bot (then run chmod +x vector-bot)
After installing with any method above:

1. Install Ollama and a chat model:

   # Install Ollama from https://ollama.ai
   ollama pull llama3.1
   ollama pull nomic-embed-text

2. Verify setup:

   vector-bot doctor

3. Add documents and index:

   mkdir docs
   cp your-files.pdf docs/
   vector-bot ingest

4. Ask questions:

   vector-bot query "What is this document about?"
See User Guide for complete instructions.
From Source
# Clone and enter the project directory
cd vector-bot
# Copy environment template
cp .env.example .env
# Edit .env and set OLLAMA_CHAT_MODEL to one of your installed models
# For example: OLLAMA_CHAT_MODEL=llama3.1
# Install the package
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"
Example Workflow
# 1. Check system status
vector-bot doctor
# 2. Add your documents
mkdir docs
cp *.pdf docs/
cp *.md docs/
# 3. Index documents
vector-bot ingest
# 4. Ask questions
vector-bot query "What are the main topics covered?"
vector-bot query "Summarize the key findings" --show-sources
vector-bot query "What does the document say about security?" --k 8
# 5. Use different environments
vector-bot --env production ingest
vector-bot --env development query "How do I deploy this?"
Windows Instructions
For Windows users without make, use these Python commands directly:
# Install
pip install -e .
# Check health
python -m rag.cli doctor
# Ingest documents
python -m rag.cli ingest
# Query
python -m rag.cli query "Your question here"
# Run smoke test
python scripts/rag_smoke.py
# Run unit tests
pytest tests/ -v
Configuration
Edit .env or set environment variables:
- DOCS_DIR: Directory containing documents (default: ./docs)
- INDEX_DIR: Directory for storing the index (default: ./index_storage)
- OLLAMA_BASE_URL: Ollama server URL (default: http://localhost:11434)
- OLLAMA_CHAT_MODEL: Chat model to use (no default; auto-detected from installed models)
- OLLAMA_EMBED_MODEL: Embedding model (default: nomic-embed-text)
- SIMILARITY_TOP_K: Number of similar chunks to retrieve (default: 4)
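As an illustration, these variables and their documented defaults could be read with a small settings helper. This is a sketch only; the `Settings` and `load_settings` names are hypothetical, not the project's actual code in config.py:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass
class Settings:
    docs_dir: str
    index_dir: str
    ollama_base_url: str
    chat_model: Optional[str]
    embed_model: str
    similarity_top_k: int

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from environment variables, using the documented defaults."""
    return Settings(
        docs_dir=env.get("DOCS_DIR", "./docs"),
        index_dir=env.get("INDEX_DIR", "./index_storage"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        chat_model=env.get("OLLAMA_CHAT_MODEL"),  # no default: auto-detection applies
        embed_model=env.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
        similarity_top_k=int(env.get("SIMILARITY_TOP_K", "4")),
    )
```

Passing the environment mapping as a parameter keeps the loader easy to test without mutating the real process environment.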
Project Structure
.
├── src/rag/ # Main package (vector-bot)
│ ├── cli.py # CLI interface
│ ├── config.py # Configuration management
│ ├── ingest.py # Document ingestion
│ ├── query.py # Query engine
│ └── ollama_check.py # Ollama health checks
├── scripts/ # Utility scripts
│ └── rag_smoke.py # Smoke test
├── tests/ # Unit tests
├── docs/ # Your documents go here
└── index_storage/ # Persisted vector index
Troubleshooting
Ollama Server Not Running
# Start Ollama
ollama serve
# Check if it's running (should show version info)
curl http://localhost:11434/api/version
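The same health check can be done from Python with only the standard library. A minimal sketch, assuming the /api/version endpoint shown above; the `ollama_version` helper name is hypothetical (the project's real check lives in ollama_check.py):

```python
import json
from urllib.request import urlopen

def ollama_version(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the Ollama server's version string, or None if it is unreachable."""
    try:
        with urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8")).get("version")
    except (OSError, ValueError):
        # connection refused, timeout, or a non-JSON response
        return None
```

Returning None instead of raising lets a doctor-style command report "server not running" as a diagnostic rather than a crash.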
No Models Found
# List available models
ollama list
# Pull a model (but NOT required if you already have one)
ollama pull llama3.1
Port Conflicts
If Ollama is running on a different port, update .env:
OLLAMA_BASE_URL=http://localhost:YOUR_PORT
Large Files Skipped
Files over 20MB are automatically skipped during ingestion. Split large documents or adjust the limit in the code if needed.
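The size cutoff described above amounts to a simple partition over candidate files. A sketch with a hypothetical `partition_by_size` helper, not the project's actual ingestion code:

```python
from pathlib import Path

MAX_FILE_BYTES = 20 * 1024 * 1024  # the 20 MB cutoff described above

def partition_by_size(paths, max_bytes: int = MAX_FILE_BYTES):
    """Split candidate files into (to_ingest, skipped) based on on-disk size."""
    to_ingest, skipped = [], []
    for p in map(Path, paths):
        (to_ingest if p.stat().st_size <= max_bytes else skipped).append(p)
    return to_ingest, skipped
```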
Missing Embedding Model
The default embedding model is nomic-embed-text. If not installed:
ollama pull nomic-embed-text
# Or use an alternative like mxbai-embed-large
ollama pull mxbai-embed-large
# Then update .env: OLLAMA_EMBED_MODEL=mxbai-embed-large
Important Notes
- No Auto-Pull for Chat Models: This tool will never automatically download chat models. It only uses models you've already installed.
- Idempotent Ingestion: Re-running ingest is safe and won't duplicate data.
- Fully Offline: After pip install, no internet connection is required.
- Local Only: All operations use localhost; no external API calls are made.
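One common way to make ingestion idempotent, as the note above describes, is to fingerprint file contents and skip anything already indexed. A sketch with hypothetical helper names; the project's real logic is in ingest.py:

```python
import hashlib
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Content hash used to recognize files that were already indexed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_index(paths, seen: set) -> list:
    """Return only files whose content has not been ingested yet."""
    fresh = []
    for p in map(Path, paths):
        digest = fingerprint(p)
        if digest not in seen:
            seen.add(digest)
            fresh.append(p)
    return fresh
```

Because the hash covers content rather than file names, re-running over the same documents (or renamed copies) adds nothing new.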
Multi-Environment Support
The application supports different deployment environments:
# Development (default)
vector-bot doctor
# Production deployment
vector-bot --env production doctor
# Docker deployment
vector-bot --env docker doctor
# Show current configuration
vector-bot --config-info --env production
See Deployment Guide for detailed multi-environment setup.
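Conceptually, the --env flag selects which configuration file is loaded. The sketch below illustrates one possible mapping; the .env.production / .env.docker naming is an assumption for illustration, not the project's confirmed scheme:

```python
ENV_FILES = {
    "development": ".env",            # the default environment
    "production": ".env.production",
    "docker": ".env.docker",
}

def env_file_for(environment: str = "development") -> str:
    """Map an --env name to the dotenv file it would load."""
    try:
        return ENV_FILES[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None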
Building Executable
To create a standalone executable:
# Build executable (all platforms)
make build-exe
# Or manually
python build_executable.py
This creates a single executable in dist/vector-bot (dist/vector-bot.exe on Windows) that bundles all dependencies and configuration files. The target system only needs Ollama installed and running; no Python installation is required.
Development
# Run all tests
make test
pytest tests/ -v
# Run unit tests only
pytest tests/unit/ -v
# Run with coverage
pytest tests/ --cov=src/rag --cov-report=html
# Type checking
mypy src/
# Linting
ruff check src/
# Run security checks
safety check
bandit -r src/
# Use the test runner
python run_tests.py
# Clean generated files (including build artifacts)
make clean
Testing
The project includes comprehensive testing:
- 114 unit tests (99.1% pass rate)
- 99% code coverage across all modules
- 100% mocked external dependencies - tests run offline
- Professional test structure following best practices
- CI-ready - all tests run in under 20 seconds
See the Testing Documentation for details.
License
MIT