Hybrid search with SQLite AI and SQLite Vector

https://sqlite.ai

SQLite RAG

A hybrid search engine built on SQLite with SQLite AI and SQLite Vector extensions. SQLite RAG combines vector similarity search with full-text search (FTS5 extension) using Reciprocal Rank Fusion (RRF) for enhanced document retrieval.

Features

  • Hybrid Search: Combines vector embeddings with full-text search for optimal results
  • SQLite-based: Built on SQLite with AI and Vector extensions for reliability and performance
  • Multi-format Text Support: Processes PDF, DOCX, Markdown, and code files, among other text formats
  • Recursive Character Text Splitter: Token-aware text chunking with configurable overlap
  • Interactive CLI: Command-line interface with interactive REPL mode
  • Flexible Configuration: Customizable embedding models, search weights, and chunking parameters
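
To illustrate what chunking with a configurable overlap means, here is a minimal sketch. It is deliberately simpler than the recursive character text splitter the project uses: it treats whitespace-separated words as a stand-in for tokens and slides a fixed-size window over them. The function name `chunk_words` and its parameters are hypothetical and are not part of SQLite RAG.

```python
from __future__ import annotations


def chunk_words(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into word windows of chunk_size that share `overlap` words.

    Assumption: words approximate tokens. A real splitter would count
    tokens with the embedding model's tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.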

Installation

Prerequisites

SQLite RAG requires SQLite with extension loading support. If you encounter extension loading issues (e.g., 'sqlite3.Connection' object has no attribute 'enable_load_extension'), follow the setup guides for macOS or Windows.
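
To check up front whether your Python build supports extension loading, a short script using only the standard library's `sqlite3` module may help. This is a sketch, and the helper name `supports_extension_loading` is hypothetical. It only tests whether extension loading can be enabled and does not load any SQLite AI or SQLite Vector extension.

```python
import sqlite3


def supports_extension_loading() -> bool:
    """Return True if this Python's sqlite3 module can enable extension loading."""
    # Builds compiled without extension support lack this method entirely,
    # which is what produces the AttributeError quoted above.
    if not hasattr(sqlite3.Connection, "enable_load_extension"):
        return False

    conn = sqlite3.connect(":memory:")
    try:
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("extension loading supported:", supports_extension_loading())
```

If this prints `False`, the interpreter in your virtual environment was built without extension loading, and the platform setup guides apply.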

Install SQLite RAG

python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install sqlite-rag

Quick Start

Download Embedding Gemma, the default model, from Hugging Face:

sqlite-rag download-model unsloth/embeddinggemma-300m-GGUF embeddinggemma-300M-Q8_0.gguf

SQLite RAG comes preconfigured to work with the Embedding Gemma model. When you add a document or text, it automatically creates a new database (if one does not already exist) and uses default settings, so you can get started immediately without manual setup.

# Initialize sqliterag.sqlite database and add documents
sqlite-rag add-text "Artificial intelligence (AI) enables machines to learn from data"

sqlite-rag add /path/to/documents --recursive

# Search your documents
sqlite-rag search "explain AI"

# Interactive mode
sqlite-rag
> help
> search "interactive search"
> exit

For help, run:

sqlite-rag --help

CLI Commands

Configuration

Settings are stored in the database and should be set before adding any documents.

# View available configuration options
sqlite-rag configure --help

sqlite-rag configure --model-path ./mymodels/path

# View current settings
sqlite-rag settings

To use a different database filename, use the global --database option:

# Single command with custom database
sqlite-rag --database path/to/mydb.db add-text "Let's talk about AI."

# Interactive mode with custom database
sqlite-rag --database path/to/mydb.db

Model Management

You can experiment with other models from Hugging Face by downloading them with:

# Download GGUF models from Hugging Face
sqlite-rag download-model <model-repo> <filename>

Supported File Formats

SQLite RAG supports the following file formats:

  • Text: .txt, .md, .mdx, .csv, .json, .xml, .yaml, .yml
  • Documents: .pdf, .docx, .pptx, .xlsx
  • Code: .c, .cpp, .css, .go, .h, .hpp, .html, .java, .js, .mjs, .kt, .php, .py, .rb, .rs, .swift, .ts, .tsx
  • Web Frameworks: .svelte, .vue
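
To preview which files in a directory have one of the extensions listed above before running `sqlite-rag add`, a small standalone script like the following can be used. It is a sketch and not the tool's own discovery logic: the names `SUPPORTED_EXTENSIONS`, `is_supported`, and `find_supported_files` are hypothetical, and matching extensions case-insensitively is an assumption.

```python
from __future__ import annotations

from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".mdx", ".csv", ".json", ".xml", ".yaml", ".yml",
    ".pdf", ".docx", ".pptx", ".xlsx",
    ".c", ".cpp", ".css", ".go", ".h", ".hpp", ".html", ".java", ".js",
    ".mjs", ".kt", ".php", ".py", ".rb", ".rs", ".swift", ".ts", ".tsx",
    ".svelte", ".vue",
}


def is_supported(path: str | Path) -> bool:
    """Return True if the file extension is in the supported list.

    Assumption: the comparison ignores case, so README.MD counts as .md.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_supported_files(root: str | Path, recursive: bool = True) -> list[Path]:
    """List files under root whose extension is supported, sorted by path."""
    pattern = "**/*" if recursive else "*"
    return sorted(
        p for p in Path(root).glob(pattern) if p.is_file() and is_supported(p)
    )


if __name__ == "__main__":
    for path in find_supported_files("."):
        print(path)
```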

Development

Installation

For development, clone the repository and install with development dependencies:

# Clone the repository
git clone https://github.com/sqliteai/sqlite-rag.git
cd sqlite-rag

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install in development mode
pip install -e '.[dev]'

How It Works

  1. Document Processing: Files are processed and split into overlapping chunks
  2. Embedding Generation: Text chunks are converted to vector embeddings using AI models
  3. Dual Indexing: Content is indexed for both vector similarity and full-text search
  4. Hybrid Search: Queries are processed through both search methods
  5. Result Fusion: Results are combined using Reciprocal Rank Fusion, which scores each chunk by its rank in both result lists rather than by the raw scores of either search
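
The fusion step can be sketched in a few lines of Python. This is a generic illustration of Reciprocal Rank Fusion, not SQLite RAG's actual implementation: the function name `reciprocal_rank_fusion`, the constant `k = 60` (a commonly used default), and the optional per-list `weights` are assumptions made for the example.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Hashable, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]],
    k: int = 60,
    weights: Sequence[float] | None = None,
) -> list[tuple[Hashable, float]]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(weight / (k + rank)) over the lists it
    appears in, with rank starting at 1 for the best hit.
    """
    if weights is not None and len(weights) != len(rankings):
        raise ValueError("weights must match the number of rankings")

    scores: dict[Hashable, float] = defaultdict(float)
    for i, ranking in enumerate(rankings):
        weight = weights[i] if weights is not None else 1.0
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)

    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]  # ids ordered by vector similarity
    fts_hits = ["b", "d", "a"]     # ids ordered by full-text relevance
    fused = reciprocal_rank_fusion([vector_hits, fts_hits])
    print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

Because only ranks are used, the vector distances and full-text scores never need to be normalized onto a common scale. Documents found by both searches, such as `a` and `b` here, rise above documents found by only one.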
