Skip to main content

A flexible RAG library using SQL databases as vector stores through SQLAlchemy

Project description

SQLVector

A flexible and efficient Retrieval-Augmented Generation (RAG) library that uses SQL databases as vector stores through SQLAlchemy. Supports multiple backends including DuckDB with HNSW indexing and SQLite with VSS (Vector Similarity Search).

Python 3.12+ License: MIT Tests

Features

  • Multiple Backend Support: DuckDB with HNSW indexing, SQLite with VSS extension
  • Async and Sync Interfaces: Both SQLRAG (async) and SyncSQLRAG interfaces
  • Flexible Embedding Providers: Bring your own embeddings or use the default provider
  • Batch Operations: Efficient batch loading and querying
  • Multiple Similarity Functions: Cosine, Euclidean, Inner Product similarity
  • Metadata Filtering: Query with metadata filters and complex conditions
  • SQLAlchemy Integration: Works with any SQLAlchemy-supported database
  • Export Capabilities: Export to Polars DataFrames (DuckDB) or dictionaries

Installation

Basic Installation

pip install sqlvector

With DuckDB Support

pip install "sqlvector[duckdb]"

With SQLite Async Support

# For async SQLite support
pip install sqlvector aiosqlite sqlalchemy

With Custom Embedding Providers

If you want to use custom embedding providers (e.g., with Sentence Transformers), install the required dependencies:

# For Sentence Transformers based embeddings
pip install sqlvector transformers sentence-transformers torch

With Test Dependencies

pip install "sqlvector[test]"

Development Installation

git clone https://github.com/dinedal/sqlvector.git
cd sqlvector
pip install -e ".[duckdb,test]"

Quick Start

Using Custom Embeddings

import asyncio
from sqlalchemy.ext.asyncio import create_async_engine
from sqlvector import SQLRAG, EmbeddingProvider
from sentence_transformers import SentenceTransformer
from typing import List
import torch
import numpy as np

class CustomEmbeddingProvider(EmbeddingProvider):
    def __init__(self, batch_size=4, max_seq_length=512, use_gpu=False):
        MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"
        
        device = "cpu"
        if use_gpu:
            if torch.cuda.is_available():
                device = "cuda"
            elif torch.backends.mps.is_available():
                device = "mps"
            else:
                raise RuntimeError(
                    "No suitable GPU found. Please check your PyTorch installation."
                )
        
        self.model = SentenceTransformer(
            MODEL_NAME,
            device=device,
        )
        # We can reduce the max_seq_length from the default for faster encoding
        self.model.max_seq_length = max_seq_length
        self.batch_size = batch_size
        
        super().__init__()
    
    async def embed(self, text: str) -> List[float]:
        return self.model.encode(text).tolist()
    
    async def embed_batch(self, texts: List[str]) -> List[List[float]]:
        return self.model.encode(texts, batch_size=self.batch_size).tolist()
    
    def similarity(self, vec1: List[float], vec2: List[float]) -> float:
        return self.model.similarity(
            np.array(vec1, dtype=np.float32),
            np.array(vec2, dtype=np.float32),
        ).item()

async def main():
    # Create async engine
    engine = create_async_engine("sqlite+aiosqlite:///example.db")
    
    # Initialize with custom embedding provider
    rag = SQLRAG(
        engine=engine, 
        embedding_provider=CustomEmbeddingProvider(use_gpu=True)
    )
    
    # Create tables
    await rag.create_tables()
    
    # Load documents
    documents = [
        {
            "content": "The quick brown fox jumps over the lazy dog.",
            "metadata": {"source": "example", "category": "animals"}
        },
        {
            "content": "Machine learning is a subset of artificial intelligence.",
            "metadata": {"source": "textbook", "category": "technology"}
        }
    ]
    
    document_ids = await rag.load_documents(documents)
    print(f"Loaded {len(document_ids)} documents")
    
    # Query similar documents
    results = await rag.query("artificial intelligence", top_k=5)
    
    for result in results:
        print(f"Content: {result['content']}")
        print(f"Similarity: {result['similarity']:.3f}")
        print(f"Metadata: {result.get('metadata', {})}")
    
    await engine.dispose()

if __name__ == "__main__":
    asyncio.run(main())

Using Synchronous Interface

from sqlalchemy import create_engine
from sqlvector import SyncSQLRAG

# Create sync engine
engine = create_engine("sqlite:///example.db")

# Initialize RAG
rag = SyncSQLRAG(engine=engine)

# Create tables
rag.create_tables()

# Load and query documents
documents = [{"content": "Your content here", "metadata": {"key": "value"}}]
document_ids = rag.load_documents(documents)

results = rag.query("search term", top_k=5)

Backend Configuration

DuckDB with HNSW Indexing

from sqlvector.backends.duckdb import DuckDBConfig, DuckDBRAG

config = DuckDBConfig(
    connection_string="duckdb:///rag.duckdb",
    embedding_dim=384,
    use_hnsw=True,  # Enable HNSW indexing
    hnsw_config={
        "max_elements": 100000,
        "ef_construction": 200,
        "ef": 100,
        "M": 16
    }
)

rag = DuckDBRAG(config=config)

SQLite with VSS Extension

from sqlvector.backends.sqlite import SQLiteConfig, SQLiteRAG

config = SQLiteConfig(
    connection_string="sqlite:///rag.db",
    embedding_dim=384,
    use_vss=True,  # Enable VSS extension
    vss_version="v0.1.2"
)

rag = SQLiteRAG(config=config)

Advanced Features

Metadata Filtering

# Query with metadata filters
results = await rag.query_with_filters(
    filters={"category": "technology", "year": 2025},
    query_text="machine learning",
    top_k=10
)

Batch Operations

# Batch document loading
documents = [{"content": f"Document {i}", "metadata": {"id": i}} for i in range(100)]
document_ids = await rag.load_documents(documents, batch_size=10)

# Batch querying
queries = ["query1", "query2", "query3"]
batch_results = await rag.query_batch(queries, top_k=5)

Export Data

# For DuckDB backend - export to Polars DataFrame
df = rag.export_to_polars(include_embeddings=True)

# For SQLite backend - export to dictionary
data = rag.export_documents(include_embeddings=False)

Architecture

SQLVector uses a protocol-based architecture that allows for flexible backend implementations:

  • Protocols: Define interfaces for backends (RAGSystemProtocol, DocumentLoaderProtocol, DocumentQuerierProtocol)
  • Backends: Pluggable database backends with specific optimizations
  • Embedding Service: Handles text-to-vector conversion with pluggable providers
  • Models: SQLAlchemy models for documents and embeddings

Performance Considerations

  • DuckDB: Best for analytical workloads, supports HNSW indexing for fast similarity search
  • SQLite: Lightweight option with VSS extension for vector similarity
  • Batch Size: Tune batch sizes based on your hardware and document sizes
  • Embedding Dimensions: Lower dimensions generally provide faster search at the cost of accuracy

Testing

Run the test suite:

# Run all tests
pytest

# Run specific backend tests
pytest tests/test_duckdb.py
pytest tests/test_sqlite.py

# Run with coverage
pytest --cov=sqlvector tests/

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details on:

  • Development setup
  • Code style guidelines
  • Testing requirements
  • Pull request process

Examples

Check out the examples/ directory for more detailed examples:

  • Basic usage with different backends
  • Custom embedding providers
  • Advanced querying techniques
  • Performance benchmarks

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use SQLVector in your research, please cite:

@software{sqlvector,
  title = {SQLVector: SQL-based Retrieval-Augmented Generation},
  year = {2025},
  url = {https://github.com/dinedal/sqlvector}
}

Acknowledgments

  • Built on SQLAlchemy for database abstraction
  • DuckDB backend uses DuckDB with VSS extension
  • SQLite backend uses sqlite-vss for vector similarity

Support

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqlvector-0.1.0.tar.gz (78.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sqlvector-0.1.0-py3-none-any.whl (65.3 kB view details)

Uploaded Python 3

File details

Details for the file sqlvector-0.1.0.tar.gz.

File metadata

  • Download URL: sqlvector-0.1.0.tar.gz
  • Upload date:
  • Size: 78.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for sqlvector-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f01f35ebf1dbde3ca08ed9627906bd068300cd025439f2aa690014f113549914
MD5 4864dad207dce2f6d4e987c19174d384
BLAKE2b-256 03acf00aa93fbc3015a963c461cf8fc8516f77a45d8c598d01b2f0dbb395a50b

See more details on using hashes here.

File details

Details for the file sqlvector-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sqlvector-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 65.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for sqlvector-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b58fe527261079ed3ddc66fee2cc882e2d0f2524417117e84942dba34e40ecc9
MD5 51c93d9ae0ff0489a16c727bbd00577d
BLAKE2b-256 fb407a5a064503781aa96b0226be041b2cef34ee926fc7bbd5af7ca45e4849b5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page