Distributed SQLite-compatible storage engine backed by S3

distributed-sqlite

A distributed SQLite-compatible storage engine backed solely by AWS S3.

Overview

distributed-sqlite provides a standard SQLAlchemy/DBAPI2 interface over an append-only, segment-based storage model on S3. It supports:

  • Snapshot isolation — each transaction reads from a consistent snapshot
  • Optimistic concurrency — CAS-based manifest commits with automatic retry
  • Conflict detection — write-set intersection check; raises ConflictError on true conflicts
  • Exponential backoff with jitter — full jitter retry up to 10 attempts
  • WAL-like semantics — immutable segments + versioned manifests, never mutates committed data
  • Crash recovery — orphaned segments (written but not committed) are detected and safely ignored
  • Alembic migrations — Alembic sees a standard SQLite interface; all DDL and migration ops work unchanged
  • Local caching — LRU disk cache for segments, in-memory snapshot cache
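
The optimistic-concurrency and conflict-detection bullets above can be illustrated with a small in-memory sketch. This is not the library's implementation: `Manifest`, `InMemoryStore`, `put_if_absent`, and `commit` are hypothetical names, a dict stands in for S3, and `ConflictError` is redefined locally rather than imported. Backoff between attempts is left out to keep the sketch short.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Local stand-in for the library's ConflictError."""


@dataclass(frozen=True)
class Manifest:
    version: int
    write_set: frozenset  # keys (e.g. "table:rowid") touched by this commit


class InMemoryStore:
    """Dict-backed stand-in for the manifests/ prefix on S3 (assumption)."""

    def __init__(self):
        self.manifests = {0: Manifest(0, frozenset())}

    def latest_version(self):
        return max(self.manifests)

    def put_if_absent(self, manifest):
        # Compare-and-swap: succeeds only if this version is still unclaimed.
        if manifest.version in self.manifests:
            return False
        self.manifests[manifest.version] = manifest
        return True


def commit(store, snapshot_version, write_set, max_retries=10):
    """Try to publish a new manifest; raise ConflictError on overlapping writes."""
    write_set = frozenset(write_set)
    for _ in range(max_retries):
        latest = store.latest_version()
        # Compare our write set with every commit that landed after our snapshot.
        for version in range(snapshot_version + 1, latest + 1):
            overlap = write_set & store.manifests[version].write_set
            if overlap:
                raise ConflictError(f"conflicting keys: {sorted(overlap)}")
        if store.put_if_absent(Manifest(latest + 1, write_set)):
            return latest + 1
        # Lost the CAS race: loop again, re-read the latest version, re-check.
    raise RuntimeError("commit did not succeed within max_retries attempts")
```

Two transactions that start from the same snapshot and touch disjoint keys both commit, the second one simply landing at a later version. A third transaction that writes a key already changed since its snapshot is rejected instead of being retried.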

Storage Layout

{bucket}/{prefix}/
  manifests/v{N:020d}.json   # Immutable manifest per version
  segments/{uuid}.seg        # Immutable append-only segments (msgpack)
  root.json                  # Eventually-consistent version hint
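
As a rough illustration of the layout above, the object keys could be built as follows. The helper names `manifest_key`, `segment_key`, and `version_from_key` are hypothetical and not part of the package; only the key patterns come from the layout. Zero-padding the version to 20 digits means a lexicographic listing of `manifests/` is also in numeric version order.

```python
import uuid


def manifest_key(prefix: str, version: int) -> str:
    """Key of the immutable manifest for a given version."""
    return f"{prefix}/manifests/v{version:020d}.json"


def segment_key(prefix: str, segment_id: uuid.UUID) -> str:
    """Key of an immutable segment object."""
    return f"{prefix}/segments/{segment_id}.seg"


def version_from_key(key: str) -> int:
    """Recover the integer version from a manifest key."""
    name = key.rsplit("/", 1)[-1]          # e.g. "v<20 digits>.json"
    return int(name[1:-len(".json")])      # strip leading "v" and the suffix


if __name__ == "__main__":
    keys = [manifest_key("mydb", v) for v in (100, 9, 10)]
    # Sorting the key strings yields versions in numeric order.
    print([version_from_key(k) for k in sorted(keys)])
    print(segment_key("mydb", uuid.uuid4()))
```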

Connection URL

distributed_sqlite+distributed_sqlite:///<bucket>/<prefix>
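
A minimal sketch of composing and splitting a URL of this shape, using plain string handling. `build_url` and `split_url` are hypothetical helpers, and how the package itself parses the URL is not documented here. The sketch assumes everything after the first path component belongs to the prefix.

```python
SCHEME = "distributed_sqlite+distributed_sqlite"


def build_url(bucket: str, prefix: str) -> str:
    """Compose a connection URL from a bucket name and a key prefix."""
    return f"{SCHEME}:///{bucket}/{prefix}"


def split_url(url: str) -> tuple[str, str]:
    """Split a connection URL back into (bucket, prefix)."""
    scheme, sep, rest = url.partition(":///")
    if not sep or scheme != SCHEME:
        raise ValueError(f"not a distributed_sqlite URL: {url!r}")
    # Assumption: the first path component is the bucket, the rest is the prefix.
    bucket, _, prefix = rest.partition("/")
    if not bucket:
        raise ValueError("missing bucket name")
    return bucket, prefix
```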

Quick Start

from distributed_sqlite.engine import bootstrap, open_connection, create_engine

# Initialize the store (idempotent)
bootstrap("my-bucket", "mydb")

# Raw DBAPI2 connection
with open_connection("my-bucket", "mydb") as conn:
    cur = conn.cursor()
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users VALUES (1, 'Alice')")
    conn.commit()

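# SQLAlchemy engine (create_engine is the package's helper imported above)
import sqlalchemy as sa
engine = create_engine("distributed_sqlite+distributed_sqlite:///my-bucket/mydb")

# Assumes the helper returns a standard SQLAlchemy Engine
with engine.connect() as sa_conn:
    rows = sa_conn.execute(sa.text("SELECT id, name FROM users")).fetchall()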

Environment Variables

Variable                                 Default                       Description
AWS_ACCESS_KEY_ID                        (none)                        AWS credentials
AWS_SECRET_ACCESS_KEY                    (none)                        AWS credentials
AWS_DEFAULT_REGION                       us-east-1                     AWS region
AWS_ENDPOINT_URL                         (none)                        Custom endpoint (LocalStack, MinIO)
DISTRIBUTED_SQLITE_CACHE_DIR             ~/.distributed_sqlite/cache   Local cache directory
DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL   50                            Delta segments between checkpoints
DISTRIBUTED_SQLITE_MAX_RETRIES           10                            Max commit retry attempts
DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS    0.05                          Backoff base delay (seconds)
DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS     30.0                          Max backoff delay (seconds)
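
To show how the three retry variables might interact, here is a sketch of full-jitter exponential backoff driven by them. `RetrySettings`, `load_retry_settings`, and `full_jitter_delay` are hypothetical names. The formula is the common "full jitter" scheme (a uniform draw between zero and a capped exponential ceiling) and is an assumption about what the package does internally.

```python
import os
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrySettings:
    max_retries: int
    base_seconds: float
    max_seconds: float


def load_retry_settings(env=os.environ) -> RetrySettings:
    """Read the retry variables, falling back to the documented defaults."""
    return RetrySettings(
        max_retries=int(env.get("DISTRIBUTED_SQLITE_MAX_RETRIES", "10")),
        base_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS", "0.05")),
        max_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS", "30.0")),
    )


def full_jitter_delay(attempt: int, settings: RetrySettings, rng=random) -> float:
    """Delay before retry number `attempt` (0-based), in seconds."""
    # The ceiling doubles each attempt and is capped at max_seconds.
    ceiling = min(settings.max_seconds, settings.base_seconds * 2 ** attempt)
    # Full jitter: pick uniformly in [0, ceiling] to spread out competing writers.
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    settings = load_retry_settings()
    for attempt in range(settings.max_retries):
        print(attempt, round(full_jitter_delay(attempt, settings), 4))
```

With the defaults, the ceiling starts at 0.05 s and doubles on each attempt (0.1 s, 0.2 s, ...), so it stays below the 30 s cap for all 10 default attempts. The cap only matters if the retry count or base delay is raised.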

Architecture

See docs/architecture.md for the full design narrative.

Development

cp .env.example .env   # fill in your AWS credentials
uv sync
uv run pytest tests/ -v

Source Distribution

distributed_sqlite-0.2.0.tar.gz (94.4 kB view details)

Uploaded Source

Built Distribution

distributed_sqlite-0.2.0-py3-none-any.whl (24.2 kB view details)

Uploaded Python 3

File details

Details for the file distributed_sqlite-0.2.0.tar.gz.

File metadata

  • Download URL: distributed_sqlite-0.2.0.tar.gz
  • Upload date:
  • Size: 94.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for distributed_sqlite-0.2.0.tar.gz
Algorithm     Hash digest
SHA256        dba589d7eafa7dad68490c8864b583dbd6692d69c341250aed9a796ab1170f99
MD5           ba5c878bdf8eb1fab7187585b338e633
BLAKE2b-256   b6fff8c1bb7cb4310718dc5ad7618f4ff97e1f92508c3ad9eda4428b0de9fade

File details

Details for the file distributed_sqlite-0.2.0-py3-none-any.whl.

File hashes

Hashes for distributed_sqlite-0.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        22286ad1da22d190e919b88e283818c87e113417bff44d44b306a4c33bc4eb46
MD5           3a4e18f625fdf19104e3de209b5f255e
BLAKE2b-256   ec48207eea37c42460373e8f0456523751f48337789e1a7e9b681f5c50dfb4b9
