Distributed SQLite-compatible storage engine backed by S3

distributed-sqlite

A distributed SQLite-compatible storage engine backed solely by AWS S3.

Overview

distributed-sqlite provides a standard SQLAlchemy/DBAPI2 interface over an append-only, segment-based storage model on S3. It supports:

  • Snapshot isolation — each transaction reads from a consistent snapshot
  • Optimistic concurrency — CAS-based manifest commits with automatic retry
  • Conflict detection — write-set intersection check; raises ConflictError on true conflicts
  • Exponential backoff with jitter — full jitter retry up to 10 attempts
  • WAL-like semantics — immutable segments + versioned manifests, never mutates committed data
  • Crash recovery — orphaned segments (written but not committed) are detected and safely ignored
  • Alembic migrations — Alembic sees a standard SQLite interface; all DDL and migration ops work unchanged
  • Local caching — LRU disk cache for segments, in-memory snapshot cache
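
The commit path described in the list above (CAS on the manifest, write-set intersection check, full-jitter retry) can be sketched roughly as follows. This is an illustrative in-memory model, not the package's actual code: `ManifestStore`, `commit`, `full_jitter_delay`, and the `"table:rowid"` key format are hypothetical names chosen for the example. Only `ConflictError`, the 10-attempt limit, and the 0.05 s / 30 s backoff bounds come from this README.

```python
import random
import threading
import time


class ConflictError(Exception):
    """Raised when a concurrent commit wrote a key this transaction also wrote."""


class ManifestStore:
    """In-memory stand-in for the versioned manifest chain (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0
        self.write_sets = {}  # version -> frozenset of keys written by that commit

    def compare_and_swap(self, expected_version, write_set):
        """Publish version expected_version + 1 only if the head has not moved."""
        with self._lock:
            if self.version != expected_version:
                return False
            self.version += 1
            self.write_sets[self.version] = frozenset(write_set)
            return True

    def written_since(self, snapshot_version):
        """Return (keys written after snapshot_version, current head version)."""
        with self._lock:
            keys = set()
            for v in range(snapshot_version + 1, self.version + 1):
                keys |= self.write_sets[v]
            return keys, self.version


def full_jitter_delay(attempt, base=0.05, cap=30.0):
    """Full jitter: uniform delay in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def commit(store, snapshot_version, write_set, max_retries=10, sleep=time.sleep):
    """Try to commit write_set on top of the snapshot; return the new version."""
    write_set = set(write_set)
    for attempt in range(max_retries):
        concurrent, head = store.written_since(snapshot_version)
        overlap = concurrent & write_set
        if overlap:
            # A true conflict: retrying cannot help, so surface it to the caller.
            raise ConflictError(sorted(overlap))
        if store.compare_and_swap(head, write_set):
            return head + 1
        # Lost the CAS race to a disjoint commit: back off and re-check.
        sleep(full_jitter_delay(attempt))
    raise RuntimeError("commit retries exhausted")
```

In this model a transaction that started from an old snapshot still commits as long as the keys written since then do not overlap its own write set. Only an overlap raises `ConflictError`, and a lost CAS race is retried after a randomized delay.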

Storage Layout

{bucket}/{prefix}/
  manifests/v{N:020d}.json   # Immutable manifest per version
  segments/{uuid}.seg        # Immutable append-only segments (msgpack)
  root.json                  # Eventually-consistent version hint
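
As a small illustration of the layout above, the object keys could be built like this. The helper names (`manifest_key`, `segment_key`, `root_key`) are hypothetical and not part of the package. Only the key patterns are taken from the layout, and the keys are relative to the bucket.

```python
import uuid
from typing import Optional


def manifest_key(prefix: str, version: int) -> str:
    """Key for the immutable manifest of a given version (20-digit zero padding)."""
    return f"{prefix}/manifests/v{version:020d}.json"


def segment_key(prefix: str, segment_id: Optional[uuid.UUID] = None) -> str:
    """Key for an immutable segment; a random UUID is generated if none is given."""
    segment_id = segment_id or uuid.uuid4()
    return f"{prefix}/segments/{segment_id}.seg"


def root_key(prefix: str) -> str:
    """Key for the version hint object."""
    return f"{prefix}/root.json"


print(manifest_key("mydb", 42))  # mydb/manifests/v00000000000000000042.json
```

One likely benefit of the fixed-width `{N:020d}` format is that sorting manifest keys lexicographically, which is how S3 general purpose buckets order listings, gives the same order as sorting by version number.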

Connection URL

distributed_sqlite+distributed_sqlite:///<bucket>/<prefix>
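
To make the URL shape concrete, here is a minimal sketch of how such a URL breaks down into bucket and prefix. `parse_url` and `StoreLocation` are hypothetical names for illustration, and the package's own parsing may differ. The sketch assumes the first path component is the bucket and everything after it is the prefix. It uses plain string splitting because the underscore in the scheme is outside the characters RFC 3986 allows in a scheme, so generic URL parsers may not recognize it.

```python
from typing import NamedTuple

SCHEME = "distributed_sqlite+distributed_sqlite"


class StoreLocation(NamedTuple):
    bucket: str
    prefix: str


def parse_url(url: str) -> StoreLocation:
    """Split 'distributed_sqlite+distributed_sqlite:///<bucket>/<prefix>'."""
    scheme, sep, rest = url.partition(":///")
    if not sep or scheme != SCHEME:
        raise ValueError(f"not a {SCHEME} URL: {url!r}")
    bucket, _, prefix = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket in URL: {url!r}")
    # Assumption: surrounding slashes on the prefix are not significant.
    return StoreLocation(bucket, prefix.strip("/"))


location = parse_url("distributed_sqlite+distributed_sqlite:///my-bucket/mydb")
print(location.bucket, location.prefix)  # my-bucket mydb
```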

Quick Start

from distributed_sqlite.engine import bootstrap, open_connection, create_engine

# Initialize the store (idempotent)
bootstrap("my-bucket", "mydb")

# Raw DBAPI2 connection
with open_connection("my-bucket", "mydb") as conn:
    cur = conn.cursor()
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users VALUES (1, 'Alice')")
    conn.commit()

# SQLAlchemy engine
import sqlalchemy as sa
engine = create_engine("distributed_sqlite+distributed_sqlite:///my-bucket/mydb")

Environment Variables

Variable                                 Default                       Description
AWS_ACCESS_KEY_ID                        (none)                        AWS credentials
AWS_SECRET_ACCESS_KEY                    (none)                        AWS credentials
AWS_DEFAULT_REGION                       us-east-1                     AWS region
AWS_ENDPOINT_URL                         (none)                        Custom endpoint (LocalStack, MinIO)
DISTRIBUTED_SQLITE_CACHE_DIR             ~/.distributed_sqlite/cache   Local cache directory
DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL   50                            Delta segments between checkpoints
DISTRIBUTED_SQLITE_MAX_RETRIES           10                            Max commit retry attempts
DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS    0.05                          Backoff base delay
DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS     30.0                          Max backoff delay
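
For reference, the variables and defaults listed above could be read into a typed settings object along these lines. `Settings` and `load_settings` are hypothetical names, not the package's API. The sketch only mirrors the documented names and defaults, and it leaves the AWS credential variables to the AWS SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    region: str
    endpoint_url: Optional[str]
    cache_dir: str
    checkpoint_interval: int
    max_retries: int
    retry_base_seconds: float
    retry_max_seconds: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        region=env.get("AWS_DEFAULT_REGION", "us-east-1"),
        # An empty or missing endpoint means "use the real AWS endpoint".
        endpoint_url=env.get("AWS_ENDPOINT_URL") or None,
        cache_dir=os.path.expanduser(
            env.get("DISTRIBUTED_SQLITE_CACHE_DIR", "~/.distributed_sqlite/cache")
        ),
        checkpoint_interval=int(env.get("DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL", "50")),
        max_retries=int(env.get("DISTRIBUTED_SQLITE_MAX_RETRIES", "10")),
        retry_base_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS", "0.05")),
        retry_max_seconds=float(env.get("DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS", "30.0")),
    )


# Example: point at a local LocalStack endpoint and lower the retry budget.
local = load_settings({
    "AWS_ENDPOINT_URL": "http://localhost:4566",
    "DISTRIBUTED_SQLITE_MAX_RETRIES": "3",
})
```

Passing the environment in as a mapping keeps the loader easy to test without touching the process environment.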

Architecture

See docs/architecture.md for the full design narrative.

Development

cp .env.example .env   # fill in your AWS credentials
uv sync
uv run pytest tests/ -v
