Distributed SQLite-compatible storage engine backed by S3
distributed-sqlite
A distributed SQLite-compatible storage engine backed solely by AWS S3.
Overview
distributed-sqlite provides a standard SQLAlchemy/DBAPI2 interface over an
append-only, segment-based storage model on S3. It supports:
- Snapshot isolation — each transaction reads from a consistent snapshot
- Optimistic concurrency — CAS-based manifest commits with automatic retry
- Conflict detection — write-set intersection check; raises ConflictError on true conflicts
- Exponential backoff with jitter — full jitter retry up to 10 attempts
- WAL-like semantics — immutable segments + versioned manifests, never mutates committed data
- Crash recovery — orphaned segments (written but not committed) are detected and safely ignored
- Alembic migrations — Alembic sees a standard SQLite interface; all DDL and migration ops work unchanged
- Local caching — LRU disk cache for segments, in-memory snapshot cache
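The optimistic-concurrency and conflict-detection bullets above can be illustrated with a small in-memory sketch. This is not the library's implementation: `InMemoryStore`, `put_if_absent`, and `commit` are hypothetical names, the dictionary stands in for the S3 manifest objects, and the `ConflictError` class here is a local stand-in for the one the package raises.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Local stand-in for the package's ConflictError (assumption)."""


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for S3: version number -> keys written by that version."""

    manifests: dict = field(default_factory=lambda: {0: frozenset()})

    def latest_version(self) -> int:
        return max(self.manifests)

    def put_if_absent(self, version: int, write_set: frozenset) -> bool:
        """Compare-and-swap: succeed only if this version does not exist yet."""
        if version in self.manifests:
            return False
        self.manifests[version] = write_set
        return True


def commit(store: InMemoryStore, snapshot_version: int, write_set, max_retries: int = 10) -> int:
    """Try to publish write_set as the next version; return the new version number."""
    write_set = frozenset(write_set)
    checked_up_to = snapshot_version
    for _ in range(max_retries):
        latest = store.latest_version()
        # Compare our write set with every version committed since our snapshot.
        for version in range(checked_up_to + 1, latest + 1):
            overlap = write_set & store.manifests[version]
            if overlap:
                raise ConflictError(f"keys {sorted(overlap)} changed in v{version}")
        checked_up_to = latest
        # CAS step: only one writer can create manifest latest + 1.
        if store.put_if_absent(latest + 1, write_set):
            return latest + 1
        # Another writer won the race; loop to re-read and re-check.
    raise RuntimeError("commit retries exhausted")
```

In this sketch, two writers that start from the same snapshot and touch disjoint keys both succeed (the second is rebased onto the first), while a writer whose keys overlap an already committed version gets `ConflictError`.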
Storage Layout
{bucket}/{prefix}/
  manifests/v{N:020d}.json   # Immutable manifest per version
  segments/{uuid}.seg        # Immutable append-only segments (msgpack)
  root.json                  # Eventually-consistent version hint
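As a rough illustration of the layout above, the following sketch builds object keys with the same patterns. The helper names (`manifest_key`, `segment_key`, `version_from_key`) are hypothetical and not taken from the package; only the key patterns come from the layout itself. One likely reason for the 20-digit zero padding is that lexicographic key order then matches numeric version order, which is how S3 lists keys.

```python
import uuid


def manifest_key(prefix: str, version: int) -> str:
    """Key of the immutable manifest for a version, zero-padded to 20 digits."""
    return f"{prefix}/manifests/v{version:020d}.json"


def segment_key(prefix: str) -> str:
    """Key for a new immutable segment, named by a random UUID."""
    return f"{prefix}/segments/{uuid.uuid4()}.seg"


def version_from_key(key: str) -> int:
    """Recover the version number from a manifest key."""
    name = key.rsplit("/", 1)[-1]          # e.g. "v<20 digits>.json"
    return int(name[1:-len(".json")])      # drop the leading "v" and the suffix


# Zero padding keeps string order and numeric order in agreement.
keys = [manifest_key("mydb", v) for v in (100, 9, 10)]
ordered_versions = [version_from_key(k) for k in sorted(keys)]
```

Here `ordered_versions` comes out in ascending numeric order even though the keys were sorted as plain strings.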
Connection URL
distributed_sqlite+distributed_sqlite:///<bucket>/<prefix>
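To make the URL shape concrete, here is a minimal sketch that splits such a URL into bucket and prefix. `parse_url` is a hypothetical helper, not part of the package, and the handling of nested prefixes is an assumption. Plain string splitting is used because the scheme contains underscores, which `urllib.parse.urlsplit` does not accept as scheme characters.

```python
SCHEME = "distributed_sqlite+distributed_sqlite"


def parse_url(url: str) -> tuple[str, str]:
    """Split '<scheme>:///<bucket>/<prefix>' into (bucket, prefix)."""
    scheme, sep, rest = url.partition(":///")
    if not sep or scheme != SCHEME:
        raise ValueError(f"not a distributed_sqlite URL: {url!r}")
    bucket, _, prefix = rest.partition("/")
    if not bucket:
        raise ValueError("missing bucket name")
    # Assumption: everything after the bucket is the prefix, slashes included.
    return bucket, prefix.strip("/")
```

For the Quick Start URL this would yield the bucket `my-bucket` and the prefix `mydb`.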
Quick Start
from distributed_sqlite.engine import bootstrap, open_connection, create_engine
# Initialize the store (idempotent)
bootstrap("my-bucket", "mydb")
# Raw DBAPI2 connection
with open_connection("my-bucket", "mydb") as conn:
    cur = conn.cursor()
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("INSERT INTO users VALUES (1, 'Alice')")
    conn.commit()
# SQLAlchemy engine
import sqlalchemy as sa
engine = create_engine("distributed_sqlite+distributed_sqlite:///my-bucket/mydb")
Environment Variables
| Variable | Default | Description |
|---|---|---|
| AWS_ACCESS_KEY_ID | (none) | AWS credentials |
| AWS_SECRET_ACCESS_KEY | (none) | AWS credentials |
| AWS_DEFAULT_REGION | us-east-1 | AWS region |
| AWS_ENDPOINT_URL | (none) | Custom endpoint (LocalStack, MinIO) |
| DISTRIBUTED_SQLITE_CACHE_DIR | ~/.distributed_sqlite/cache | Local cache directory |
| DISTRIBUTED_SQLITE_CHECKPOINT_INTERVAL | 50 | Delta segments between checkpoints |
| DISTRIBUTED_SQLITE_MAX_RETRIES | 10 | Max commit retry attempts |
| DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS | 0.05 | Backoff base delay |
| DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS | 30.0 | Max backoff delay |
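The three retry variables suggest a capped exponential backoff with full jitter. The sketch below shows how such settings might translate into delays. It is an assumption about the formula, not the package's code: `retry_settings` and `full_jitter_delay` are hypothetical helpers, and the doubling factor per attempt is a common choice that this README does not confirm.

```python
import os
import random


def retry_settings(env=None) -> dict:
    """Read the retry variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "max_retries": int(env.get("DISTRIBUTED_SQLITE_MAX_RETRIES", "10")),
        "base": float(env.get("DISTRIBUTED_SQLITE_RETRY_BASE_SECONDS", "0.05")),
        "cap": float(env.get("DISTRIBUTED_SQLITE_RETRY_MAX_SECONDS", "30.0")),
    }


def full_jitter_delay(attempt: int, base: float, cap: float, rng=random) -> float:
    """Full jitter: pick a uniform delay between 0 and the capped exponential ceiling."""
    ceiling = min(cap, base * (2 ** attempt))  # assumed doubling per attempt
    return rng.uniform(0.0, ceiling)


settings = retry_settings({})  # empty mapping, so the defaults apply
delays = [
    full_jitter_delay(attempt, settings["base"], settings["cap"])
    for attempt in range(settings["max_retries"])
]
```

With the defaults, the ceiling starts at 0.05 s and doubles per attempt, so it would only reach the 30 s cap after more attempts than the default limit of 10 allows.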
Architecture
See docs/architecture.md for the full design narrative.
Development
cp .env.example .env # fill in your AWS credentials
uv sync
uv run pytest tests/ -v