SuperTable — versioned data lake library for SQL analytics on Parquet + Redis.

SuperTable

SuperTable — versioned data lake library for SQL analytics.

SuperTable stores structured data as immutable Parquet snapshots on object storage (S3, MinIO, Azure Blob, Google Cloud Storage, or local disk), keeps metadata, locks, and audit state in Redis, and queries everything through DuckDB (embedded) or Spark SQL. It is a Python library; there is no separate server process.

Installation

pip install supertable                # core + local storage
pip install "supertable[s3]"          # AWS S3
pip install "supertable[minio]"       # MinIO
pip install "supertable[azure]"       # Azure Blob
pip install "supertable[gcp]"         # Google Cloud Storage
pip install "supertable[all]"         # everything

Requirements: Python 3.10+, a reachable Redis 6+, and a configured storage backend (or local disk for development). See docs/02_configuration.md for environment variables.

Architecture

┌──────────────────────────────────────────────────┐
│                Python application                 │
│   (notebooks, ETL jobs, FastAPI handlers, etc.)   │
└──────────┬─────────────────────────┬──────────────┘
           │ DataWriter / DataReader │
           ▼                         ▼
   ┌───────────────┐        ┌────────────────────┐
   │  RedisCatalog │        │  StorageInterface  │
   │  metadata     │        │  Parquet files     │
   │  locks        │        │  S3 / MinIO /      │
   │  audit chain  │        │  Azure / GCP /     │
   └───────────────┘        │  Local             │
                            └────────────────────┘

Data is organised as Organization → SuperTable → SimpleTable. Each SimpleTable is a versioned, append-only collection of Parquet files backed by a snapshot linked list — every write produces a new immutable snapshot whose previous_snapshot points at the predecessor.
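
The snapshot linked list described above can be pictured with a small, self-contained sketch. The `Snapshot` class and the `commit`, `history`, and `as_of` helpers are hypothetical names chosen for illustration. They are not SuperTable's actual classes or on-disk schema, and the real library keeps this state in Redis and object storage rather than in memory.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified snapshot record (not SuperTable's real schema)."""
    version: int
    files: tuple[str, ...]                          # Parquet files visible at this version
    previous_snapshot: Optional["Snapshot"] = None  # link to the predecessor


def commit(head: Optional[Snapshot], new_files: list[str]) -> Snapshot:
    """Return a new immutable snapshot; the old head is never modified."""
    if head is None:
        return Snapshot(version=1, files=tuple(new_files))
    return Snapshot(
        version=head.version + 1,
        files=head.files + tuple(new_files),
        previous_snapshot=head,
    )


def history(head: Optional[Snapshot]) -> Iterator[Snapshot]:
    """Walk the chain from the newest snapshot back to the first."""
    node = head
    while node is not None:
        yield node
        node = node.previous_snapshot


def as_of(head: Optional[Snapshot], version: int) -> Optional[Snapshot]:
    """Point-in-time lookup: the snapshot with the requested version, if any."""
    return next((s for s in history(head) if s.version == version), None)


# Three writes produce three chained snapshots.
head = None
for batch in (["a.parquet"], ["b.parquet"], ["c.parquet"]):
    head = commit(head, batch)

versions = [s.version for s in history(head)]
files_at_v2 = as_of(head, 2).files
```

Because every snapshot is frozen and only ever points backwards, older versions stay readable after later writes, which is what makes point-in-time inspection possible without separate historical tables.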

Layer	Technology
Language	Python 3.10+
Metadata store	Redis 6+ (standalone or Sentinel HA)
Query engine (primary)	DuckDB
Query engine (large)	Spark SQL via Thrift
Data format	Apache Parquet
Object storage	MinIO / S3 / Azure / GCP / local
Mirror formats	Delta Lake, Apache Iceberg, Parquet
Audit storage	Redis Streams + Parquet

Quick example

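import pyarrow as pa

from supertable import SuperTable, DataWriter, DataReader, engine

# Bootstrap catalogue + storage
SuperTable(super_name="example", organization="my-org")

# Sample input (illustrative values); assumed to be a pyarrow.Table,
# as the name arrow_table suggests
arrow_table = pa.table({
    "day": ["2026-01-01", "2026-01-02"],
    "client": ["acme", "acme"],
    "value": [10, 20],
})

# Write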
dw = DataWriter(super_name="example", organization="my-org")
columns, rows, inserted, deleted = dw.write(
    role_name="superadmin",
    simple_name="facts",
    data=arrow_table,
    overwrite_columns=["day", "client"],
    lineage={"source_type": "manual", "source_id": "my-job"},
)

# Read
dr = DataReader(
    super_name="example",
    organization="my-org",
    query="SELECT day, sum(value) FROM facts GROUP BY day LIMIT 10",
)
df, status, message = dr.execute(role_name="superadmin", engine=engine.AUTO)
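
The write call above passes `overwrite_columns=["day", "client"]`, and the feature list names `overwrite_columns` as the upsert mechanism. The sketch below shows one plausible reading of that behaviour on plain Python dicts: existing rows whose key columns match an incoming row are replaced, and everything else is kept. This is an assumption made for illustration. The `upsert` function and its return values are hypothetical and are not part of SuperTable's API.

```python
def upsert(existing, incoming, overwrite_columns):
    """Replace existing rows whose key matches an incoming row, then append incoming.

    Returns (merged_rows, inserted_count, deleted_count).
    """
    def key(row):
        # The key is the tuple of values in the overwrite columns.
        return tuple(row[c] for c in overwrite_columns)

    incoming_keys = {key(r) for r in incoming}
    kept = [r for r in existing if key(r) not in incoming_keys]
    deleted = len(existing) - len(kept)
    return kept + list(incoming), len(incoming), deleted


existing = [
    {"day": "2026-01-01", "client": "a", "value": 1},
    {"day": "2026-01-01", "client": "b", "value": 2},
]
incoming = [
    {"day": "2026-01-01", "client": "a", "value": 10},  # same key: replaces the old row
    {"day": "2026-01-02", "client": "a", "value": 3},   # new key: plain insert
]

merged, inserted, deleted = upsert(existing, incoming, ["day", "client"])
```

In this toy model the counts loosely mirror the `inserted` and `deleted` values returned by `dw.write`, although the library's exact accounting may differ.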

Demos

The package ships two runnable demos under supertable.demo:

# Numbered tutorial — runs the full lifecycle end-to-end.
supertable-demo-quickstart
# or
python -m supertable.demo.quickstart

# Synthetic webshop dataset.
supertable-demo-webshop-generate    # build ~1.2M rows on disk
supertable-demo-webshop-load        # load them into SuperTable
supertable-demo-webshop-topup       # continuous incremental refresh

Both demos are also runnable as module steps. Examples:

python -m supertable.demo.quickstart.s01_01_01_create_super_table
python -m supertable.demo.quickstart.s03_08_read_snapshot_history
python -m supertable.demo.webshop.generate

See supertable/demo/README.md for the full script index.

What's included

Versioned tables with snapshot isolation, upsert (overwrite_columns), soft deletes (delete_only=True), schema evolution, and staleness filtering
DuckDB query engine — embedded, zero-copy reads from object storage
Spark SQL via Thrift — for queries exceeding DuckDB memory limits
RBAC — role types (superadmin, admin, writer, reader, meta) with row-level and column-level security enforced through view chains
Audit logging — tamper-evident SHA-256 hash chain in Redis Streams with Parquet export
Monitoring — MonitoringWriter pushes read/write/metric payloads to Redis lists; structured JSON logging with correlation IDs
Ingestion — staging areas (Staging) and automated ingestion pipes (SuperPipe)
Mirroring — optional Delta Lake / Iceberg / Parquet export after every write
Snapshot history — every write chains to previous_snapshot, enabling point-in-time inspection without separate historical tables
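
To make the tamper-evident audit chain in the list above more concrete, here is a minimal sketch of a SHA-256 hash chain using only the standard library. The entry layout, the `GENESIS` placeholder, and the `append` and `verify` helpers are assumptions for illustration. SuperTable's real audit records live in Redis Streams and may be structured differently.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list[dict], event: dict) -> None:
    """Add an event, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev_hash": prev_hash, "hash": entry_hash(prev_hash, event)}
    )


def verify(chain: list[dict]) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: list[dict] = []
append(log, {"action": "write", "table": "facts", "role": "superadmin"})
append(log, {"action": "read", "table": "facts", "role": "reader"})
ok_before = verify(log)

log[0]["event"]["table"] = "other"  # simulate tampering with an earlier entry
ok_after = verify(log)
```

Each hash depends on the one before it, so changing an old entry invalidates that entry and every later one unless the whole chain is recomputed.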

Documentation

See docs/00_index.md for the full table of contents.

#	Document	Description
01	Platform Overview	Architecture, package layout, deployment, data flow
02	Configuration	Environment variables and runtime settings
03	Data Model	Organization → SuperTable → SimpleTable hierarchy
04	Storage Backends	StorageInterface, S3, MinIO, Azure, GCP, local
05	Redis Catalog	Metadata store, key naming, operations, CAS
06	Data Writer	Write pipeline, locking, dedup, tombstones
07	Ingestion & Pipes	Staging areas, automated ingestion pipes
08	Distributed Locking	Redis locks, file locks, deadlock prevention
09	Query Engine	DuckDB Lite/Pro, Spark SQL, auto selection
10	Data Reader	Read facade, snapshot history, view chain
11	RBAC & Access Control	Roles, users, row/column security
12	Audit Logging	SHA-256 hash chain, DORA/SOC 2, SIEM
13	Table Mirroring	Delta Lake, Iceberg, Parquet export
14	Monitoring	Metrics writer, structured logging
15	Python SDK	Core classes, demos, example index

License

Super Table Public Use License (STPUL) v1.0 — see LICENSE.

Release history

Version	Released
2.1.1	May 27, 2026
2.1.0	May 27, 2026
2.0.9	May 27, 2026
2.0.8	May 27, 2026
2.0.7	May 27, 2026
2.0.5	May 27, 2026
2.0.4	May 17, 2026
2.0.3	May 10, 2026
2.0.2	May 6, 2026
2.0.1	May 5, 2026
2.0.0 (this version)	May 5, 2026
1.9.5	Mar 16, 2026
1.9.4	Mar 12, 2026
1.9.0	Mar 11, 2026
1.8.9	Mar 10, 2026
1.8.5	Mar 1, 2026
1.8.1	Mar 1, 2026
1.8.0	Mar 1, 2026
1.7.0	Feb 26, 2026
1.6.7	Feb 12, 2026
1.6.6	Feb 12, 2026
1.6.5	Feb 11, 2026
1.6.2	Feb 10, 2026
1.5.6	Jan 30, 2026
1.5.5	Dec 16, 2025
1.5.4	Dec 16, 2025
1.5.3	Dec 15, 2025
1.5.2	Dec 8, 2025
1.5.1	Nov 17, 2025
1.5.0	Nov 15, 2025
1.4.0	Oct 31, 2025
1.3.3	Oct 22, 2025
1.3.2	Oct 22, 2025
1.3.1	Oct 21, 2025
1.3.0	Oct 18, 2025
1.2.48	Sep 27, 2025
1.2.0	Sep 14, 2025
1.1.0	May 12, 2025
1.0.0	Apr 23, 2025
0.1.0	Apr 17, 2025

Download files

Source Distribution

supertable-2.0.0.tar.gz (401.7 kB)

Uploaded May 5, 2026 Source

Built Distribution

supertable-2.0.0-py3-none-any.whl (468.1 kB)

Uploaded May 5, 2026 Python 3

File details

Details for the file supertable-2.0.0.tar.gz.

File metadata

Download URL: supertable-2.0.0.tar.gz
Upload date: May 5, 2026
Size: 401.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for supertable-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`ae3fa2dc8d3e1b9f687c888dea0ebc619f7bfacd118e035fa14f6d2ac690d583`
MD5	`b671b67a3d67c3ab0f7260a83e0d55a0`
BLAKE2b-256	`aab76c5dea0256dc18146f86ac2762fcc76007ce08474f7aff82194442e593e0`

File details

Details for the file supertable-2.0.0-py3-none-any.whl.

File metadata

Download URL: supertable-2.0.0-py3-none-any.whl
Upload date: May 5, 2026
Size: 468.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for supertable-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`328d9109b2e96419f28e1f9c06c6d3cfc11775622ef0860f8e0929ff5bda8bcc`
MD5	`4866f9e6f00955f7a0d6b812120d5d8d`
BLAKE2b-256	`e4b8ae40623c9e054215eb34c968ef0c223a2276196b3640c3a01ad8881bfb75`
