Quantum-Safe Columnar Storage Format with row-granular lazy decryption

QPQT - Quantum-Safe Columnar Storage Format

A purpose-built binary columnar file format (.qpqt) with native post-quantum cryptography and row-granular lazy decryption, a capability no existing columnar format offers.

Cryptographic stack: ML-KEM-768 (FIPS 203) + HKDF-SHA3-256 + AES-256-GCM (AES per FIPS 197, GCM per SP 800-38D)

Quick Start

# 1. Install dependencies (liboqs + OpenSSL)
bash scripts/install_deps.sh

# 2. Build
mkdir build && cd build && cmake .. && make -j$(nproc) && cd ..

# 3. Generate a quantum-safe keypair
./build/qpqt keygen --out-pub pub.bin --out-sec sec.bin

# 4. Encrypt a CSV - ssn and dob become quantum-safe encrypted columns
./build/qpqt encrypt \
    --input customers.csv \
    --pqc-columns ssn,dob \
    --pub-key pub.bin \
    --output customers.qpqt

# 5. Inspect the file (no keys required - safe to run anywhere)
./build/qpqt inspect --input customers.qpqt

# 6. Decrypt for authorized users (lazy - only matching rows decrypted)
./build/qpqt decrypt \
    --input customers.qpqt \
    --sec-key sec.bin \
    --where "customer_id=12345" \
    --output result.csv

Python:

import pandas as pd
import qpqt

pub, sec = qpqt.keygen()
kid = qpqt.generate_key_id()

# Write
w = qpqt.Writer("customers.qpqt",
                column_names=["id", "state", "ssn"],
                column_types=["int32", "string", "string"],
                pqc_columns=["ssn"],
                public_key=pub, key_id=kid)
w.write_batch({"id":[1,2,3], "state":["CA","NY","TX"], "ssn":["111","222","333"]}, 3)
w.close()

# Read - lazy decryption, only matching rows decrypted
r = qpqt.Reader("customers.qpqt")
r.set_secret_key(sec)
df = pd.DataFrame(r.query(where={"id": 2}))

The Problem

Enterprises face a dual mandate: regulatory pressure to adopt post-quantum cryptography (CNSA 2.0, NIST FIPS 203, deadline 2035) and the need to maintain query performance on large-scale columnar data warehouses.

The naive approach - applying ML-KEM-768 at the row level - costs 9,600ms for 1M rows even with 4-core parallelization. That establishes the upper bound of the problem: PQC done wrong is unusable at analytical query scale.

The Solution

QPQT redesigns the storage format around PQC cost:

Hybrid KEM construction - ML-KEM-768 is used once per 4,096-row page to encapsulate an AES-256-GCM page key. This reduces KEM operations from 1M to 250 per million rows.
Fully separated column sections - structural (unencrypted) and PQC columns are physically isolated on disk at 4KB OS page boundaries. Predicates run on structural columns without loading the PQC section into CPU cache.
Row-granular lazy decryption - predicates execute on cheap structural columns first. Only the individual rows that survive the predicate trigger KEM decapsulation and AES-GCM decryption.
O(1) manifest lookup - a flat crypto manifest in the footer maps any row to its page key via pointer arithmetic.
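
The page and manifest arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not the on-disk layout: the constant and function names are hypothetical, and it assumes that pages never span a row group boundary (the 100,000-row row group size comes from the File Format section). Under that assumption each row group holds 25 pages, which reproduces the 250 KEM operations per million rows quoted above.

```python
ROWS_PER_PAGE = 4_096          # from the text: one ML-KEM encapsulation per page
ROWS_PER_ROW_GROUP = 100_000   # from the File Format diagram

# Assumption: pages do not span row groups, so the last page of a group is short.
PAGES_PER_ROW_GROUP = -(-ROWS_PER_ROW_GROUP // ROWS_PER_PAGE)  # ceiling division


def manifest_index(row: int) -> int:
    """Map a global row number to its page-key slot in a flat manifest, in O(1)."""
    group, row_in_group = divmod(row, ROWS_PER_ROW_GROUP)
    return group * PAGES_PER_ROW_GROUP + row_in_group // ROWS_PER_PAGE


def kem_operations(total_rows: int) -> int:
    """Number of page keys (KEM encapsulations) needed for total_rows rows."""
    full_groups, tail_rows = divmod(total_rows, ROWS_PER_ROW_GROUP)
    tail_pages = -(-tail_rows // ROWS_PER_PAGE)
    return full_groups * PAGES_PER_ROW_GROUP + tail_pages


print(kem_operations(1_000_000))                                          # 250
print(manifest_index(0), manifest_index(4_096), manifest_index(100_000))  # 0 1 25
```

Because the lookup is pure integer arithmetic, finding the page key for a row needs no search through the manifest.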

Performance - Honest Three-Baseline Comparison

Benchmarked on Kaggle Xeon CPU (4 cores), 1M rows, real ML-KEM-768 + AES-256-GCM.

The table compares three configurations: two baselines and QPQT. Both baselines are measured, not estimated:

Naive per-row PQC - row-level ML-KEM encapsulation. Establishes the upper bound of the problem. This is what a quick liboqs integration produces.
Competent per-page PQC - the correct hybrid KEM construction (per-page ML-KEM + AES-GCM, exactly like QPQT) but stored in a plain layout with no column separation and no lazy decryption. It decrypts every row in the queried column because decryption is chunk-granular. This isolates QPQT's actual contribution.

Selectivity	Naive per-row	Competent per-page	QPQT	QPQT vs competent
1%	9,600ms	2,150ms	78ms	27.6x
5%	9,600ms	2,111ms	163ms	12.9x
10%	9,600ms	2,113ms	264ms	8.0x
25%	9,600ms	2,103ms	557ms	3.8x
50%	9,600ms	2,148ms	1,055ms	2.0x
100%	9,600ms	2,147ms	2,098ms	1.02x (no advantage)

Reading this table honestly:

QPQT's contribution is row-granular lazy decryption. At low selectivity (1-10% in the table above), the common case for analytical queries, it decrypts one to two orders of magnitude fewer rows than a competent but columnar-unaware implementation, which translates into an 8-27x speedup.

As selectivity approaches 100%, the advantage shrinks to parity: when every row survives the predicate, QPQT and the competent baseline do identical work. At 100% selectivity QPQT offers no advantage over competent per-page PQC - and that is expected, because there is nothing to skip.
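
One way to check this reading is to fit a straight line through the two endpoint measurements and see how well it predicts the rows in between. The sketch below does that. The linear model and the function name are illustrative assumptions, not part of the benchmark methodology; the timings are copied from the QPQT column of the table.

```python
# QPQT timings from the table above: selectivity -> milliseconds.
measured_ms = {0.01: 78, 0.05: 163, 0.10: 264, 0.25: 557, 0.50: 1055, 1.00: 2098}

# Assumed model: cost grows linearly with selectivity.
# The line is fitted through the 1% and 100% endpoints only.
slope = (measured_ms[1.00] - measured_ms[0.01]) / (1.00 - 0.01)


def predicted_ms(selectivity: float) -> float:
    """Predict QPQT query time from the two-point linear fit."""
    return measured_ms[0.01] + (selectivity - 0.01) * slope


for s, actual in measured_ms.items():
    error = abs(predicted_ms(s) - actual) / actual
    print(f"{s:>5.0%}  measured {actual:>5} ms  "
          f"predicted {predicted_ms(s):7.1f} ms  error {error:.1%}")
```

The intermediate predictions land within a few percent of the measurements, which is consistent with decryption cost growing in proportion to the number of surviving rows.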

The win is real precisely where real queries live: selective filters on large tables. It is not a universal speedup, and the methodology isolates exactly what QPQT adds versus what any competent PQC implementation would already do.

Other measured numbers:

Metric	Value
Write throughput (1M rows)	534K rows/sec (1,871ms)
Structural scan (no crypto)	5ms, 188M rows/sec
File size (1M rows)	80MB
Storage vs naive per-row ML-KEM	80MB vs ~1,084MB (92% reduction)

Cryptographic Design

ML-KEM-768 keypair  ->  secret key stored in KMS (file holds only key_id)
                                |
                        Per page (4,096 rows):
                        ML-KEM-768 encapsulate(public_key)
                            |-- kem_ciphertext  ->  CRYPTO MANIFEST
                            +-- shared_secret (32 bytes)
                                        |
                                HKDF-SHA3-256(shared_secret, page_context)
                                        +-- aes_page_key (32 bytes, unique per page)
                                                    |
                                            AES-256-GCM per row
                                            |-- IV (12B, deterministic)
                                            |-- ciphertext (= plaintext length)
                                            +-- auth_tag (16B, tamper detection)
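
The key-derivation step in the middle of this diagram can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions: HKDF is implemented as the RFC 5869 extract-then-expand construction over HMAC-SHA3-256, the `page_context` encoding is hypothetical (its real layout is not specified here), and the ML-KEM shared secret is replaced by random bytes because ML-KEM is not in the standard library.

```python
import hashlib
import hmac
import os


def hkdf_sha3_256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869 extract-then-expand) instantiated with HMAC-SHA3-256."""
    hash_len = hashlib.sha3_256().digest_size  # 32 bytes
    if not salt:
        salt = bytes(hash_len)  # RFC 5869 default: HashLen zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha3_256).digest()  # extract step
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand step
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha3_256).digest()
        okm += block
        counter += 1
    return okm[:length]


def page_context(file_uuid: bytes, page_index: int) -> bytes:
    # Hypothetical context encoding, chosen only for this illustration.
    return b"qpqt-page-key" + file_uuid + page_index.to_bytes(8, "big")


# Stand-in for the 32-byte shared secret that ML-KEM-768 encapsulation would return.
shared_secret = os.urandom(32)
file_uuid = os.urandom(16)

aes_page_key = hkdf_sha3_256(shared_secret, page_context(file_uuid, 0))
assert len(aes_page_key) == 32  # AES-256 key size
```

Binding the page index into the context means two pages would derive different AES keys even if they somehow shared a KEM secret.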

IV construction and GCM nonce safety

QPQT uses deterministic AES-GCM IVs. This is safe because nonce uniqueness is guaranteed within every key scope. Each 4,096-row page derives its own unique AES-256 key via ML-KEM encapsulation + HKDF-SHA3-256. The IV only needs to be unique under a given key, and within a single page key the (row_index, column_index) tuple is unique by construction. The file_uuid component prevents cross-file collision in the event a page key is ever reused across files. There is no nonce reuse under any single key - the failure mode that breaks GCM does not occur.
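
The uniqueness argument can be made concrete with a small sketch. The byte layout below is an assumption made for illustration only (the exact IV encoding is not given here): 4 bytes taken from file_uuid, a 4-byte row index within the page, and a 4-byte column index. Any injective encoding of the (row_index, column_index) tuple would serve the same purpose.

```python
import struct

ROWS_PER_PAGE = 4_096


def row_iv(file_uuid: bytes, row_index: int, column_index: int) -> bytes:
    """Build a 12-byte deterministic GCM IV (hypothetical layout, for illustration).

    Layout: 4 bytes of file_uuid | 4-byte row index in page | 4-byte column index.
    """
    if not 0 <= row_index < ROWS_PER_PAGE:
        raise ValueError("row_index must be relative to the page")
    return file_uuid[:4] + struct.pack(">II", row_index, column_index)


file_uuid = bytes(range(16))
ivs = {row_iv(file_uuid, r, c) for r in range(ROWS_PER_PAGE) for c in range(3)}

assert len(ivs) == ROWS_PER_PAGE * 3      # no IV repeats under one page key
assert all(len(iv) == 12 for iv in ivs)   # 96-bit IVs
```

Since the encoding is a fixed-width concatenation, two different (row_index, column_index) pairs can never produce the same IV within a page.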

All components are NIST-approved and quantum-safe:

ML-KEM-768: FIPS 203 (replaces RSA/ECDH for key establishment)
AES-256-GCM: FIPS 197 for AES, SP 800-38D for GCM (quantum-safe as a symmetric primitive; Grover's algorithm at most halves the effective key strength, leaving 128-bit security)
HKDF-SHA3-256: SP 800-56C

Why a Separate Format (and not Parquet)?

A reasonable question: Parquet already has Modular Encryption - why not derive its AES key from ML-KEM and get quantum-safe Parquet today?

For encryption alone, you could. Parquet Modular Encryption does per-column AES-GCM and you could wrap the key with ML-KEM. The encryption is not the contribution.

The contribution is row-granular lazy decryption. Parquet does support predicate pushdown and can skip entire encrypted column chunks or row groups via footer statistics - that is real and valuable. What it cannot do is decrypt only the surviving rows within a chunk that the predicate did not eliminate wholesale. Parquet decrypts at chunk granularity, not surviving-row granularity. Closing that specific gap is what requires a format where structural columns are physically separated (so the predicate runs before any decryption) and where a manifest addresses individual rows' page keys.
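
The control flow that distinguishes surviving-row granularity from chunk granularity can be sketched as follows. This is not QPQT's reader: the function names are hypothetical, the cryptographic calls are replaced by stand-ins that only count invocations, and caching one key per touched page is an assumption of the sketch.

```python
ROWS_PER_PAGE = 4_096


def lazy_query(structural, encrypted, predicate, decapsulate, decrypt_row):
    """Evaluate the predicate on plaintext values, then decrypt only surviving rows."""
    page_keys = {}  # page index -> page key, so each touched page is decapsulated once
    results = []
    for row, value in enumerate(structural):
        if not predicate(value):
            continue  # row eliminated before any cryptographic work
        page = row // ROWS_PER_PAGE
        if page not in page_keys:
            page_keys[page] = decapsulate(page)
        results.append((row, decrypt_row(page_keys[page], row, encrypted[row])))
    return results


# Stand-ins that only count calls; real code would call ML-KEM and AES-256-GCM here.
calls = {"decapsulate": 0, "decrypt": 0}


def fake_decapsulate(page):
    calls["decapsulate"] += 1
    return f"key-{page}"


def fake_decrypt(key, row, blob):
    calls["decrypt"] += 1
    return blob  # no real decryption in this sketch


n = 20_000
customer_ids = list(range(n))                # structural (unencrypted) column
ssn_blobs = [f"blob-{i}" for i in range(n)]  # placeholders for encrypted cells

hits = lazy_query(customer_ids, ssn_blobs, lambda v: v % 100 == 0,
                  fake_decapsulate, fake_decrypt)
print(len(hits), calls)  # 200 {'decapsulate': 5, 'decrypt': 200}
```

With a 1% predicate over 20,000 rows, the sketch performs 200 row decryptions and 5 decapsulations, whereas a chunk-granular reader would decrypt all 20,000 cells of the column.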

QPQT is a purpose-built format for organizations that need PQC-protected columnar data with row-granular lazy decryption. Existing tools integrate via the CLI, Python bindings, and Arrow export rather than reading .qpqt natively.

File Format

+-----------------------------------------------------+
| FILE HEADER (48 bytes)                              |
| magic + version + file_uuid + total_rows + offsets  |
+-----------------------------------------------------+
| SCHEMA BLOCK (variable)                             |
+-----------------------------------------------------+
| KEY REFERENCE BLOCK (32 bytes) - key_id, not the key|
+-----------------------------------------------------+
| ROW GROUP 0  (100,000 rows)                         |
|  |-- SECTION 1: Structural columns (unencrypted)    |
|  |   [tightly packed, padded to 4KB boundary]       |
|  +-- SECTION 2: PQC columns (AES-256-GCM per row)   |
|      [starts on 4KB OS page boundary]               |
+-----------------------------------------------------+
| ROW GROUP 1 ... N                                   |
+-----------------------------------------------------+
| FILE FOOTER                                         |
|  |-- Row group offset table                         |
|  |-- CRYPTO MANIFEST (flat array, O(1) lookup)      |
|  +-- FOOTER HEADER (40 bytes) + CRC32               |
+-----------------------------------------------------+
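
The 4KB alignment between SECTION 1 and SECTION 2 comes down to rounding an offset up to the next multiple of 4,096. A minimal sketch of that arithmetic, with hypothetical helper names and a made-up example offset:

```python
OS_PAGE = 4_096  # 4KB boundary from the layout above


def align_up(offset: int, boundary: int = OS_PAGE) -> int:
    """Round offset up to the next multiple of boundary (boundary must be a power of two)."""
    return (offset + boundary - 1) & ~(boundary - 1)


def padding_after(section_end: int) -> int:
    """Bytes of padding needed so the next section starts on a 4KB boundary."""
    return align_up(section_end) - section_end


structural_end = 1_234_567  # hypothetical end offset of SECTION 1
pqc_start = align_up(structural_end)
print(pqc_start, padding_after(structural_end))  # 1236992 2425
```

Starting the PQC section on its own OS page means a scan over the structural section never has to touch a page that also holds ciphertext.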

Key Management

./qpqt keygen --out-pub pub.bin --out-sec sec.bin

pub.bin - ML-KEM-768 public key (1184 bytes). Safe to share with writers.
sec.bin - ML-KEM-768 secret key (2400 bytes). Never share. Never commit.
pub.bin.keyid - 16-byte key ID. Pass to --key-id when encrypting.
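
Because the three files have fixed sizes, a loader can reject a truncated or mixed-up key file before it ever reaches the cryptographic layer. The helper below is a hypothetical convenience, not part of the qpqt package; the sizes are the ones listed above.

```python
from pathlib import Path

# Sizes quoted above for ML-KEM-768 key material and the QPQT key ID.
EXPECTED_SIZES = {"public": 1184, "secret": 2400, "key_id": 16}


def check_key_file(path, kind: str) -> bytes:
    """Read a key file and fail early if its length does not match the expected size."""
    data = Path(path).read_bytes()
    expected = EXPECTED_SIZES[kind]
    if len(data) != expected:
        raise ValueError(
            f"{path}: expected {expected} bytes for {kind} file, got {len(data)}"
        )
    return data
```

For example, `check_key_file("pub.bin", "public")` would return the key bytes, while passing `sec.bin` by mistake would raise a `ValueError` because its length is 2400 rather than 1184.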

Environment	Recommended key storage
Local dev	Outside repo, e.g. `~/.qpqt/keys/`
AWS	AWS KMS + Secrets Manager
Azure	Azure Key Vault
GCP	Cloud KMS
Databricks	`dbutils.secrets`
On-premise	HashiCorp Vault or HSM

QPQT stores a key_id reference in the file (the key reference block shown in the File Format section), not the key itself, so key rotation never requires rewriting existing data files.

Build

Prerequisites

Ubuntu 22.04 or Debian 12
CMake 3.16+, OpenSSL 3.x, C++17 compiler with OpenMP

Steps

bash scripts/install_deps.sh        # installs liboqs from source
mkdir build && cd build
cmake .. && make -j$(nproc)
./qpqt_tests

Ecosystem Integration

Tool	How
CLI	`qpqt encrypt/decrypt/inspect` on CSV (Parquet with Arrow build)
Python / pandas	`pip install .` then `import qpqt`
DuckDB / Polars / Spark	`qpqt_arrow export` produces structural columns as Arrow IPC

License

MIT

Author

Rohan Prabhakar

Project details

Maintainers

prabhakarroh

Release history

0.3.0 - Jun 4, 2026
0.2.0 - Jun 3, 2026
0.1.1 - Jun 2, 2026
0.1.0 - Jun 2, 2026 (this version)

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl (4.0 MB)

Uploaded Jun 2, 2026 - CPython 3.12, manylinux: glibc 2.28+ x86-64

qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl (4.0 MB)

Uploaded Jun 2, 2026 - CPython 3.11, manylinux: glibc 2.28+ x86-64

qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl (4.0 MB)

Uploaded Jun 2, 2026 - CPython 3.10, manylinux: glibc 2.28+ x86-64

qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl (4.0 MB)

Uploaded Jun 2, 2026 - CPython 3.9, manylinux: glibc 2.28+ x86-64

qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl (4.0 MB)

Uploaded Jun 2, 2026 - CPython 3.8, manylinux: glibc 2.28+ x86-64

File details

Details for the file qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl.

File metadata

Download URL: qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl
Upload date: Jun 2, 2026
Size: 4.0 MB
Tags: CPython 3.12, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`645c7d501ff5dc0aec5e06a3a3a920c8758aadeab666d4079decc5c74511945a`
MD5	`83ebf18158cfe24f0ec7794b2284429b`
BLAKE2b-256	`8c68ba114ef45e681cda5f9c8592ff0a11071102b6446ca3ab61980ae99ce857`

Provenance

The following attestation bundles were made for qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl:

Publisher: wheels.yml on Rohan-Prabhakar/QPQT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qpqt-0.1.0-cp312-cp312-manylinux_2_28_x86_64.whl
- Subject digest: 645c7d501ff5dc0aec5e06a3a3a920c8758aadeab666d4079decc5c74511945a
- Sigstore transparency entry: 1704239995
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: Rohan-Prabhakar/QPQT@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rohan-Prabhakar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Trigger Event: push

File details

Details for the file qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

Download URL: qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl
Upload date: Jun 2, 2026
Size: 4.0 MB
Tags: CPython 3.11, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`ab9e30a55fd97c8c79e477b59a471d4b2e04cfaa58ebef0a11de43393ab07994`
MD5	`0f83461e1cf42f9b8af9c6235126dff1`
BLAKE2b-256	`6c4ea85b15db96d79199393d7085a797e06ba80f0d6f784668206c0bdfd8d911`

Provenance

The following attestation bundles were made for qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl:

Publisher: wheels.yml on Rohan-Prabhakar/QPQT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qpqt-0.1.0-cp311-cp311-manylinux_2_28_x86_64.whl
- Subject digest: ab9e30a55fd97c8c79e477b59a471d4b2e04cfaa58ebef0a11de43393ab07994
- Sigstore transparency entry: 1704239960
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: Rohan-Prabhakar/QPQT@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rohan-Prabhakar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Trigger Event: push

File details

Details for the file qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

Download URL: qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl
Upload date: Jun 2, 2026
Size: 4.0 MB
Tags: CPython 3.10, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`9489fcbbdb2a8b955e36f0798acdff193caea654ac3c6a8411e873c0abcb231a`
MD5	`51387d6031a59b871b49350f645bef6a`
BLAKE2b-256	`bf03aa481d102cfa49aace165fa6cbff6b790f26bc0d98c42d2a5d179bb6d3e9`

Provenance

The following attestation bundles were made for qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl:

Publisher: wheels.yml on Rohan-Prabhakar/QPQT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qpqt-0.1.0-cp310-cp310-manylinux_2_28_x86_64.whl
- Subject digest: 9489fcbbdb2a8b955e36f0798acdff193caea654ac3c6a8411e873c0abcb231a
- Sigstore transparency entry: 1704239977
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: Rohan-Prabhakar/QPQT@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rohan-Prabhakar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Trigger Event: push

File details

Details for the file qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl.

File metadata

Download URL: qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl
Upload date: Jun 2, 2026
Size: 4.0 MB
Tags: CPython 3.9, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`df21de8736643e5f85d114768c85f07d618953824b7868be4e5860d3e2cccd06`
MD5	`9ad96acbbbc9e80e9b99b983e66e8996`
BLAKE2b-256	`78d1cdd8d884c8b926777eec2302bd54beb340009ef198329ebfd73eb1308339`

Provenance

The following attestation bundles were made for qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl:

Publisher: wheels.yml on Rohan-Prabhakar/QPQT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qpqt-0.1.0-cp39-cp39-manylinux_2_28_x86_64.whl
- Subject digest: df21de8736643e5f85d114768c85f07d618953824b7868be4e5860d3e2cccd06
- Sigstore transparency entry: 1704239938
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: Rohan-Prabhakar/QPQT@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rohan-Prabhakar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Trigger Event: push

File details

Details for the file qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl.

File metadata

Download URL: qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl
Upload date: Jun 2, 2026
Size: 4.0 MB
Tags: CPython 3.8, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`b8ec899749f8f60595259ec22c1ada1a2e9455e7949d7291866945c86b370e72`
MD5	`f1ddc0617c1ad06fc125ce8a571a9f53`
BLAKE2b-256	`c78372b1b275f6418c6134cc48981be9cd85d802e7de62d61d345adc572943d2`

Provenance

The following attestation bundles were made for qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl:

Publisher: wheels.yml on Rohan-Prabhakar/QPQT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qpqt-0.1.0-cp38-cp38-manylinux_2_28_x86_64.whl
- Subject digest: b8ec899749f8f60595259ec22c1ada1a2e9455e7949d7291866945c86b370e72
- Sigstore transparency entry: 1704239913
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: Rohan-Prabhakar/QPQT@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Rohan-Prabhakar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: wheels.yml@dd924c4ed107e6c631cd947a5cd0f46399f03467
- Trigger Event: push
