Blazingly fast DataFrame library with Parquet encryption support (AES-256-GCM), not production ready
polars-parquet-encrypt
Blazingly fast DataFrame library with Parquet encryption support
This package is a full replacement for Polars with built-in AES-256-GCM page-level encryption for Parquet files.
⚠️ Not production ready - This is a test/research package
Why This Package?
The official PyPI polars package doesn't include encryption support. This package provides:
- ✅ Full Polars functionality - Everything from standard Polars
- ✅ Encryption built-in - No need to build from source
- ✅ Drop-in replacement - Just pip install and use
Installation
pip install polars-parquet-encrypt
That's it! No Rust toolchain, no maturin, no source builds required.
Usage
Basic Encryption/Decryption
import polars as pl
import os
# Generate 32-byte key for AES-256
key = os.urandom(32)
# Write encrypted parquet file
df = pl.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "salary": [50000, 60000, 75000, 80000, 95000],
})
df.write_parquet("encrypted.parquet", encryption_key=key)
# Read encrypted parquet file
df_read = pl.read_parquet("encrypted.parquet", encryption_key=key)
print(df_read)
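A key created with os.urandom only lives as long as the process, so it has to be stored somewhere to read the file back later. The sketch below is one possible way to do that, assuming the key is kept as a hex string in an environment variable. The variable name PARQUET_ENCRYPTION_KEY and the helper load_key_from_env are hypothetical and not part of this package.

```python
import os

# Hypothetical variable name, chosen for this example only.
KEY_ENV_VAR = "PARQUET_ENCRYPTION_KEY"
KEY_LEN = 32  # AES-256 needs exactly 32 bytes


def load_key_from_env(var: str = KEY_ENV_VAR) -> bytes:
    """Read a hex-encoded key from the environment and check its length."""
    hex_key = os.environ.get(var)
    if hex_key is None:
        raise KeyError(f"environment variable {var} is not set")
    key = bytes.fromhex(hex_key)  # raises ValueError on malformed hex
    if len(key) != KEY_LEN:
        raise ValueError(f"expected {KEY_LEN} key bytes, got {len(key)}")
    return key


# Generate once, store the hex form in a secret store or environment:
new_key = os.urandom(KEY_LEN)
hex_form = new_key.hex()  # 64 hex characters
```

The returned bytes can then be passed as encryption_key in the calls shown above.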
Lazy Scanning with Encryption
# Lazy scan with encryption
lf = pl.scan_parquet("encrypted.parquet", encryption_key=key)
result = lf.filter(pl.col("salary") > 70000).collect()
print(result)
Cloud Storage (Azure, S3, etc.)
# Works with cloud storage too
storage_options = {"account_name": "myaccount"}
df.write_parquet(
    "abfs://container/encrypted.parquet",
    encryption_key=key,
    storage_options=storage_options,
)
df_read = pl.read_parquet(
    "abfs://container/encrypted.parquet",
    encryption_key=key,
    storage_options=storage_options,
)
Security Features
Encryption
- Algorithm: AES-256-GCM (authenticated encryption)
- Key size: Exactly 32 bytes (256 bits)
- Nonce: Unique 12-byte random nonce per page
- Authentication tag: 16-byte GCM tag for integrity
- Format: [nonce(12) | ciphertext | tag(16)] per page
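To make the per-page layout concrete, here is a minimal sketch that splits one encrypted page buffer into its three parts. It assumes the layout is exactly as listed above and is an illustration only, not the package's internal code. The function names are hypothetical.

```python
NONCE_LEN = 12  # bytes, random nonce at the start of each page
TAG_LEN = 16    # bytes, GCM authentication tag at the end of each page


def split_encrypted_page(page: bytes) -> tuple[bytes, bytes, bytes]:
    """Split [nonce | ciphertext | tag] into (nonce, ciphertext, tag)."""
    if len(page) < NONCE_LEN + TAG_LEN:
        raise ValueError("page too short to hold a nonce and a tag")
    nonce = page[:NONCE_LEN]
    ciphertext = page[NONCE_LEN:len(page) - TAG_LEN]
    tag = page[len(page) - TAG_LEN:]
    return nonce, ciphertext, tag


def encrypted_page_size(plaintext_len: int) -> int:
    """GCM ciphertext is as long as the plaintext, so the overhead is 28 bytes."""
    return NONCE_LEN + plaintext_len + TAG_LEN
```

Under this layout every encrypted page is 28 bytes larger than its plaintext, regardless of page size.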
What's Encrypted
- ✅ Data pages: All column values encrypted
- ✅ Dictionary pages: Dictionary-encoded values encrypted
- ❌ Footer metadata: Schema, row counts, column names remain plaintext
Performance
Optimizations
- Encryption context created once per column chunk (not per page)
- In-place decryption using decrypt_in_place_detached()
- Scratch buffer reused across all pages in column chunk
- Zero-copy plaintext extraction with split_off()
Platform Support
Pre-built wheels available for:
- macOS: ARM64 (Apple Silicon), x86_64 (Intel)
- Linux: x86_64, ARM64 (aarch64)
- Python: 3.10, 3.11, 3.12+
Requirements
- Python: >= 3.10
- Encryption key: Exactly 32 bytes for AES-256
License
MIT License - see LICENSE file for details.
Building from Source
Pre-built wheels are available on PyPI, but if you need to build from source:
macOS (Current Platform)
./quick-build.sh
Linux (Without Docker)
See BUILD-LINUX.md for complete instructions, or:
# On your Linux machine
./build-linux-native.sh
Quick reference: QUICK-START-LINUX.md
All Platforms
See BUILD.md for comprehensive build documentation.
Acknowledgments
Built on Polars - blazingly fast DataFrames in Rust and Python.
Download files
Built Distributions
File details
Details for the file polars_parquet_encrypt-0.2.0-cp310-abi3-manylinux_2_39_x86_64.whl.
File metadata
- Download URL: polars_parquet_encrypt-0.2.0-cp310-abi3-manylinux_2_39_x86_64.whl
- Upload date:
- Size: 47.4 MB
- Tags: CPython 3.10+, manylinux: glibc 2.39+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ccf983b7359249afb539ae0201d96f818f294f5e74157095d27834803e52ba65 |
| MD5 | 32049983492bf284a3e173f894ac7ea7 |
| BLAKE2b-256 | 8961e43618c2788d285a45512002f5269adb6e00a1b2ce3e99bc85d2dd01b92c |
File details
Details for the file polars_parquet_encrypt-0.2.0-cp310-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: polars_parquet_encrypt-0.2.0-cp310-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 43.0 MB
- Tags: CPython 3.10+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 99eed5700fb2542cbf1320a70c8bf99531f383173224878688b69a56af4f7bfe |
| MD5 | d41ca7cfb216fefa6cb33f6bc4bdbef6 |
| BLAKE2b-256 | d5a88818ae24af0aaed181ab36531185c13bc752eadca0c00dce2e8859ad147d |
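If you download a wheel manually, the SHA256 digests listed above can be checked before installing. The following is a minimal sketch using only the Python standard library. The helper names sha256_of_file and verify_wheel are hypothetical and not shipped with this package.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path: str, expected_sha256: str) -> bool:
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

For example, verify_wheel("polars_parquet_encrypt-0.2.0-cp310-abi3-manylinux_2_39_x86_64.whl", "ccf983b7359249afb539ae0201d96f818f294f5e74157095d27834803e52ba65") should return True for an intact download of the Linux wheel.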