High-performance Delta Lake query engine for Python

deltaFusion

High-performance Delta Lake query engine for Python, powered by Rust.

License: MIT

Overview

deltaFusion runs SQL queries over Delta Lake tables, using DataFusion as the query engine. Query results are returned to Python as Apache Arrow record batches, which avoids copying and serialization overhead.

Key Features

  • SQL Queries: Query Delta tables with SQL, executed by DataFusion
  • Zero-Copy Transfer: Arrow-based data exchange with Python (no serialization overhead)
  • Time Travel: Query specific table versions
  • Storage Support: Local filesystem and S3-compatible storage
  • GIL Release: The GIL is released during Rust operations, so other Python threads keep running
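
Because the GIL is released while Rust does the work, independent queries can overlap on ordinary Python threads. The following is a minimal sketch rather than part of the library: `run_queries_concurrently` is a hypothetical helper, and it assumes that one engine can be shared safely across threads, which the feature list does not state explicitly.

```python
from concurrent.futures import ThreadPoolExecutor


def run_queries_concurrently(engine, queries, max_workers=4):
    """Run several SQL strings on worker threads; results keep input order.

    `engine` is any object with a `query_to_dicts(sql)` method, such as a
    DeltaEngine. Sharing one engine across threads is an assumption here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `queries`.
        return list(pool.map(engine.query_to_dicts, queries))
```

Passing a `DeltaEngine` with registered tables and a list of SQL strings returns one list of row dicts per query. If sharing an engine turns out not to be safe, creating one engine per thread is the conservative alternative.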

Installation

# From source (requires Rust toolchain and maturin)
pip install maturin
maturin develop --release

Quick Start

from delta_fusion import DeltaEngine

# Create engine
engine = DeltaEngine()

# Register a Delta table
engine.register_table("sales", "/path/to/delta/table")

# Query with SQL (returns PyArrow RecordBatches)
batches = engine.query("SELECT * FROM sales WHERE year = 2024")

# Convert to pandas
import pyarrow as pa
table = pa.Table.from_batches(batches)
df = table.to_pandas()

# Or get results as list of dicts (for small datasets)
rows = engine.query_to_dicts("SELECT * FROM sales LIMIT 10")
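
Queries are passed as plain SQL strings, so any value interpolated into them should be validated first. The sketch below shows one way to do that for the `year` filter used above; `rows_for_year` is a hypothetical helper, not part of delta_fusion, and it only relies on the `query_to_dicts` method shown in this section.

```python
def rows_for_year(engine, table, year, limit=10):
    """Return up to `limit` rows of `table` where year == `year`, as dicts."""
    # The SQL is sent as a plain string, so check every interpolated value.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    year = int(year)    # raises ValueError for non-numeric strings
    limit = int(limit)
    sql = f"SELECT * FROM {table} WHERE year = {year} LIMIT {limit}"
    return engine.query_to_dicts(sql)
```

With the engine from the Quick Start, `rows_for_year(engine, "sales", 2024)` issues the same kind of query as the examples above while rejecting table names and values that are not simple identifiers or integers.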

S3 Storage

from delta_fusion import DeltaEngine

# Option 1: Set credentials in environment variables
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION

# Option 2: Explicit configuration
engine = DeltaEngine({
    "aws_access_key_id": "...",
    "aws_secret_access_key": "...",
    "aws_region": "us-east-1",
    "aws_endpoint": "http://localhost:9000",  # For MinIO
    "aws_allow_http": True,  # Needed for plain-HTTP endpoints
})

engine.register_table("data", "s3://bucket/delta-table")
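
The explicit configuration can also be assembled from the environment, which keeps secrets out of source code while still allowing a custom endpoint. This is a sketch under assumptions: `s3_config_from_env` is a hypothetical helper, the `us-east-1` fallback region is an arbitrary default chosen for illustration, and the dict keys are the ones shown in Option 2.

```python
import os


def s3_config_from_env(environ=None, endpoint=None):
    """Build a DeltaEngine-style config dict from AWS_* environment variables."""
    env = os.environ if environ is None else environ
    config = {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        # Fallback region is an assumption made for this sketch.
        "aws_region": env.get("AWS_REGION", "us-east-1"),
    }
    if endpoint is not None:
        # Custom endpoint, for example a local MinIO server.
        config["aws_endpoint"] = endpoint
        # Only opt in to plain HTTP when the endpoint actually uses it.
        config["aws_allow_http"] = endpoint.startswith("http://")
    return config
```

The result can be passed straight to the constructor, for example `DeltaEngine(s3_config_from_env(endpoint="http://localhost:9000"))`. A missing access key or secret raises `KeyError` instead of silently producing an incomplete configuration.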

Time Travel

from delta_fusion import DeltaEngine

engine = DeltaEngine()

# Register the table as of a specific version
engine.register_table("sales", "/path/to/table", version=5)

# Get table metadata
info = engine.table_info("/path/to/table")
print(f"Version: {info['version']}, Files: {info['num_files']}")
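
A common use of time travel is comparing a table across versions. The sketch below registers each version under its own name and counts rows. It rests on two assumptions that are not confirmed by the text above: that the same path may be registered several times under different names, and that `query_to_dicts` uses the SQL alias (`n`) as the dict key. `row_counts_by_version` and the `snapshot_v…` naming are hypothetical.

```python
def row_counts_by_version(engine, path, versions):
    """Return {version: row count} for the Delta table at `path`."""
    counts = {}
    for version in versions:
        version = int(version)
        name = f"snapshot_v{version}"  # hypothetical naming scheme
        engine.register_table(name, path, version=version)
        rows = engine.query_to_dicts(f"SELECT COUNT(*) AS n FROM {name}")
        # Assumes the alias `n` is used as the key of each row dict.
        counts[version] = rows[0]["n"]
    return counts
```

For example, `row_counts_by_version(engine, "/path/to/table", [4, 5])` would show how many rows version 5 added or removed relative to version 4.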

Development

# Build for development
maturin develop

# Run Rust tests
cargo test --no-default-features

# Run Python tests
pytest

# Check and format
cargo check
cargo clippy
cargo fmt

Architecture

src/
├── lib.rs       # Module entry point
├── config.rs    # Storage configuration
├── error.rs     # Error types
├── engine.rs    # DeltaEngine (DataFusion integration)
└── python.rs    # PyO3 bindings

python/delta_fusion/
├── __init__.py  # Python API
└── __init__.pyi # Type stubs

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

delta_fusion-1.0.0.tar.gz (82.2 kB)

Uploaded Source

Built Distributions

delta_fusion-1.0.0-cp312-cp312-manylinux_2_35_x86_64.whl (26.0 MB)

Uploaded CPython 3.12, manylinux: glibc 2.35+ x86-64

delta_fusion-1.0.0-cp312-cp312-macosx_11_0_arm64.whl (22.9 MB)

Uploaded CPython 3.12, macOS 11.0+ ARM64

delta_fusion-1.0.0-cp311-cp311-manylinux_2_35_x86_64.whl (26.0 MB)

Uploaded CPython 3.11, manylinux: glibc 2.35+ x86-64

delta_fusion-1.0.0-cp311-cp311-macosx_11_0_arm64.whl (22.9 MB)

Uploaded CPython 3.11, macOS 11.0+ ARM64

File details

Details for the file delta_fusion-1.0.0.tar.gz.

File metadata

  • Download URL: delta_fusion-1.0.0.tar.gz
  • Upload date:
  • Size: 82.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for delta_fusion-1.0.0.tar.gz
Algorithm Hash digest
SHA256 19d3792e0e62e01146fd3bcf4b9a5fc2c69171978d360b957c855c1ccf8a42eb
MD5 4218b4c0b3547438df6592d6910a031c
BLAKE2b-256 d494bb471ff3e6bbf7d64d1364a96edc5374e9387dbcb925938adadb1300d5a4

See more details on using hashes here.
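
A downloaded file can be checked against the SHA256 digest listed for it. This is a generic sketch using only the Python standard library; `sha256_of_file` and `matches_sha256` are illustrative helper names, not part of delta_fusion or pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """True if the file's SHA256 digest equals `expected_hex` (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_sha256` with the path of the downloaded archive and the SHA256 value from the list above returns `True` only when the file is intact.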

Provenance

The following attestation bundles were made for delta_fusion-1.0.0.tar.gz:

Publisher: publish.yml on ingkle-oss/deltaFusion

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file delta_fusion-1.0.0-cp312-cp312-manylinux_2_35_x86_64.whl.

File hashes

Hashes for delta_fusion-1.0.0-cp312-cp312-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 fc88ae25d6035bbc4ae3a0afaa36d236808530508cad8fdf3e066e1c0f8758ef
MD5 5cb0e1fdcfe3920aee5ca906ab4d5b87
BLAKE2b-256 2f847774f051253b70a45ebd79fdac7caaae996fedef5bfb0afa8cf9e5ab07a5

Provenance

The following attestation bundles were made for delta_fusion-1.0.0-cp312-cp312-manylinux_2_35_x86_64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-1.0.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for delta_fusion-1.0.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2e4e3fa2d03ec9a41d8278b2bd32f7c14697f2bc4a2150e39225d648c2ba0ddf
MD5 516b25e32c258d792ebbc70702c09b81
BLAKE2b-256 84f7e1ca529f93a8138e52d26c4e5d26694d44b2da963a28b5c1446731ddfcf6

Provenance

The following attestation bundles were made for delta_fusion-1.0.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-1.0.0-cp311-cp311-manylinux_2_35_x86_64.whl.

File hashes

Hashes for delta_fusion-1.0.0-cp311-cp311-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 8028d581e06716bb179874c670e4b839a28933670104d7d19a40a1a6d5034869
MD5 489c11e23e77429d378099201e5da70f
BLAKE2b-256 2889810ad95256d13f1a60a8dc91ea4c291365f8fa358d0952d5397a67bb3aa0

Provenance

The following attestation bundles were made for delta_fusion-1.0.0-cp311-cp311-manylinux_2_35_x86_64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-1.0.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for delta_fusion-1.0.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e86fa8d404356d4366fce4046bb433bc4acffde6eda004927ffcb5803432488f
MD5 f83a92941a5ebd9282688ce1de8d3271
BLAKE2b-256 a25a5cb443f72f182a5b4258f0cebdbe1c5d7e0bc5d064e359ede170f410463a

Provenance

The following attestation bundles were made for delta_fusion-1.0.0-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion
