deltaFusion

High-performance Delta Lake query engine for Python, powered by Rust.

Overview

deltaFusion runs SQL queries over Delta Lake tables, using DataFusion as the query engine. Results are handed to Python as Apache Arrow data without copying, which avoids serialization overhead.

Key Features

  • SQL Queries: Full SQL support via DataFusion
  • Zero-Copy Transfer: Arrow-based data exchange with Python (no serialization overhead)
  • Time Travel: Query specific table versions
  • Storage Support: Local filesystem and S3-compatible storage
  • GIL Release: The GIL is released during Rust operations, so other Python threads keep running
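
Because the GIL is released while Rust code runs, independent queries can in principle overlap when issued from separate Python threads. The sketch below is only an illustration and is not part of deltaFusion: `run_queries` is a hypothetical helper, and it assumes that one engine object may safely be shared across threads, which this page does not confirm. It relies only on the `query` method shown in Quick Start.

```python
from concurrent.futures import ThreadPoolExecutor


def run_queries(engine, queries, max_workers=4):
    """Run several SQL statements concurrently and return {label: result}.

    `engine` is any object with a `query(sql)` method, e.g. a DeltaEngine.
    Sharing one engine across threads is an assumption of this sketch.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every query first so they can overlap, then collect results.
        futures = {
            label: pool.submit(engine.query, sql)
            for label, sql in queries.items()
        }
        return {label: future.result() for label, future in futures.items()}
```

With a real engine this might be called as `run_queries(engine, {"y2023": "SELECT * FROM sales WHERE year = 2023", "y2024": "SELECT * FROM sales WHERE year = 2024"})`. Whether it is actually faster than running the queries one after another depends on the workload.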

Installation

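# From PyPI (prebuilt wheels: CPython 3.11 and 3.12, Linux x86-64 and macOS ARM64)
pip install delta-fusion

# From source (requires Rust toolchain and maturin)
# Run from a clone of the repository with a virtual environment active
pip install maturin
maturin develop --release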
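
After either install route, a quick check that the module is visible to the current interpreter can save some confusion, for example when the build went into a different virtual environment. The helper below is a hypothetical convenience, not part of deltaFusion; it uses only the standard library and the `delta_fusion` import name from Quick Start.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "delta_fusion" is the import name used in Quick Start.
    print("delta_fusion importable:", is_importable("delta_fusion"))
```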

Quick Start

from delta_fusion import DeltaEngine

# Create engine
engine = DeltaEngine()

# Register a Delta table
engine.register_table("sales", "/path/to/delta/table")

# Query with SQL (returns PyArrow RecordBatches)
batches = engine.query("SELECT * FROM sales WHERE year = 2024")

# Convert to pandas
import pyarrow as pa
table = pa.Table.from_batches(batches)
df = table.to_pandas()

# Or get results as list of dicts (for small datasets)
rows = engine.query_to_dicts("SELECT * FROM sales LIMIT 10")

S3 Storage

# Option 1: Environment variables
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION

# Option 2: Explicit configuration
engine = DeltaEngine({
    "aws_access_key_id": "...",
    "aws_secret_access_key": "...",
    "aws_region": "us-east-1",
    "aws_endpoint": "http://localhost:9000",  # For MinIO
    "aws_allow_http": True
})

engine.register_table("data", "s3://bucket/delta-table")
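
If you prefer explicit configuration but keep credentials in the environment, the two options can be combined. The following is a minimal sketch under stated assumptions: `s3_config_from_env` is a hypothetical helper, not part of deltaFusion, and it only reuses the environment variable names and config keys listed above. It fails early when a variable is missing instead of passing an incomplete dict to the engine.

```python
import os


def s3_config_from_env(env=None, endpoint=None, allow_http=False):
    """Build a DeltaEngine-style config dict from AWS environment variables."""
    env = os.environ if env is None else env
    # Environment variable -> config key, using the names listed above.
    mapping = {
        "AWS_ACCESS_KEY_ID": "aws_access_key_id",
        "AWS_SECRET_ACCESS_KEY": "aws_secret_access_key",
        "AWS_REGION": "aws_region",
    }
    missing = [var for var in mapping if not env.get(var)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    config = {key: env[var] for var, key in mapping.items()}
    if endpoint is not None:
        # Custom endpoint, e.g. a local MinIO server.
        config["aws_endpoint"] = endpoint
    if allow_http:
        config["aws_allow_http"] = True
    return config
```

The result could then be passed along as `DeltaEngine(s3_config_from_env(endpoint="http://localhost:9000", allow_http=True))`.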

Time Travel

# Query a specific version
engine.register_table("sales", "/path/to/table", version=5)

# Get table metadata
info = engine.table_info("/path/to/table")
print(f"Version: {info['version']}, Files: {info['num_files']}")
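
One use of time travel is comparing snapshots of the same table. The sketch below registers each requested version under its own name and counts its rows. `count_rows_by_version` is a hypothetical helper, not part of deltaFusion. It assumes that one path can be registered several times under different names, and that `query_to_dicts` returns a list of dicts keyed by column alias; neither is confirmed on this page.

```python
def count_rows_by_version(engine, path, versions, base_name="snapshot"):
    """Return {version: row count} for several versions of one Delta table."""
    counts = {}
    for version in versions:
        # Register each version under its own name so the names do not clash.
        name = f"{base_name}_v{version}"
        engine.register_table(name, path, version=version)
        rows = engine.query_to_dicts(f"SELECT COUNT(*) AS n FROM {name}")
        counts[version] = rows[0]["n"]
    return counts
```

A call such as `count_rows_by_version(engine, "/path/to/table", [4, 5])` would show how many rows the table gained or lost between the two versions.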

Development

# Build for development
maturin develop

# Run Rust tests
cargo test --no-default-features

# Run Python tests
pytest

# Check and format
cargo check
cargo clippy
cargo fmt

Architecture

src/
├── lib.rs       # Module entry point
├── config.rs    # Storage configuration
├── error.rs     # Error types
├── engine.rs    # DeltaEngine (DataFusion integration)
└── python.rs    # PyO3 bindings

python/delta_fusion/
├── __init__.py  # Python API
└── __init__.pyi # Type stubs

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Download the file for your platform.

Source Distribution

  • delta_fusion-1.0.3.tar.gz (88.8 kB): source

Built Distributions

  • delta_fusion-1.0.3-cp312-cp312-manylinux_2_39_x86_64.whl (35.1 MB): CPython 3.12, manylinux (glibc 2.39+), x86-64
  • delta_fusion-1.0.3-cp312-cp312-macosx_11_0_arm64.whl (29.1 MB): CPython 3.12, macOS 11.0+, ARM64
  • delta_fusion-1.0.3-cp311-cp311-manylinux_2_39_x86_64.whl (35.1 MB): CPython 3.11, manylinux (glibc 2.39+), x86-64
  • delta_fusion-1.0.3-cp311-cp311-macosx_11_0_arm64.whl (29.1 MB): CPython 3.11, macOS 11.0+, ARM64

File details

Details for the file delta_fusion-1.0.3.tar.gz.

File metadata

  • Download URL: delta_fusion-1.0.3.tar.gz
  • Upload date:
  • Size: 88.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for delta_fusion-1.0.3.tar.gz

Algorithm    Hash digest
SHA256       4595eb557c00e76938b535700a3d9f09e384bf730a512b2c63204576f3e3489a
MD5          e2393c3d5d79a1d5b3089d6a149615ad
BLAKE2b-256  a6ae985bfdbe2c17d2bb9f29c5920c9fb1e813e79838c72f0e591bd18c80b75c

Provenance

The following attestation bundles were made for delta_fusion-1.0.3.tar.gz:

Publisher: publish.yml on ingkle-oss/deltaFusion

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
