deltaFusion

High-performance Delta Lake query engine for Python, powered by Rust.

License: MIT

Overview

deltaFusion runs SQL queries over Delta Lake tables, using DataFusion as the query engine. Results are handed to Python as Apache Arrow record batches, so no copy or serialization step sits between the Rust engine and Python.

Key Features

  • SQL Queries: Full SQL support via DataFusion
  • Zero-Copy Transfer: Arrow-based data exchange with Python (no serialization overhead)
  • Time Travel: Query specific table versions
  • Storage Support: Local filesystem and S3-compatible storage
  • GIL Release: The GIL is released during Rust operations, so other Python threads keep running
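
Because the GIL is released while Rust executes a query, several queries can make progress at once from ordinary Python threads. The sketch below is one way to fan queries out with the standard library. `run_queries` is a hypothetical helper, not part of delta_fusion, and it assumes a single engine object can safely be shared across threads, which the project does not state.

```python
from concurrent.futures import ThreadPoolExecutor


def run_queries(engine, queries, max_workers=4):
    """Run each SQL string through engine.query on a thread pool.

    `engine` is assumed to be a DeltaEngine (or any object with a
    query(sql) method). Results are returned in the order of `queries`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order; a worker's exception is re-raised
        # when its result is read.
        return list(pool.map(engine.query, queries))
```

With a real engine this would be called as `run_queries(engine, ["SELECT ...", "SELECT ..."])`. If sharing one engine across threads turns out not to be supported, creating one engine per thread is the safer variant.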

Installation

# From source (requires Rust toolchain and maturin)
pip install maturin
maturin develop --release
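
After `maturin develop` finishes, a quick way to confirm that the build landed in the active environment is to check that the module can be found. This is a small sketch using only the standard library; `is_importable` is a hypothetical helper name, and `delta_fusion` is the import name used in the Quick Start.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module named `module_name` can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("delta_fusion importable:", is_importable("delta_fusion"))
```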

Quick Start

import pyarrow as pa
from delta_fusion import DeltaEngine

# Create engine
engine = DeltaEngine()

# Register a Delta table
engine.register_table("sales", "/path/to/delta/table")

# Query with SQL (returns PyArrow RecordBatches)
batches = engine.query("SELECT * FROM sales WHERE year = 2024")

# Convert to pandas (requires pandas; from_batches needs at least one
# batch unless a schema is passed explicitly)
table = pa.Table.from_batches(batches)
df = table.to_pandas()

# Or get results as list of dicts (for small datasets)
rows = engine.query_to_dicts("SELECT * FROM sales LIMIT 10")
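
The `query_to_dicts` call above is handy for a quick look at a table. A small wrapper can make that a one-liner while guarding against malformed table names, since identifiers cannot be passed as bound parameters. This is a minimal sketch: `preview` is a hypothetical helper, and it relies only on the `query_to_dicts(sql)` method shown in the Quick Start.

```python
def preview(engine, table_name, limit=10):
    """Return up to `limit` rows of `table_name` as a list of dicts."""
    # The table name is formatted into the SQL text, so accept only
    # letters, digits and underscores.
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    if not isinstance(limit, int) or limit < 0:
        raise ValueError("limit must be a non-negative integer")
    return engine.query_to_dicts(f"SELECT * FROM {table_name} LIMIT {limit}")
```

For the table registered above, `preview(engine, "sales", 5)` would issue `SELECT * FROM sales LIMIT 5`.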

S3 Storage

# Option 1: Environment variables
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION

# Option 2: Explicit configuration
engine = DeltaEngine({
    "aws_access_key_id": "...",
    "aws_secret_access_key": "...",
    "aws_region": "us-east-1",
    "aws_endpoint": "http://localhost:9000",  # For MinIO
    "aws_allow_http": True
})

engine.register_table("data", "s3://bucket/delta-table")
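
The two options can be combined by reading the credentials from the environment and passing them explicitly, which keeps secrets out of source code while still allowing a custom endpoint. The sketch below builds a dict with the same keys as the explicit configuration above. `s3_config` is a hypothetical helper, and whether `DeltaEngine` accepts a dict that omits `aws_endpoint` and `aws_allow_http` is an assumption.

```python
import os


def s3_config(env=None, endpoint=None):
    """Build a config dict using the keys from the explicit-configuration example.

    Raises KeyError if a required variable is missing from `env`.
    """
    env = os.environ if env is None else env
    config = {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "aws_region": env["AWS_REGION"],
    }
    if endpoint is not None:
        # Custom endpoint, for example a local MinIO server.
        config["aws_endpoint"] = endpoint
        # Plain-HTTP endpoints need the allow-http flag set.
        config["aws_allow_http"] = endpoint.startswith("http://")
    return config
```

The result would then be passed as `DeltaEngine(s3_config(endpoint="http://localhost:9000"))`.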

Time Travel

# Query a specific version
engine.register_table("sales", "/path/to/table", version=5)

# Get table metadata
info = engine.table_info("/path/to/table")
print(f"Version: {info['version']}, Files: {info['num_files']}")
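
Time travel is most useful when two versions can be queried side by side. One way to do that is to register the same path several times under version-specific names. The helper below is a sketch: `register_versions` is a hypothetical name, it uses only the `register_table(name, path, version=...)` call shown above, and it assumes the engine allows the same path to be registered under more than one name.

```python
def register_versions(engine, name, path, versions):
    """Register `path` once per version under names like `sales_v5`.

    Returns a dict mapping each version number to its registered name.
    """
    registered = {}
    for version in versions:
        alias = f"{name}_v{version}"
        engine.register_table(alias, path, version=version)
        registered[version] = alias
    return registered
```

After `register_versions(engine, "sales", "/path/to/table", [4, 5])`, a query such as `SELECT * FROM sales_v5 EXCEPT SELECT * FROM sales_v4` would list rows present in version 5 but not in version 4.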

Development

# Build for development
maturin develop

# Run Rust tests
cargo test --no-default-features

# Run Python tests
pytest

# Check and format
cargo check
cargo clippy
cargo fmt

Architecture

src/
├── lib.rs       # Module entry point
├── config.rs    # Storage configuration
├── error.rs     # Error types
├── engine.rs    # DeltaEngine (DataFusion integration)
└── python.rs    # PyO3 bindings

python/delta_fusion/
├── __init__.py  # Python API
└── __init__.pyi # Type stubs

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

Source Distribution

  • delta_fusion-0.1.0.tar.gz (73.3 kB): source

Built Distributions

  • delta_fusion-0.1.0-cp312-cp312-manylinux_2_35_x86_64.whl (25.9 MB): CPython 3.12, manylinux (glibc 2.35+), x86-64
  • delta_fusion-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (22.8 MB): CPython 3.12, macOS 11.0+, ARM64
  • delta_fusion-0.1.0-cp311-cp311-manylinux_2_35_x86_64.whl (25.9 MB): CPython 3.11, manylinux (glibc 2.35+), x86-64
  • delta_fusion-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (22.8 MB): CPython 3.11, macOS 11.0+, ARM64

File details

Details for the file delta_fusion-0.1.0.tar.gz.

File metadata

  • Download URL: delta_fusion-0.1.0.tar.gz
  • Upload date:
  • Size: 73.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for delta_fusion-0.1.0.tar.gz
Algorithm Hash digest
SHA256 35d8d9f9ed5ff442179b51f159891c52cfeb08b3cbd143830d533f646cb3495d
MD5 2485e72e0b9cf2d36982a8b49060ab49
BLAKE2b-256 9e16b2a46c86ed201a78c0a023bb4c18637b8451ad9bf8311db8bb62ee02785f
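
A downloaded file can be checked against the SHA256 digest listed above before it is installed. The sketch below uses only the standard library; `sha256_of` and `matches` are hypothetical helper names, and the file path passed in is whatever location the download was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches("delta_fusion-0.1.0.tar.gz", "35d8d9f9ed5ff442179b51f159891c52cfeb08b3cbd143830d533f646cb3495d")` should return True for an intact download.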

Provenance

The following attestation bundles were made for delta_fusion-0.1.0.tar.gz:

Publisher: publish.yml on ingkle-oss/deltaFusion

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file delta_fusion-0.1.0-cp312-cp312-manylinux_2_35_x86_64.whl.

File hashes

Hashes for delta_fusion-0.1.0-cp312-cp312-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 5ac61dbd281f15ac5b7d7717074b95a4a9f449cb4e7b00ca69136ef31dccee56
MD5 4a7c76c30956187374c80b80147bf7d5
BLAKE2b-256 fca74438c9e0116329f64c6ed138bd3393afee4e64c5aba8b4b40b0b5369209c

Provenance

The following attestation bundles were made for delta_fusion-0.1.0-cp312-cp312-manylinux_2_35_x86_64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for delta_fusion-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0d6382b8c1953a8895d8626bd83eb0056b7becff66286f80b9cb2c753e9bdcf5
MD5 b1ff3b00d375d5eda34f859120ef1548
BLAKE2b-256 ce1a16fb59a8a08fa82355cb4a5092ce778148dd64c3412ba964ac259ffe30ae

Provenance

The following attestation bundles were made for delta_fusion-0.1.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-0.1.0-cp311-cp311-manylinux_2_35_x86_64.whl.

File hashes

Hashes for delta_fusion-0.1.0-cp311-cp311-manylinux_2_35_x86_64.whl
Algorithm Hash digest
SHA256 c92fbf9211f6bef7a2cbd708a110048b7cc1671106066aae92c77f90017d800d
MD5 8fc536767447eb810207bb270e93d553
BLAKE2b-256 81a0d6b1754098eb56ca1be7e5b7a06272c26a90907642c7e540e953390db44a

Provenance

The following attestation bundles were made for delta_fusion-0.1.0-cp311-cp311-manylinux_2_35_x86_64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion

File details

Details for the file delta_fusion-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for delta_fusion-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9065f43b66f4b6575df001b3e0ff8781d0d41da8ebebd89ebe117f8eb2406dc6
MD5 2283893e7e6d87cd13fb8d67e71ef5ee
BLAKE2b-256 79c4f1d28f4c2e1ca7ebda2fd190d394e69659f74f960d0fa13632ec1fd90f39

Provenance

The following attestation bundles were made for delta_fusion-0.1.0-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: publish.yml on ingkle-oss/deltaFusion
