Native Delta Lake Python binding based on delta-rs with Pandas integration

Hops-deltalake-python

A HopsFS-supported version of the native Delta Lake Python binding, based on delta-rs, with Pandas integration.

Example

from deltalake import DeltaTable

dt = DeltaTable("../rust/tests/data/delta-0.2.0")
dt.version()
# 3
dt.file_uris()
# ['s3://bucket/table/part-00000-cb6b150b-30b8-4662-ad28-ff32ddab96d2-c000.snappy.parquet',
#  's3://bucket/table/part-00000-7c2deba3-1994-4fb8-bc07-d46c948aa415-c000.snappy.parquet',
#  's3://bucket/table/part-00001-c373a5bd-85f0-4758-815e-7eb62007a15c-c000.snappy.parquet']

See the user guide for more examples.

Compatibility Matrix

| hops-deltalake | hops-object-store | HopsFS |
|---|---|---|
| 1.1.2-post2 | 1.0.3 | 3.2.0.18 |
| 1.4.0 | 1.1.0 | 3.2.0.18-EE-RC1 |
| 1.4.0-post1 | 1.1.1 | >= 3.2.0.18-EE-RC1 < 3.4.4.0-EE-RC0 |
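
To check an environment against the matrix from a script, a small lookup can help. The following is a minimal sketch, not part of the package: the names `COMPATIBILITY`, `normalize`, `lookup`, and `installed_version` are hypothetical, and the rows are copied by hand from the matrix, so they need updating whenever the matrix changes. It assumes the distribution is installed under the name `hops-deltalake`.

```python
from importlib import metadata

# Rows copied from the compatibility matrix:
# hops-deltalake -> (hops-object-store, HopsFS)
COMPATIBILITY = {
    "1.1.2-post2": ("1.0.3", "3.2.0.18"),
    "1.4.0": ("1.1.0", "3.2.0.18-EE-RC1"),
    "1.4.0-post1": ("1.1.1", ">= 3.2.0.18-EE-RC1 < 3.4.4.0-EE-RC0"),
}


def normalize(version):
    """Map the normalized spelling (1.4.0.post1) to the matrix spelling (1.4.0-post1)."""
    return version.replace(".post", "-post")


def lookup(version):
    """Return (hops-object-store, HopsFS) for a hops-deltalake version, or None."""
    return COMPATIBILITY.get(normalize(version))


def installed_version(dist="hops-deltalake"):
    """Return the installed version of `dist`, or None if it is not installed."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("hops-deltalake is not installed")
    else:
        print(version, lookup(version))
```

A `None` result from `lookup` means the installed version is not listed in the matrix, not that it is incompatible.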

Installation

# with pip
pip install hops-deltalake

NOTE: official binary wheels are statically linked against OpenSSL for communication with remote object stores. Please file a GitHub issue to request a critical OpenSSL upgrade.

Tracing and Observability

Delta-rs supports OpenTelemetry tracing for performance analysis and debugging.

Basic Example

import os

import pandas as pd

import deltalake
from deltalake import write_deltalake, DeltaTable

# Enable logging to see trace output in stdout
os.environ["RUST_LOG"] = "deltalake=debug"

# Initialize tracing (uses default HTTP endpoint or OTEL_EXPORTER_OTLP_ENDPOINT env var)
# For authentication, set OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=your-api-key"
# The HTTP exporter automatically reads OTEL_EXPORTER_OTLP_HEADERS for API keys
deltalake.init_tracing()

# Sample input for illustration; substitute your own data
data = pd.DataFrame({"id": [1, 2, 3]})

# All Delta operations are now traced
write_deltalake("my_table", data)
dt = DeltaTable("my_table")
df = dt.to_pandas()

When you run this code, you'll see trace information in stdout showing operation timings and execution flow.
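
If you prefer to set the exporter variables from Python rather than from the shell, something like the following sketch may work. The helper `configure_otlp` is hypothetical, and the endpoint and API key are assumed placeholder values; the exact endpoint your collector expects may differ. The comma-separated `key=value` header format follows the OpenTelemetry environment variable convention.

```python
import os


def configure_otlp(endpoint, headers=None):
    """Set the OTLP exporter variables; call this before init_tracing()."""
    os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = endpoint
    if headers:
        # Headers are passed as comma-separated key=value pairs
        os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = ",".join(
            f"{key}={value}" for key, value in headers.items()
        )


# Assumed values: a collector on localhost and a placeholder API key
configure_otlp("http://localhost:4318", {"x-honeycomb-team": "your-api-key"})

# With the variables in place, tracing is initialized as in the basic example:
# import deltalake
# deltalake.init_tracing()
```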

Build custom wheels

Sometimes you may wish to build custom wheels. Maybe you want to try out some unreleased features. Or maybe you want to tweak the optimization of the Rust code.

To compile the package, you will need the Rust compiler and maturin. Install the Rust toolchain with rustup:

curl https://sh.rustup.rs -sSf | sh -s

Then you can build wheels for your own platform like so:

```sh
uvx maturin build --release --out wheels
```

Note:

  • uvx invokes a tool without installing it.
  • If you plan to use maturin often, you can install it as a tool with uv tool install maturin.

For a build that is optimized for the system you are on (but sacrificing portability):

RUSTFLAGS="-C target-cpu=native" uvx maturin build --release --out wheels

Cross compilation

The above command only works for your current platform. To create wheels for other platforms, you'll need to cross compile. Cross compilation requires installing two additional components: to cross compile Rust code, you will need to install the target with rustup; to cross compile the Python bindings, you will need to install ziglang.

The following example is for manylinux2014. Other targets will require a different Rust target and a different Python compatibility tag.

rustup target add x86_64-unknown-linux-gnu

Then you can build wheels for the target platform like so:

uvx --from 'maturin[zig]' maturin build --release --zig \
    --target x86_64-unknown-linux-gnu \
    --compatibility manylinux2014 \
    --out wheels
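
If you build for several targets, it can be convenient to assemble the command in a script. The sketch below only constructs the argument list and environment; the helpers `maturin_build_command` and `build_env` are hypothetical names, and the flags are the ones used in the commands in this section. Nothing is executed until you pass the result to `subprocess.run`.

```python
import os
import shlex


def maturin_build_command(target=None, compatibility=None, out="wheels"):
    """Assemble the uvx/maturin argument list; nothing is executed here."""
    cross = target is not None
    cmd = ["uvx"]
    if cross:
        # Cross compilation uses maturin's zig extra
        cmd += ["--from", "maturin[zig]"]
    cmd += ["maturin", "build", "--release"]
    if cross:
        cmd += ["--zig", "--target", target]
    if compatibility:
        cmd += ["--compatibility", compatibility]
    cmd += ["--out", out]
    return cmd


def build_env(target_cpu=None):
    """Copy the current environment, optionally setting RUSTFLAGS for a CPU target."""
    env = dict(os.environ)
    if target_cpu:
        env["RUSTFLAGS"] = f"-C target-cpu={target_cpu}"
    return env


cmd = maturin_build_command("x86_64-unknown-linux-gnu", "manylinux2014")
print(shlex.join(cmd))
# To run the build: subprocess.run(cmd, env=build_env(), check=True)
```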

If you expect to run only on more modern systems, you can pass a newer target-cpu flag to Rust and use a newer compatibility tag for Linux. For example, here we target CPUs from the Haswell generation (2013) or newer and Linux systems with glibc 2.24 or later:

RUSTFLAGS="-C target-cpu=haswell" uvx --from 'maturin[zig]' maturin build --release --zig \
    --target x86_64-unknown-linux-gnu \
    --compatibility manylinux_2_24 \
    --out wheels

See the note about RUSTFLAGS in the arrow-rs README.
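
The Python and platform compatibility tags of a built wheel are encoded in its file name, so you can check what a build produced before distributing it. The following is a minimal sketch that splits a wheel file name into its components according to the wheel naming convention; `parse_wheel_name` is a hypothetical helper, and the file name is used only as an example.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into name, version, optional build, and tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_name(
    "hops_deltalake-1.4.0.post1-cp310-abi3-manylinux_2_28_x86_64.whl"
)
print(info["python"], info["abi"], info["platform"])
```

For production use, the `packaging` library offers a maintained parser for wheel file names; the hand-rolled version here avoids the extra dependency.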
