Native Python binding for Apache Hudi, based on hudi-rs.

Project description

A native Rust library for Apache Hudi, with bindings to Python

The hudi-rs project aims to broaden the use of Apache Hudi for a diverse range of users and projects.

Source      Installation Command
PyPI        pip install hudi
Crates.io   cargo add hudi

Example usage

Note: These examples expect a Hudi table to exist at /tmp/trips_table, created using the quick start guide.

Python

Read a Hudi table into a PyArrow table.

import pyarrow as pa
import pyarrow.compute as pc
from hudi import HudiTable

# Open the table and read its latest snapshot as Arrow record batches.
hudi_table = HudiTable("/tmp/trips_table")
records = hudi_table.read_snapshot()

# Combine the batches into one PyArrow table, then project and filter.
arrow_table = pa.Table.from_batches(records)
result = arrow_table.select(["rider", "ts", "fare"]).filter(pc.field("fare") > 20.0)
print(result)

Rust

Add the hudi crate with the datafusion feature to your application to query a Hudi table.
cargo new my_project --bin && cd my_project
cargo add tokio@1 datafusion@39
cargo add hudi --features datafusion

Update src/main.rs with the code snippet below, then run cargo run.

use std::sync::Arc;

use datafusion::error::Result;
use datafusion::prelude::{DataFrame, SessionContext};
use hudi::HudiDataSource;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    // Open the Hudi table and register it with DataFusion under a SQL name.
    let hudi = HudiDataSource::new("/tmp/trips_table").await?;
    ctx.register_table("trips_table", Arc::new(hudi))?;
    // Query the registered table with SQL and print the matching rows.
    let df: DataFrame = ctx.sql("SELECT * from trips_table where fare > 20.0").await?;
    df.show().await?;
    Ok(())
}

Work with cloud storage

Set cloud storage credentials as environment variables, e.g., AWS_*, AZURE_*, or GOOGLE_*; the relevant storage variables will then be picked up. The target table's base URI is processed according to its scheme, such as s3://, az://, or gs://.
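
As a rough pre-flight check, the sketch below lists which credential-style environment variables are set for a given base URI before the table is opened. The pairing of schemes with prefixes (s3 with AWS_, az with AZURE_, gs with GOOGLE_) is an assumption inferred from the examples above, and the helper name storage_env_names is hypothetical; neither is part of the hudi API.

```python
import os
from urllib.parse import urlparse

# Assumed pairing of URI schemes with environment variable prefixes,
# inferred from the examples in the text; not an official or complete list.
SCHEME_TO_ENV_PREFIX = {
    "s3": "AWS_",
    "az": "AZURE_",
    "gs": "GOOGLE_",
}


def storage_env_names(base_uri, environ=None):
    """Return the sorted names of set environment variables matching the URI's scheme."""
    environ = os.environ if environ is None else environ
    scheme = urlparse(base_uri).scheme
    prefix = SCHEME_TO_ENV_PREFIX.get(scheme)
    if prefix is None:
        # Local paths and unrecognized schemes have no prefix in this sketch.
        return []
    return sorted(name for name in environ if name.startswith(prefix))
```

If the returned list is empty for a cloud URI, the credentials are probably not exported in the current shell. Once they are, the table can be opened the same way as the local one, for example HudiTable("s3://my-bucket/trips_table"), where my-bucket is a placeholder bucket name.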

Contributing

Check out the contributing guide for all the details about making contributions to the project.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hudi-0.2.0rc1.tar.gz (246.6 kB)

Uploaded Source

Built Distributions

hudi-0.2.0rc1-cp39-abi3-win_amd64.whl (6.0 MB)

Uploaded CPython 3.9+ Windows x86-64

hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.2 MB)

Uploaded CPython 3.9+ manylinux: glibc 2.17+ x86-64

hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.7 MB)

Uploaded CPython 3.9+ manylinux: glibc 2.17+ ARM64

hudi-0.2.0rc1-cp39-abi3-macosx_11_0_arm64.whl (5.4 MB)

Uploaded CPython 3.9+ macOS 11.0+ ARM64

hudi-0.2.0rc1-cp39-abi3-macosx_10_12_x86_64.whl (5.8 MB)

Uploaded CPython 3.9+ macOS 10.12+ x86-64

File details

Details for the file hudi-0.2.0rc1.tar.gz.

File metadata

  • Download URL: hudi-0.2.0rc1.tar.gz
  • Upload date:
  • Size: 246.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.7.4

File hashes

Hashes for hudi-0.2.0rc1.tar.gz
Algorithm Hash digest
SHA256 95e8ea176b6c82bb4af5984d5a63982dd54140d43930451de7f9cc78623445c6
MD5 186f11c96278cba85197984fa14931ef
BLAKE2b-256 958844ba20a8248305214fcb5cfcd60fb589a8aa68a227c564e30f80afa32aa1

See more details on using hashes here.
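
A downloaded file can be checked against its published SHA256 digest with the Python standard library. The following is a minimal sketch; the helper names sha256_of_file and matches_published_digest are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published hex digest."""
    # Normalize whitespace and letter case before comparing.
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading hudi-0.2.0rc1.tar.gz into the current directory, matches_published_digest("hudi-0.2.0rc1.tar.gz", "95e8ea176b6c82bb4af5984d5a63982dd54140d43930451de7f9cc78623445c6") should return True if the file is intact.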

File details

Details for the file hudi-0.2.0rc1-cp39-abi3-win_amd64.whl.

File hashes

Hashes for hudi-0.2.0rc1-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 a9e41add126b151e297042479ea60cdaf0eb3419fc46831914763aa82842bbe4
MD5 e5fadd686c370f870fe9766e022cc828
BLAKE2b-256 8bca24b18708a52ecb34240e7574da16fb831be57bf64a227e03cfc253559874

File details

Details for the file hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 24efcefe7c4445a8ddb77868cc7e0f017c356995ff29cb6030211e4414cced7c
MD5 00e1ba7c97595af1d733e12ef43c0f52
BLAKE2b-256 2a38b691b575682543a0025f5e6327b32c5680116bbd0ec4562d5137d915641d

File details

Details for the file hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for hudi-0.2.0rc1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 da46099cdce296c381487021ae2ecd1591b85c56fb08c85e52cfbf8c619620e4
MD5 061829c60f7cdd802d684dc5ec0bdc1b
BLAKE2b-256 bd7dc7988ba0d6731d7ef7912b12a02227bf672a58c686cdfec4255a9a5a55a7

File details

Details for the file hudi-0.2.0rc1-cp39-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for hudi-0.2.0rc1-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9eb5b8ec0bf034c26a97987ea36d54be8d33cefa57a2853efcea548ee079f5a0
MD5 b8b9201597592325f210468e81c815c9
BLAKE2b-256 29fd2d7cae0116c31479fb91a6d7ecccaff5766a65435cc20945824eda02f16e

File details

Details for the file hudi-0.2.0rc1-cp39-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for hudi-0.2.0rc1-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 14f913f60ba1da134f7efe0bee7d72a0f56e77a85280620e1be5dc8c44fea721
MD5 4cc3df97521d154e20ad0c8b367075ac
BLAKE2b-256 bf4bf33dc84b20fa12994f176256c36fdb2a97f32b64fd8819395595507e9ab1
