
Provides a pyiceberg.io.FileIO implementation for hdfs-native client

pyiceberg-hdfs-native

Provides a pyiceberg.io.FileIO implementation that uses hdfs-native client.

How to use

Install with uv:

uv tool install --with pyiceberg-hdfs-native pyiceberg

Configure pyiceberg via ~/.pyiceberg.yaml:

catalog:
  default:
    uri: https://iceberg.example.com/
    py-io-impl: pyiceberg_hdfs_native.HdfsFileIO
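
The same properties can also be passed in code rather than through the YAML file. The following is a minimal sketch, assuming pyiceberg's load_catalog accepts catalog properties as keyword arguments and reusing the placeholder URI from the example above. The helper names catalog_properties and open_catalog are hypothetical and not part of this package.

```python
def catalog_properties(uri="https://iceberg.example.com/"):
    """Return the catalog properties from the YAML example as a dict."""
    return {
        "uri": uri,
        # Dotted path to the FileIO class provided by pyiceberg-hdfs-native.
        "py-io-impl": "pyiceberg_hdfs_native.HdfsFileIO",
    }


def open_catalog(name="default", **overrides):
    """Load a catalog, passing the properties explicitly."""
    # Imported lazily so that building the properties needs no dependencies.
    from pyiceberg.catalog import load_catalog

    props = {**catalog_properties(), **overrides}
    # Keys such as "py-io-impl" are not valid identifiers, so they are
    # passed by unpacking a dict.
    return load_catalog(name, **props)
```

Calling open_catalog() requires pyiceberg to be installed and the catalog to be reachable; catalog_properties() on its own only builds the dictionary.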

Configure hdfs-native by pointing it at the Hadoop configuration directory:

export HADOOP_CONF_DIR=/opt/hadoop/conf
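
If the client cannot find the cluster, it may help to check the configuration directory first. The sketch below assumes that hdfs-native looks for core-site.xml and hdfs-site.xml in HADOOP_CONF_DIR, as its documentation describes. The function name missing_hadoop_config is hypothetical, and a missing file is not necessarily fatal for every setup.

```python
import os
from pathlib import Path

# Files that hdfs-native is assumed to read from the configuration directory.
EXPECTED_FILES = ("core-site.xml", "hdfs-site.xml")


def missing_hadoop_config(env=None):
    """Return a list of problems with HADOOP_CONF_DIR; empty means it looks usable."""
    env = os.environ if env is None else env
    conf_dir = env.get("HADOOP_CONF_DIR")
    if not conf_dir:
        return ["HADOOP_CONF_DIR is not set"]
    base = Path(conf_dir)
    if not base.is_dir():
        return [f"{conf_dir} is not a directory"]
    # Report each expected file that is absent from the directory.
    return [
        f"{name} not found in {conf_dir}"
        for name in EXPECTED_FILES
        if not (base / name).is_file()
    ]
```

Calling missing_hadoop_config() with no arguments inspects the current environment; an empty list means both files were found.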

If the cluster uses Kerberos, run kinit first to obtain a ticket.

The pyiceberg files command should now work:

pyiceberg files db.table

Read an Iceberg table with Polars

Start a Python session with the required packages:

uv run --with polars --with pyarrow --with pyiceberg-hdfs-native python

Then, in that session:

from pyiceberg.catalog import load_catalog
import polars as pl

def read_table(table_name):
    catalog = load_catalog(name='default')  # will read config from ~/.pyiceberg.yaml
    table = catalog.load_table(table_name)
    metadata_location = table.metadata_location
    storage_options = {'py-io-impl': 'pyiceberg_hdfs_native.HdfsFileIO'}
    return pl.scan_iceberg(metadata_location, storage_options=storage_options, reader_override='pyiceberg')

read_table('db.tbl').head().collect()
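
For comparison, the table can presumably also be read through pyiceberg's own scan API without Polars. This is a sketch, assuming the catalog named default is configured in ~/.pyiceberg.yaml with py-io-impl set as shown earlier, so that pyiceberg picks up the HDFS FileIO on its own. The function name read_arrow is hypothetical.

```python
def read_arrow(table_name, limit=10):
    """Read up to `limit` rows of an Iceberg table into a pyarrow Table."""
    # Imported lazily; requires pyiceberg and pyarrow to be installed.
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("default")  # reads ~/.pyiceberg.yaml
    table = catalog.load_table(table_name)
    # The scan reads data files through the FileIO configured for the catalog.
    return table.scan(limit=limit).to_arrow()
```

For example, read_arrow('db.tbl') would return the first rows as a pyarrow Table, given a reachable catalog and HDFS cluster.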
