Skip to main content

Python binding for delta-rs

Reason this release was yanked:

broken linux build

Project description

Deltalake-python

PyPI

Native Delta Lake binding for Python based on delta-rs.

Installation

pip install deltalake

NOTE: official binary wheels are linked against openssl statically for remote objection store communication. Please file Github issue to request for critical openssl upgrade.

Usage

Resolve partitions for current version of the DeltaTable:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/delta-0.2.0")
>>> dt.version()
3
>>> dt.files()
['part-00000-cb6b150b-30b8-4662-ad28-ff32ddab96d2-c000.snappy.parquet', 'part-00000-7c2deba3-1994-4fb8-bc07-d46c948aa415-c000.snappy.parquet', 'part-00001-c373a5bd-85f0-4758-815e-7eb62007a15c-c000.snappy.parquet']

Convert DeltaTable into PyArrow Table and Pandas Dataframe:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> df = dt.to_pyarrow_table().to_pandas()
>>> df
   id
0   5
1   7
2   9
>>> df[df['id'] > 5]
   id
1   7
2   9

Time travel:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> dt.load_version(2)
>>> dt.to_pyarrow_table().to_pandas()
   id
0   5
1   7
2   9
3   5
4   6
5   7
6   8
7   9

Schema:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> dt.schema()
Schema(Field(id: DataType(long) nullable(True) metadata({})))
>>> dt.pyarrow_schema()
id: int64

Develop

maturin is used to build the python package.

To install development version of the package into your current Python environment:

$ maturin develop

Code are formatted with https://github.com/psf/black.

Build manylinux wheels

docker run -e PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig -it -v `pwd`:/io apache/arrow-dev:amd64-centos-6.10-python-manylinux2010 bash
curl https://sh.rustup.rs -sSf | sh -s -- -y
source $HOME/.cargo/env
rustup default stable
cargo install --git https://github.com/PyO3/maturin.git --rev 98636cea89c328b3eba4ebb548124f75c8018200 maturin
cd /io/python
export PATH=/opt/python/cp37-cp37m/bin:/opt/python/cp38-cp38/bin:$PATH
maturin publish -b pyo3 --target x86_64-unknown-linux-gnu --no-sdist

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

deltalake-0.4.2-cp36-abi3-manylinux2010_x86_64.whl (7.1 MB view details)

Uploaded CPython 3.6+manylinux: glibc 2.12+ x86-64

deltalake-0.4.2-cp36-abi3-macosx_10_7_x86_64.whl (5.5 MB view details)

Uploaded CPython 3.6+macOS 10.7+ x86-64

File details

Details for the file deltalake-0.4.2-cp36-abi3-manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for deltalake-0.4.2-cp36-abi3-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c775ce9b2f3ebdfd3d38806b2b5d84d38e5f80da1cd5f67fff9a13d745225d7a
MD5 34dbaafbabee2c30c4b7a4da1b09d80c
BLAKE2b-256 d1640120b51101d9bc4d63babbebca0dce2f0ab6224703b86c73a52d4496227c

See more details on using hashes here.

File details

Details for the file deltalake-0.4.2-cp36-abi3-macosx_10_7_x86_64.whl.

File metadata

File hashes

Hashes for deltalake-0.4.2-cp36-abi3-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 f9f06c9c0e9a35ee3f6d928e60fdd0d03b8871794fe40a6d2a44a98fcbab7b05
MD5 2847d8ef140a3a587d1678d5b7c75696
BLAKE2b-256 9bc6a47325eb043569d7a688f311f3d92f13356264a97ba6e97e995062b3cdca

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page