Python binding for delta-rs

Deltalake-python

Native Delta Lake binding for Python based on delta-rs.

Installation

pip install deltalake

NOTE: The official binary wheels are statically linked against OpenSSL for communication with remote object stores. Please file a GitHub issue to request a critical OpenSSL upgrade.

Usage

Resolve the version and data files for the current state of the DeltaTable:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/delta-0.2.0")
>>> dt.version()
3
>>> dt.files()
['part-00000-cb6b150b-30b8-4662-ad28-ff32ddab96d2-c000.snappy.parquet', 'part-00000-7c2deba3-1994-4fb8-bc07-d46c948aa415-c000.snappy.parquet', 'part-00001-c373a5bd-85f0-4758-815e-7eb62007a15c-c000.snappy.parquet']

Convert a DeltaTable into a PyArrow Table and a pandas DataFrame:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> df = dt.to_pyarrow_table().to_pandas()
>>> df
   id
0   5
1   7
2   9
>>> df[df['id'] > 5]
   id
1   7
2   9

Time travel:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> dt.load_version(2)
>>> dt.to_pyarrow_table().to_pandas()
   id
0   5
1   7
2   9
3   5
4   6
5   7
6   8
7   9
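
The time-travel example assumes you already know which version to load. As a rough sketch (not part of the deltalake API), the helper below lists the commit versions present in a table's _delta_log directory. It relies on the Delta Lake convention that commit files are named with a zero-padded 20-digit version number followed by .json. The name list_commit_versions is hypothetical, and a listed version may still fail to load if older log entries have been cleaned up.

```python
import re
from pathlib import Path

# Delta commit files look like 00000000000000000002.json
# (zero-padded, 20 digits). Checkpoints and other files are ignored.
_COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def list_commit_versions(table_path):
    """Return the sorted commit versions found in <table_path>/_delta_log.

    Hypothetical helper for illustration; it only inspects file names
    on the local filesystem and does not parse the log itself.
    """
    log_dir = Path(table_path) / "_delta_log"
    versions = []
    for entry in log_dir.iterdir():
        match = _COMMIT_FILE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)
```

Any of the returned numbers could then be passed to dt.load_version() as in the example above.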

Schema:

>>> from deltalake import DeltaTable
>>> dt = DeltaTable("../rust/tests/data/simple_table")
>>> dt.schema()
Schema(Field(id: DataType(long) nullable(True) metadata({})))
>>> dt.pyarrow_schema()
id: int64

Develop

maturin is used to build the Python package.

To install the development version of the package into your current Python environment:

$ maturin develop

Code is formatted with https://github.com/psf/black.

Build manylinux wheels

docker run -e PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig -it -v `pwd`:/io apache/arrow-dev:amd64-centos-6.10-python-manylinux2010 bash
curl https://sh.rustup.rs -sSf | sh -s -- -y
source $HOME/.cargo/env
rustup default stable
cargo install --git https://github.com/PyO3/maturin.git --rev 98636cea89c328b3eba4ebb548124f75c8018200 maturin
cd /io/python
export PATH=/opt/python/cp37-cp37m/bin:/opt/python/cp38-cp38/bin:$PATH
maturin publish -b pyo3 --target x86_64-unknown-linux-gnu --no-sdist

Release 0.3.0

Source Distributions

No source distribution files are available for this release.

Built Distributions

deltalake-0.3.0-cp36-abi3-win_amd64.whl (5.0 MB): CPython 3.6+, Windows x86-64

deltalake-0.3.0-cp36-abi3-manylinux2010_x86_64.whl (6.6 MB): CPython 3.6+, manylinux (glibc 2.12+), x86-64

deltalake-0.3.0-cp36-abi3-macosx_10_7_x86_64.whl (5.0 MB): CPython 3.6+, macOS 10.7+, x86-64
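
The wheel file names above follow the standard wheel naming scheme: distribution, version, an optional build tag, then the Python, ABI, and platform tags, separated by hyphens. As a minimal sketch of how to read them, the hypothetical helper parse_wheel_name below splits a file name into those fields. It is an illustration only and does not validate the individual tags.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]  # optional build tag, usually absent
    python_tag: str       # e.g. cp36
    abi_tag: str          # e.g. abi3 (CPython stable ABI)
    platform_tag: str     # e.g. manylinux2010_x86_64


def parse_wheel_name(filename):
    """Split a wheel file name into its hyphen-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return WheelName(dist, version, build, py, abi, plat)


name = parse_wheel_name("deltalake-0.3.0-cp36-abi3-manylinux2010_x86_64.whl")
print(name.python_tag, name.abi_tag, name.platform_tag)
```

The cp36 and abi3 tags together are what allow a single wheel per platform to serve CPython 3.6 and later.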

Hashes for deltalake-0.3.0-cp36-abi3-win_amd64.whl

Algorithm    Hash digest
SHA256       c1f35439044289e6a00b70d2b7a84df0fb429dcba2d675695cf9a8337dd50308
MD5          4b5a7d346dff97bf6eb620d9d0727c11
BLAKE2b-256  442fce88e0e4b6b486684b5c0bed7f9043db2035ce5d89144375311c5873dfcd

Hashes for deltalake-0.3.0-cp36-abi3-manylinux2010_x86_64.whl

Algorithm    Hash digest
SHA256       3a473bd004bd711c0cb8be04419df0ab83e55d5c3f25e7fd15ada4a4020c819c
MD5          1f3fc9f4dcb0695f960a02d808e5030b
BLAKE2b-256  d2a9e057c8d3b228e18194e970df3d4593c732be179478e572d39c70f08d1eb0

Hashes for deltalake-0.3.0-cp36-abi3-macosx_10_7_x86_64.whl

Algorithm    Hash digest
SHA256       b9ffa28abf3c533e24eb31c963e9e3ade23758761def8c551fe45f680839052b
MD5          e508fe43cd22492204062832a3ccc05c
BLAKE2b-256  edaa03a303ea94c00191f122de66c97ec333e23f61f9a1b63b2d7b90b469c3c4
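
The SHA256 digests listed above can be checked against a downloaded wheel before installing it. The following is a minimal sketch using only the standard library; the function names sha256_of_file and matches_sha256 are hypothetical, and the local file path is whatever location you downloaded the wheel to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256 could be called with the path of the downloaded Windows wheel and the SHA256 value listed for that file; a False result means the file should not be installed.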
