Ness
A Python datalake client.
Requirements
Installation
pip install pyarrow ness
Quickstart
import ness
dl = ness.dl(bucket="mybucket", key="mydatalake")
df = dl.read("mytable")
Sync
# Sync all tables
dl.sync()
# Sync a single table
dl.sync("mytable")
# Sync and read a single table
df = dl.read("mytable", sync=True)
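When several tables are needed at once, the calls above can be wrapped in a small helper. The sketch below is only an illustration: read_tables is a hypothetical function, not part of ness, and it assumes nothing beyond the dl.read(name, sync=...) call shown above.

```python
def read_tables(dl, names, sync=False):
    """Read several tables and return them in a dict keyed by table name.

    Hypothetical helper, not part of ness. `dl` is any object that exposes
    read(name, sync=...) in the way the examples above use it.
    """
    return {name: dl.read(name, sync=sync) for name in names}


# Possible usage, assuming a configured data lake:
# import ness
# dl = ness.dl(bucket="mybucket", key="mydatalake")
# tables = read_tables(dl, ["mytable"], sync=True)
# df = tables["mytable"]
```

Each table is read with its own dl.read call, so with sync=True every table is synced individually before it is read.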
Format
Specify the input data source format; the default is parquet:
import ness
dl = ness.dl(bucket="mybucket", key="mydatalake", format="csv")
AWS Profile
Files are synced using the default AWS profile; you can configure another one:
import ness
dl = ness.dl(bucket="mybucket", key="mydatalake", profile="myprofile")
Command Line
Usage: ness sync [OPTIONS] S3_URI

Options:
  --format TEXT   Data lake source format.
  --profile TEXT  AWS profile.
  --table TEXT    Table name to sync.
  --help          Show this message and exit.

ness sync bucket/key --table mytable
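The S3_URI argument in the example has the form bucket/key, which appears to line up with the bucket and key arguments of ness.dl. The sketch below shows one way to split such a string in a script that drives both the command line and the Python API. split_s3_uri is a hypothetical helper, not part of ness, and accepting an s3:// prefix is an assumption rather than documented behavior.

```python
def split_s3_uri(uri):
    """Split 'bucket/key' (optionally prefixed with 's3://') into (bucket, key).

    Hypothetical helper, not part of ness.
    """
    # Assumption: an "s3://" scheme prefix may be present and can be dropped.
    path = uri[len("s3://"):] if uri.startswith("s3://") else uri
    # The bucket is everything before the first slash; the rest is the key.
    bucket, _, key = path.partition("/")
    key = key.strip("/")
    if not bucket or not key:
        raise ValueError(f"expected bucket/key, got {uri!r}")
    return bucket, key


# Possible usage, assuming the mapping described above holds:
# import ness
# bucket, key = split_s3_uri("mybucket/mydatalake")
# dl = ness.dl(bucket=bucket, key=key)
# dl.sync("mytable")
```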