Python Pachyderm Client
Pachyderm's Python SDK
Official Python client/SDK for Pachyderm, and the successor to https://github.com/pachyderm/python-pachyderm.
This library provides the autogenerated gRPC/protobuf code for Pachyderm, generated using a fork of the betterproto package, along with higher-level functionality.
Installation
pip install pachyderm_sdk
A Small Taste
Here's an example that creates a repo and adds a file:
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
# Connects to a pachyderm cluster using your local config
# at ~/.pachyderm/config.json
client = Client.from_config()
# Creates a pachyderm repo called `test`
repo = pfs.Repo(name="test")
client.pfs.create_repo(repo=repo)
# Create a new commit in `test@master` and upload a file.
branch = pfs.Branch.from_uri("test@master")
with client.pfs.commit(branch=branch) as commit:
    file = commit.put_file_from_bytes(path="/data/file.dat", data=b"DATA")
# Retrieve the uploaded file.
with client.pfs.pfs_file(file) as f:
    print(f.readall())
How to load a CSV file into a pandas dataframe
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
import pandas as pd
client = Client.from_config()
file = pfs.File.from_uri("test@master:/path/to/data.csv")
with client.pfs.pfs_file(file) as f:
    df = pd.read_csv(f)
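The URI strings passed to from_uri in the examples above have a repo@branch:/path shape. The sketch below only illustrates how such a string breaks down; it is not the SDK's parser. The helper split_file_uri and the FileRef tuple are hypothetical names, and the SDK's own from_uri may well accept additional forms that this sketch rejects.

```python
from typing import NamedTuple


class FileRef(NamedTuple):
    """Hypothetical container for the parts of a file URI (not an SDK type)."""
    repo: str
    branch: str
    path: str


def split_file_uri(uri: str) -> FileRef:
    """Split 'repo@branch:/path' into repo, branch and path.

    Illustrative only: assumes exactly the shape used in the examples above.
    """
    # Everything before the first ':' names the repo and branch.
    location, sep, path = uri.partition(":")
    if not sep or not path.startswith("/"):
        raise ValueError(f"expected 'repo@branch:/path', got {uri!r}")
    # The '@' separates the repo name from the branch name.
    repo, at, branch = location.partition("@")
    if not at or not repo or not branch:
        raise ValueError(f"expected 'repo@branch:/path', got {uri!r}")
    return FileRef(repo=repo, branch=branch, path=path)


ref = split_file_uri("test@master:/path/to/data.csv")
print(ref.repo, ref.branch, ref.path)
```

In real code, prefer pfs.File.from_uri and pfs.Branch.from_uri as shown above; this helper exists only to make the string layout explicit.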
Changes from Python-Pachyderm
This package is a successor to the python-pachyderm package. Listed below are some of the notable changes:
- Organization of the API
  - Methods and Message objects are now organized according to the service they are associated with, e.g. auth, pfs (Pachyderm File System), pps (Pachyderm Pipeline System).
  - Message objects can be found within their respective submodule of the pachyderm_sdk.api module, e.g. pachyderm_sdk.api.pfs.
  - Methods can be found within their respective attribute of the Client class, e.g. client.pps.create_pipeline.
  - Some methods have been renamed to remove redundancy due to this organization, e.g. python_pachyderm.Client.get_enterprise_state -> pachyderm_sdk.Client.enterprise.get_state
- The autogenerated code is generated using a fork of the betterproto compiler.
  - Messages are now Python dataclasses.
  - Methods require keyword arguments.
  - Pachyderm resources are specified using types.
    - python-pachyderm (old): client.create_repo("test")
    - pachyderm_sdk (new): client.pfs.create_repo(repo=pfs.Repo(name="test"))
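To show what "messages are dataclasses" and "methods require keyword arguments" mean in practice, here is a small stand-in sketch. It does not use or reproduce the SDK's generated code: Repo and FakePfs are hypothetical, simplified classes that only mimic the calling convention described in the list above.

```python
from dataclasses import asdict, dataclass


@dataclass
class Repo:
    """Stand-in for a message type such as pfs.Repo (hypothetical, simplified)."""
    name: str = ""


class FakePfs:
    """Stand-in for the client.pfs attribute; it only records repo names."""

    def __init__(self) -> None:
        self.created = []

    def create_repo(self, *, repo: Repo) -> None:
        # The bare `*` makes `repo` keyword-only, so positional calls fail.
        self.created.append(repo.name)


pfs_api = FakePfs()

# New style: the resource is a typed message passed by keyword.
pfs_api.create_repo(repo=Repo(name="test"))

# Dataclass messages support the usual helpers, such as conversion to a dict.
print(asdict(Repo(name="test")))

# Old style: a positional argument is rejected with a TypeError.
try:
    pfs_api.create_repo(Repo(name="test"))
except TypeError as err:
    print("rejected:", err)
```

When porting python-pachyderm code, the practical consequence is that each positional call needs its arguments named and its string identifiers wrapped in the matching message type.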
Contributing
Please see the contributing guide for more information, including testing instructions.
Developer Guide
Generate python APIs from protobuf:
./generate-protos.sh
Running Tests:
pytest -vvv tests