Python Pachyderm Client
Pachyderm's Python SDK
Official Python client/SDK for Pachyderm. The successor to https://github.com/pachyderm/python-pachyderm.
This library provides the autogenerated gRPC/protobuf code for Pachyderm, generated using a fork of the betterproto package, along with higher-level functionality.
Installation
pip install pachyderm_sdk
A Small Taste
Here's an example that creates a repo and adds a file:
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
# Connects to a pachyderm cluster using your local config
# at ~/.pachyderm/config.json
client = Client.from_config()
# Creates a pachyderm repo called `test`
repo = pfs.Repo(name="test")
client.pfs.create_repo(repo=repo)
# Create a new commit in `test@master` and upload a file.
branch = pfs.Branch.from_uri("test@master")
with client.pfs.commit(branch=branch) as commit:
    file = commit.put_file_from_bytes(path="/data/file.dat", data=b"DATA")
# Retrieve the uploaded file.
with client.pfs.pfs_file(file) as f:
    print(f.readall())
How to load a CSV file into a pandas DataFrame
from pachyderm_sdk import Client
from pachyderm_sdk.api import pfs
import pandas as pd
client = Client.from_config()
file = pfs.File.from_uri("test@master:/path/to/data.csv")
with client.pfs.pfs_file(file) as f:
    df = pd.read_csv(f)
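The examples above address data with URIs of the form repo@branch and repo@branch:/path. To make that layout concrete, here is a minimal sketch that splits such a string into its parts using only the standard library. The names ParsedUri and parse_uri are hypothetical and are not part of pachyderm_sdk; the SDK's own from_uri methods may accept additional forms (for example commit IDs or project-qualified names) that this sketch does not attempt to handle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedUri:
    """Hypothetical container for the URI parts; not a pachyderm_sdk type."""

    repo: str
    branch: str
    path: Optional[str] = None


def parse_uri(uri: str) -> ParsedUri:
    """Split 'repo@branch' or 'repo@branch:/path' into its parts.

    Illustration only: this is an assumption about the URI layout based on
    the two examples in this README, not the SDK's parser.
    """
    # Everything before the first ':' is the branch reference;
    # everything after it (if present) is the file path.
    ref, colon, path = uri.partition(":")
    repo, at, branch = ref.partition("@")
    if not at or not repo or not branch:
        raise ValueError(f"expected 'repo@branch[:/path]', got {uri!r}")
    if colon and not path.startswith("/"):
        raise ValueError(f"file path must start with '/', got {path!r}")
    return ParsedUri(repo=repo, branch=branch, path=path if colon else None)


branch_ref = parse_uri("test@master")
file_ref = parse_uri("test@master:/path/to/data.csv")
print(branch_ref)
print(file_ref)
```

In real code, prefer pfs.Branch.from_uri and pfs.File.from_uri as shown above; the sketch only shows which part of the string names the repo, the branch, and the file.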
Changes from Python-Pachyderm
This package is a successor to the python-pachyderm package. Listed below are some of the notable changes:
- Organization of the API
  - Methods and Message objects are now organized according to the service they are associated with, e.g. auth, pfs (Pachyderm File System), pps (Pachyderm Pipeline System).
  - Message objects can be found within their respective submodule of the pachyderm_sdk.api module, e.g. pachyderm_sdk.api.pfs.
  - Methods can be found within their respective attribute of the Client class, e.g. client.pps.create_pipeline.
  - Some methods have been renamed to remove redundancy due to this organization, e.g. python_pachyderm.Client.get_enterprise_state -> pachyderm_sdk.Client.enterprise.get_state
- The autogenerated code is generated using a fork of the betterproto compiler.
- Messages are now python dataclasses.
- Methods require keyword arguments.
- Pachyderm resources are specified using types.
  - python-pachyderm (old): client.create_repo("test")
  - pachyderm_sdk (new): client.pfs.create_repo(repo=pfs.Repo(name="test"))
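To illustrate what "Messages are now python dataclasses" and "Methods require keyword arguments" mean in practice, here is a small sketch built from standard-library pieces only. Repo and FakePfsApi are hypothetical stand-ins, not the SDK's classes, and the use of a bare `*` to enforce keyword arguments is an assumption about how such a restriction is typically written in Python, not a statement about the generated code.

```python
from dataclasses import dataclass


@dataclass
class Repo:
    """Hypothetical stand-in for a message type such as pfs.Repo."""

    name: str


class FakePfsApi:
    """Hypothetical stand-in for the client.pfs attribute."""

    def __init__(self) -> None:
        self.repos = []

    def create_repo(self, *, repo: Repo) -> Repo:
        # The bare `*` makes every following parameter keyword-only.
        self.repos.append(repo)
        return repo


api = FakePfsApi()
created = api.create_repo(repo=Repo(name="test"))

# Dataclasses compare by field values, so two messages with the
# same fields are equal.
same_fields_equal = created == Repo(name="test")

# Passing the message positionally is rejected with a TypeError.
positional_error = None
try:
    api.create_repo(Repo(name="test"))
except TypeError as err:
    positional_error = err
```

With this pattern, a call written in the old positional style fails immediately instead of silently binding an argument to the wrong parameter.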
Contributing
Please see the contributing guide for more information, including testing instructions.
Developer Guide
Generate python APIs from protobuf:
./generate-protos.sh
Generate HTML documentation (writes to docs/pachyderm_sdk):
make docs
Run the tests:
pytest -vvv tests