Python SDK for the Dataerai transfer daemon

Dataerai Python SDK

Python client for the Dataerai transfer daemon.
Supports authenticated uploads, downloads, and metadata operations. The daemon is started automatically if it is not already running.

Requirements

  • Python ≥ 3.10
  • The dataerai binary must be installed and on PATH (or pass binary_path explicitly)
  • The user must be logged in via dataerai auth login before calling auth_status() / upload() / download()
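
The binary requirement above can be checked before constructing a client. The following is a minimal sketch using only the standard library; the helper name find_dataerai_binary is hypothetical and not part of the SDK. It looks the name up on PATH and returns a path that could be passed as binary_path.

```python
import shutil
from typing import Optional


def find_dataerai_binary(name: str = "dataerai") -> Optional[str]:
    """Return the path of the binary if it is found on PATH, else None.

    Hypothetical helper for illustration; not part of the SDK.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_dataerai_binary()
    if path is None:
        print("dataerai not found on PATH; install it or pass binary_path explicitly")
    else:
        print(f"Using dataerai binary at {path}")
```

This only verifies that the binary can be located. It does not check the login state, which still requires dataerai auth login.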

Installation

pip install dataerai-sdk
# or, from source:
pip install -e sdk/python/

Quick start

from dataerai import DataeraiClient

with DataeraiClient(binary_path="/usr/local/bin/dataerai") as client:
    # Check auth
    status = client.auth_status()
    print(f"Logged in as {status.user_email}, token expires {status.expires_at}")

    # Create a project to own your data (also provisions its root collection)
    project = client.create_project("My dataset project", description="Demo")
    print(f"Created project {project.project_id}")

    # Upload a file into it
    result = client.upload(
        "/path/to/data.csv",
        title="My dataset",
        owner_type="project",
        owner_id=project.project_id,
        on_progress=lambda p: print(f"  {p.percent:.0f}%  {p.rate_mbps:.1f} MB/s"),
    )
    print(f"Uploaded  asset_id={result.asset_id}  content_id={result.content_id}")

    # Download it back
    dl = client.download(result.asset_id, dest_dir="/tmp/downloads")
    for f in dl.files:
        print(f"  {f.local_path}  ({f.size:,} bytes)")

    # Read / update metadata
    meta = client.get_metadata(result.asset_id)
    updated = client.set_metadata(result.asset_id, title="My dataset v2", tags=["csv", "demo"])

API reference

DataeraiClient(*, socket_path, binary_path, auto_start, start_timeout_s, request_timeout_s)

Parameter          Default                                                     Description
socket_path        $DATAERAI_SOCKET or /run/user/<uid>/dataerai-transfer.sock  Unix socket path
binary_path        None                                                        Path to the dataerai binary (required for auto_start)
auto_start         True                                                        Spawn the daemon if the socket does not exist
start_timeout_s    10.0                                                        Seconds to wait for the daemon socket to appear
request_timeout_s  30.0                                                        Per-request timeout in seconds
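
The socket_path default can be reproduced outside the SDK, for example to check whether the socket file exists. This is a sketch under the assumption that the environment variable takes precedence over the per-user path, as the order in the table suggests; default_socket_path is a hypothetical name, not an SDK function.

```python
import os


def default_socket_path() -> str:
    """Resolve the default socket path (assumed precedence: env var first)."""
    env_path = os.environ.get("DATAERAI_SOCKET")
    if env_path:
        return env_path
    # os.getuid() exists on Unix only, which matches the Unix socket transport.
    return f"/run/user/{os.getuid()}/dataerai-transfer.sock"


if __name__ == "__main__":
    path = default_socket_path()
    print(f"{path} exists: {os.path.exists(path)}")
```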

Use as a context manager (with DataeraiClient(...) as client:) for automatic cleanup, or call client.connect() / client.close() manually.
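
When a with block is not practical, the manual connect() / close() pair can be wrapped in try/finally so the connection is released even if an operation raises. The sketch below is duck-typed: it accepts any object with connect() and close(), and the wrapper name run_with_client is hypothetical.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def run_with_client(client: Any, work: Callable[[Any], T]) -> T:
    """Connect, run work(client), and always close, even if work raises."""
    client.connect()
    try:
        return work(client)
    finally:
        # Runs on both success and failure of work().
        client.close()
```

With the SDK installed, a call might look like run_with_client(DataeraiClient(), lambda c: c.auth_status()), assuming the default constructor arguments suit your setup.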

Methods

Method                                                             Returns         Description
auth_status()                                                      AuthStatus      Logged-in user and token expiry
create_project(name, *, description="")                            Project         Create a project (+ root collection, owner membership, default allocation); requires write scope
upload(local_path, *, title, owner_type, owner_id, ...)            UploadResult    Upload a file; blocks until complete
download(asset_id, dest_dir, ...)                                  DownloadResult  Download latest asset content; blocks until complete
get_metadata(asset_id)                                             AssetMetadata   Retrieve asset metadata
set_metadata(asset_id, **fields)                                   AssetMetadata   Update metadata fields
create_relationship(from_asset_id, to_asset_id, rel_type, *, ...)  Relationship    Create a directed provenance link between two assets

create_relationship() arguments

Authors a directed provenance edge from_asset_id → to_asset_id. You need write access to the source and read access to the target. rel_type is a free-form verb describing the source's role, e.g. "analysis_of" or "acquired_with".

Argument        Type         Description
from_asset_id   str          Source (dependent) asset; the edge starts here (required)
to_asset_id     str          Target (origin) asset; the edge points here (required)
rel_type        str          Free-form relationship type, ≤255 chars (required)
analysis_mode   str | None   One of: non_destructive, altering, destructive, in_situ, ex_situ, invasive, non_invasive
qualifier_note  str | None   Free-text note on the relationship
qualifier_time  str | None   ISO-8601 timestamp
qualifiers      dict | None  JSON-serializable extra qualifiers

# Link a processed result back to the raw data it came from.
client.create_relationship(analysis.asset_id, raw.asset_id, "analysis_of",
                           analysis_mode="non_destructive")
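
The constraints in the argument table can be checked client-side before a request is sent. This is a sketch only: check_relationship_args is a hypothetical helper, the 255-character limit is assumed to count Python str characters, and the daemon remains the authority on what it accepts. Note that datetime.fromisoformat accepts only a subset of ISO-8601 before Python 3.11 (for example, no trailing "Z").

```python
from datetime import datetime
from typing import List, Optional

# Allowed values copied from the argument table above.
ANALYSIS_MODES = frozenset({
    "non_destructive", "altering", "destructive",
    "in_situ", "ex_situ", "invasive", "non_invasive",
})
MAX_REL_TYPE_LEN = 255


def check_relationship_args(
    rel_type: str,
    analysis_mode: Optional[str] = None,
    qualifier_time: Optional[str] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if not rel_type:
        problems.append("rel_type is required")
    elif len(rel_type) > MAX_REL_TYPE_LEN:
        problems.append(f"rel_type exceeds {MAX_REL_TYPE_LEN} characters")
    if analysis_mode is not None and analysis_mode not in ANALYSIS_MODES:
        problems.append(f"unknown analysis_mode: {analysis_mode!r}")
    if qualifier_time is not None:
        try:
            datetime.fromisoformat(qualifier_time)
        except ValueError:
            problems.append(f"qualifier_time is not ISO-8601: {qualifier_time!r}")
    return problems
```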

upload() keyword arguments

Argument            Type                                    Description
title               str                                     Asset title (required)
owner_type          str                                     "project" or "user" (required)
owner_id            str                                     Owner entity ID (required)
description         str | None                              Free-text description
alias               str | None                              Short identifier
tags                list[str] | None                        Tag list
metadata            dict | None                             Arbitrary key-value metadata
collection_id       str | None                              Collection to add the asset to
chunk_size_mb       int | None                              Override default 64 MiB chunk size
on_progress         Callable[[ProgressEvent], None] | None  Progress callback
transfer_timeout_s  float                                   Max seconds to wait for completion (default 3600)
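
To get a feel for how chunk_size_mb affects a transfer, the number of chunks can be estimated from the file size. The sketch below assumes chunk_size_mb is in MiB (as the 64 MiB default suggests) and that the file is split by simple ceiling division; how the daemon actually splits files, or handles an empty file, is not documented here. estimate_chunk_count is a hypothetical helper.

```python
from typing import Optional

DEFAULT_CHUNK_SIZE_MB = 64  # default from the table above, taken to mean MiB


def estimate_chunk_count(file_size_bytes: int, chunk_size_mb: Optional[int] = None) -> int:
    """Estimate how many chunks a file of the given size would need."""
    size_mb = DEFAULT_CHUNK_SIZE_MB if chunk_size_mb is None else chunk_size_mb
    if size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    chunk_bytes = size_mb * 1024 * 1024
    # Integer ceiling division avoids float rounding on very large files.
    return (file_size_bytes + chunk_bytes - 1) // chunk_bytes
```

Under these assumptions, a 1 GiB file needs 16 chunks at the default size and 64 chunks with chunk_size_mb=16.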

Progress events

@dataclass
class ProgressEvent:
    transfer_id: str
    bytes_done: int
    bytes_total: int
    chunk_index: int
    chunk_count: int
    rate_bps: float
    file_index: int
    file_name: str

    @property
    def percent(self) -> float: ...   # 0–100

    @property
    def rate_mbps(self) -> float: ...
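
The two properties are elided above. One plausible way to compute them is sketched below on a reduced stand-in class (ProgressEventSketch is hypothetical and carries only the fields the properties need). It assumes rate_bps is bytes per second, since the quick start prints the derived value as MB/s, and that 1 MB is 10^6 bytes; the SDK's actual definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class ProgressEventSketch:
    """Reduced stand-in for ProgressEvent; not the SDK class."""
    bytes_done: int
    bytes_total: int
    rate_bps: float  # assumed: bytes per second

    @property
    def percent(self) -> float:
        # Guard against division by zero when the total is unknown or empty.
        if self.bytes_total <= 0:
            return 0.0
        return min(100.0, 100.0 * self.bytes_done / self.bytes_total)

    @property
    def rate_mbps(self) -> float:
        # Assumed: decimal megabytes (10^6 bytes) per second.
        return self.rate_bps / 1_000_000


if __name__ == "__main__":
    event = ProgressEventSketch(bytes_done=256, bytes_total=1024, rate_bps=2_500_000)
    print(f"{event.percent:.0f}%  {event.rate_mbps:.1f} MB/s")
```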

Error types

Exception                   When raised
DaemonError(code, message)  Daemon returned a coded error (see code attribute)
DaemonTimeoutError          Request or transfer exceeded the configured timeout
ConnectionError             Daemon disconnected unexpectedly
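
Timeouts and disconnects are often transient, so callers may want to retry them while letting coded errors propagate. The following generic sketch takes the exception types to retry as a parameter, so it does not depend on the SDK; call_with_retry and its parameters are hypothetical.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay_s: float = 1.0,
) -> T:
    """Call fn(), retrying on the given exception types with a fixed delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error unchanged
            time.sleep(delay_s)
    raise AssertionError("unreachable")
```

With the SDK installed, one might pass retry_on=(DaemonTimeoutError, ConnectionError), assuming both names are importable from the dataerai package; the table does not say whether ConnectionError is the Python built-in or an SDK class. Retrying an upload may not be safe if the first attempt partially succeeded, so this pattern fits read-only calls such as get_metadata() best.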
