Python SDK for the Dataerai transfer daemon

Dataerai Python SDK

Python client for the Dataerai transfer daemon.
Supports authenticated uploads, downloads, and metadata operations. The daemon is started automatically if it is not already running.

Requirements

  • Python ≥ 3.10
  • The dataerai binary must be installed and on PATH (or pass binary_path explicitly)
  • The user must be logged in via dataerai auth login before calling auth_status() / upload() / download()
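The first two requirements can be checked before constructing a client. The following is a minimal sketch using only the standard library; the helper name preflight is hypothetical and not part of the SDK, and the login requirement is not covered because it can only be verified through the daemon.

```python
import shutil
import sys


def preflight(binary_name: str = "dataerai", min_python: tuple = (3, 10)) -> list:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper, not part of the SDK.
    """
    problems = []
    # Requirement 1: interpreter version.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python >= {wanted} is required")
    # Requirement 2: the binary must be discoverable on PATH
    # (unless binary_path is passed to the client explicitly).
    if shutil.which(binary_name) is None:
        problems.append(f"{binary_name!r} not found on PATH; pass binary_path explicitly")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```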

Installation

pip install dataerai-sdk
# or, from source:
pip install -e sdk/python/

Quick start

from dataerai import DataeraiClient

with DataeraiClient(binary_path="/usr/local/bin/dataerai") as client:
    # Check auth
    status = client.auth_status()
    print(f"Logged in as {status.user_email}, token expires {status.expires_at}")

    # Create a project to own your data (also provisions its root collection)
    project = client.create_project("My dataset project", description="Demo")
    print(f"Created project {project.project_id}")

    # Upload a file into it
    result = client.upload(
        "/path/to/data.csv",
        title="My dataset",
        owner_type="project",
        owner_id=project.project_id,
        on_progress=lambda p: print(f"  {p.percent:.0f}%  {p.rate_mbps:.1f} MB/s"),
    )
    print(f"Uploaded  asset_id={result.asset_id}  content_id={result.content_id}")

    # Download it back
    dl = client.download(result.asset_id, dest_dir="/tmp/downloads")
    for f in dl.files:
        print(f"  {f.local_path}  ({f.size:,} bytes)")

    # Read / update metadata
    meta = client.get_metadata(result.asset_id)
    updated = client.set_metadata(result.asset_id, title="My dataset v2", tags=["csv", "demo"])

API reference

DataeraiClient(*, socket_path, binary_path, auto_start, start_timeout_s, request_timeout_s)

Parameter          Default                                                     Description
socket_path        $DATAERAI_SOCKET or /run/user/<uid>/dataerai-transfer.sock  Unix socket path
binary_path        None                                                        Path to the dataerai binary, used by auto_start; if None, the binary must be on PATH
auto_start         True                                                        Spawn the daemon if the socket does not exist
start_timeout_s    10.0                                                        Seconds to wait for the daemon socket to appear
request_timeout_s  30.0                                                        Per-request timeout in seconds
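The socket_path default described above can be sketched as a small function. This is an illustration of the documented fallback order, not the SDK's own code: the name default_socket_path is hypothetical, and treating an empty DATAERAI_SOCKET as unset is an assumption.

```python
import os


def default_socket_path(env=None, uid=None) -> str:
    """Resolve the socket path: $DATAERAI_SOCKET first, then the per-user default.

    Hypothetical helper mirroring the documented default.
    """
    env = os.environ if env is None else env
    override = env.get("DATAERAI_SOCKET")
    if override:  # assumption: an empty value counts as "not set"
        return override
    # os.getuid() is only available on Unix-like systems.
    uid = os.getuid() if uid is None else uid
    return f"/run/user/{uid}/dataerai-transfer.sock"
```

Passing env and uid explicitly makes it possible to see which path a given environment would resolve to without touching the real one.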

Use as a context manager (with DataeraiClient(...) as client:) for automatic cleanup, or call client.connect() / client.close() manually.
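For the manual form, a try/finally block gives the same cleanup guarantee as the context manager. The sketch below uses a stand-in class (FakeClient, hypothetical) so it runs without a daemon; with the real SDK, a DataeraiClient instance would take its place.

```python
class FakeClient:
    """Stand-in for DataeraiClient that records calls (hypothetical, for illustration)."""

    def __init__(self):
        self.calls = []

    def connect(self):
        self.calls.append("connect")

    def auth_status(self):
        self.calls.append("auth_status")
        return "ok"

    def close(self):
        self.calls.append("close")


def run_manually(client):
    """Connect, do some work, and always close, even if the work raises."""
    client.connect()
    try:
        return client.auth_status()
    finally:
        client.close()
```

connect() sits outside the try block so that close() is only called on a client that actually connected; whether close() is safe to call after a failed connect() is not documented here.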

Methods

Method                                                   Returns         Description
auth_status()                                            AuthStatus      Logged-in user and token expiry
create_project(name, *, description="")                  Project         Create a project (+ root collection, owner membership, default allocation); requires write scope
upload(local_path, *, title, owner_type, owner_id, ...)  UploadResult    Upload a file; blocks until complete
download(asset_id, dest_dir, ...)                        DownloadResult  Download latest asset content; blocks until complete
get_metadata(asset_id)                                   AssetMetadata   Retrieve asset metadata
set_metadata(asset_id, **fields)                         AssetMetadata   Update metadata fields

upload() keyword arguments

Argument            Type                                    Description
title               str                                     Asset title (required)
owner_type          str                                     "project" or "collection" (required)
owner_id            str                                     Owner entity ID (required)
description         str | None                              Free-text description
alias               str | None                              Short identifier
tags                list[str] | None                        Tag list
metadata            dict | None                             Arbitrary key-value metadata
collection_id       str | None                              Collection to add the asset to
chunk_size_mb       int | None                              Override the default 64 MiB chunk size
on_progress         Callable[[ProgressEvent], None] | None  Progress callback
transfer_timeout_s  float                                   Max seconds to wait for completion (default 3600)
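To get a feel for chunk_size_mb, the sketch below estimates how many chunks a file would be split into. It assumes the value is interpreted in MiB (matching the stated 64 MiB default) and that the daemon splits files into fixed-size chunks with a shorter final chunk; neither is confirmed here, and chunk_count is a hypothetical helper.

```python
MIB = 1024 * 1024


def chunk_count(file_size_bytes: int, chunk_size_mb: int = 64) -> int:
    """Estimate the number of chunks for a file (assumes MiB units, fixed-size chunks)."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if chunk_size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    chunk_bytes = chunk_size_mb * MIB
    # Integer ceiling division avoids floating-point rounding on large files.
    return -(-file_size_bytes // chunk_bytes)


print(chunk_count(200 * MIB))       # 4 chunks at the default size
print(chunk_count(1000 * MIB, 16))  # 63 chunks at 16 MiB
```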

Progress events

@dataclass
class ProgressEvent:
    transfer_id: str
    bytes_done: int
    bytes_total: int
    chunk_index: int
    chunk_count: int
    rate_bps: float
    file_index: int
    file_name: str

    @property
    def percent(self) -> float: ...   # 0–100

    @property
    def rate_mbps(self) -> float: ...
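The two properties are elided above. One plausible way to derive them is sketched below on a standalone class (ProgressSketch, hypothetical); it assumes rate_bps is in bytes per second, which fits the "MB/s" label used in the quick start, and that one MB is 1,000,000 bytes. The eta_seconds helper is likewise an illustration, not an SDK function.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressSketch:
    """Reduced stand-in for ProgressEvent with only the fields the properties need."""

    bytes_done: int
    bytes_total: int
    rate_bps: float  # assumption: bytes per second

    @property
    def percent(self) -> float:
        # Guard against a zero total, and clamp to the documented 0-100 range.
        if self.bytes_total <= 0:
            return 0.0
        return min(100.0, 100.0 * self.bytes_done / self.bytes_total)

    @property
    def rate_mbps(self) -> float:
        # assumption: decimal megabytes (10**6 bytes) per second
        return self.rate_bps / 1_000_000


def eta_seconds(event) -> Optional[float]:
    """Estimate remaining seconds from any event with the three fields above."""
    if event.rate_bps <= 0:
        return None  # no rate yet, so no estimate
    remaining = max(0, event.bytes_total - event.bytes_done)
    return remaining / event.rate_bps
```

A callback such as on_progress=lambda p: print(eta_seconds(p)) would work with any object exposing bytes_done, bytes_total, and rate_bps.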

Error types

Exception                   When raised
DaemonError(code, message)  Daemon returned a coded error (see the code attribute)
DaemonTimeoutError          Request or transfer exceeded the configured timeout
ConnectionError             Daemon disconnected unexpectedly
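Timeouts and disconnects are often transient, so callers may want to retry them. Below is a minimal sketch of a generic retry helper with exponential backoff; call_with_retry is hypothetical and takes the exception types as an argument, because where the SDK's exception classes are imported from is not stated here.

```python
import time


def call_with_retry(fn, retry_on, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exception types with exponential backoff.

    Hypothetical helper; the last failure is re-raised once attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
```

With the SDK, usage might look like call_with_retry(lambda: client.get_metadata(asset_id), retry_on=(DaemonTimeoutError, ConnectionError)). Read-only calls are the safest candidates; whether a retried upload() resumes or creates a second asset is not documented here.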
