Python SDK for the Dataerai transfer daemon

Dataerai Python SDK

Python client for the Dataerai transfer daemon.
Supports authenticated uploads, downloads, and metadata operations. The daemon is started automatically if it is not already running.

Requirements

  • Python ≥ 3.10
  • The dataerai binary must be installed and on PATH (or pass binary_path explicitly)
  • The user must be logged in via dataerai auth login before calling auth_status() / upload() / download()
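
The first two requirements can be checked before constructing a client. The sketch below uses only the standard library; the helper name preflight is hypothetical and not part of the SDK. It does not check login state, which only dataerai auth login and auth_status() can confirm.

```python
import shutil
import sys


def preflight(binary_name: str = "dataerai") -> list:
    """Return a list of human-readable problems; an empty list means the basics look fine."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is required")
    # shutil.which searches PATH much like a shell would.
    if shutil.which(binary_name) is None:
        problems.append(f"{binary_name!r} not found on PATH; pass binary_path explicitly")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"warning: {problem}")
```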

Installation

pip install dataerai-sdk
# or, from source:
pip install -e sdk/python/

Quick start

from dataerai import DataeraiClient

with DataeraiClient(binary_path="/usr/local/bin/dataerai") as client:
    # Check auth
    status = client.auth_status()
    print(f"Logged in as {status.user_email}, token expires {status.expires_at}")

    # Create a project to own your data (also provisions its root collection)
    project = client.create_project("My dataset project", description="Demo")
    print(f"Created project {project.project_id}")

    # Upload a file into it
    result = client.upload(
        "/path/to/data.csv",
        title="My dataset",
        owner_type="project",
        owner_id=project.project_id,
        on_progress=lambda p: print(f"  {p.percent:.0f}%  {p.rate_mbps:.1f} MB/s"),
    )
    print(f"Uploaded  asset_id={result.asset_id}  content_id={result.content_id}")

    # Download it back
    dl = client.download(result.asset_id, dest_dir="/tmp/downloads")
    for f in dl.files:
        print(f"  {f.local_path}  ({f.size:,} bytes)")

    # Read / update metadata
    meta = client.get_metadata(result.asset_id)
    updated = client.set_metadata(result.asset_id, title="My dataset v2", tags=["csv", "demo"])

    # Link a derived result to the source asset that produced it
    relationship = client.create_relationship(
        result.asset_id,
        "source-asset-id",
        relationship_type="derived_from",
        analysis_mode="non_destructive",
        qualifiers={"tool": "pycroscopy"},
    )
    related_id = relationship.related_asset["id"] if relationship.related_asset else "source-asset-id"
    print(f"Linked via {relationship.type} to {related_id}")

API reference

DataeraiClient(*, socket_path, binary_path, auto_start, start_timeout_s, request_timeout_s)

Parameter          Default                                                     Description
socket_path        $DATAERAI_SOCKET or /run/user/<uid>/dataerai-transfer.sock  Unix socket path
binary_path        None                                                        Path to the dataerai binary (required for auto_start)
auto_start         True                                                        Spawn the daemon if the socket does not exist
start_timeout_s    10.0                                                        Seconds to wait for the daemon socket to appear
request_timeout_s  30.0                                                        Per-request timeout in seconds
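
To see which socket a client would use by default, the documented rule for socket_path can be mirrored in a few lines. This is a sketch of the rule as stated above, not the SDK's actual resolution code; default_socket_path is a hypothetical helper, and treating an empty DATAERAI_SOCKET as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def default_socket_path(env: Optional[Mapping[str, str]] = None,
                        uid: Optional[int] = None) -> str:
    """Return $DATAERAI_SOCKET if set, else the per-user runtime socket path."""
    env = os.environ if env is None else env
    override = env.get("DATAERAI_SOCKET")
    if override:  # assumption: an empty value counts as "not set"
        return override
    if uid is None:
        uid = os.getuid()  # POSIX only, which matches a Unix-socket daemon
    return f"/run/user/{uid}/dataerai-transfer.sock"
```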

Use as a context manager (with DataeraiClient(...) as client:) for automatic cleanup, or call client.connect() / client.close() manually.
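
When a with block is not practical, the manual connect() / close() pair is best wrapped in try/finally so the client is closed even if a call fails. A minimal sketch, assuming only that the client exposes the connect() and close() methods named above; with_manual_lifecycle is a hypothetical helper.

```python
from typing import Any, Callable, TypeVar

T = TypeVar("T")


def with_manual_lifecycle(client: Any, work: Callable[[Any], T]) -> T:
    """Connect, run work(client), and always close, even if work raises."""
    client.connect()
    try:
        return work(client)
    finally:
        client.close()


# Usage with the real client (requires the SDK and a logged-in user):
#   from dataerai import DataeraiClient
#   client = DataeraiClient(binary_path="/usr/local/bin/dataerai")
#   status = with_manual_lifecycle(client, lambda c: c.auth_status())
```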

Methods

Method                                                                                          Returns         Description
auth_status()                                                                                   AuthStatus      Logged-in user and token expiry
create_project(name, *, description="")                                                         Project         Create a project (+ root collection, owner membership, default allocation); requires write scope
upload(local_path, *, title, owner_type, owner_id, ...)                                         UploadResult    Upload a file; blocks until complete
download(asset_id, dest_dir, ...)                                                               DownloadResult  Download latest asset content; blocks until complete
get_metadata(asset_id)                                                                          AssetMetadata   Retrieve asset metadata
set_metadata(asset_id, **fields)                                                                AssetMetadata   Update metadata fields
create_relationship(from_asset_id, to_asset_id, rel_type=None, *, relationship_type=None, ...)  Relationship    Create a directed provenance link between two assets

create_relationship() arguments

Creates a directed provenance edge from_asset_id → to_asset_id. You need write access to the source and read access to the target. rel_type is a free-form verb describing the source's role, e.g. "analysis_of" or "acquired_with". Notebook integrations can pass the same value with the keyword-only alias relationship_type.

Argument           Type         Description
from_asset_id      str          Source (dependent) asset; the edge starts here (required)
to_asset_id        str          Target (origin) asset; the edge points here (required)
rel_type           str | None   Free-form relationship type, ≤255 chars (required unless relationship_type is provided)
relationship_type  str | None   Keyword-only alias for rel_type
analysis_mode      str | None   One of: non_destructive, altering, destructive, in_situ, ex_situ, invasive, non_invasive
qualifier_note     str | None   Free-text note on the relationship
qualifier_time     str | None   ISO-8601 timestamp
qualifiers         dict | None  JSON-serializable extra qualifiers

# Link a processed result back to the raw data it came from.
client.create_relationship(analysis.asset_id, raw.asset_id, "analysis_of",
                           analysis_mode="non_destructive")
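
The constraints in the table above can be checked locally before a request is sent. The following is a sketch only: resolve_rel_type is a hypothetical helper, the daemon remains the authority on validation, and rejecting conflicting rel_type / relationship_type values is an assumption about sensible behavior rather than documented SDK behavior.

```python
from typing import Optional

# Values listed for analysis_mode in the argument table.
ANALYSIS_MODES = frozenset({
    "non_destructive", "altering", "destructive",
    "in_situ", "ex_situ", "invasive", "non_invasive",
})


def resolve_rel_type(rel_type: Optional[str] = None, *,
                     relationship_type: Optional[str] = None,
                     analysis_mode: Optional[str] = None) -> str:
    """Return the effective relationship type, or raise ValueError."""
    if (rel_type is not None and relationship_type is not None
            and rel_type != relationship_type):
        # Assumption: two different values are treated as a caller error.
        raise ValueError("rel_type and relationship_type disagree")
    effective = rel_type if rel_type is not None else relationship_type
    if not effective:
        raise ValueError("rel_type or relationship_type is required")
    if len(effective) > 255:
        raise ValueError("relationship type must be at most 255 characters")
    if analysis_mode is not None and analysis_mode not in ANALYSIS_MODES:
        raise ValueError(f"unknown analysis_mode: {analysis_mode!r}")
    return effective
```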

upload() keyword arguments

Argument            Type                                    Description
title               str                                     Asset title (required)
owner_type          str                                     "project" or "user" (required)
owner_id            str                                     Owner entity ID (required)
description         str | None                              Free-text description
alias               str | None                              Short identifier
tags                list[str] | None                        Tag list
metadata            dict | None                             Arbitrary key-value metadata
collection_id       str | None                              Collection to add the asset to
chunk_size_mb       int | None                              Override the default 64 MiB chunk size
on_progress         Callable[[ProgressEvent], None] | None  Progress callback
transfer_timeout_s  float                                   Max seconds to wait for completion (default 3600)
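
When picking a chunk_size_mb value, it can help to estimate how many chunks a file will produce. The sketch below assumes fixed-size chunks with a shorter final chunk, which the documentation above does not confirm; estimate_chunk_count is a hypothetical helper and the daemon's real chunking may differ.

```python
from typing import Optional

MIB = 1024 * 1024
DEFAULT_CHUNK_SIZE_MB = 64  # documented default chunk size


def estimate_chunk_count(file_size_bytes: int,
                         chunk_size_mb: Optional[int] = None) -> int:
    """Estimate the number of chunks, assuming fixed-size chunks (an assumption)."""
    size_mb = DEFAULT_CHUNK_SIZE_MB if chunk_size_mb is None else chunk_size_mb
    if size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must not be negative")
    chunk_bytes = size_mb * MIB
    # Integer ceiling division avoids float rounding on very large files.
    return (file_size_bytes + chunk_bytes - 1) // chunk_bytes
```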

Progress events

@dataclass
class ProgressEvent:
    transfer_id: str
    bytes_done: int
    bytes_total: int
    chunk_index: int
    chunk_count: int
    rate_bps: float
    file_index: int
    file_name: str

    @property
    def percent(self) -> float: ...   # 0–100

    @property
    def rate_mbps(self) -> float: ...
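
Printing on every event can flood the terminal during large transfers. The sketch below builds an on_progress callback that reports at most once per percentage step. It relies only on the file_name, percent, and rate_mbps attributes shown above; make_step_reporter is a hypothetical helper, and the demo uses SimpleNamespace stand-ins rather than real ProgressEvent objects.

```python
from types import SimpleNamespace
from typing import Any, Callable


def make_step_reporter(step_percent: float = 10.0,
                       emit: Callable[[str], None] = print) -> Callable[[Any], None]:
    """Build a callback that emits one line each time progress crosses a step."""
    if step_percent <= 0:
        raise ValueError("step_percent must be positive")
    state = {"next": 0.0}  # next percentage threshold that triggers a report

    def on_progress(event: Any) -> None:
        if event.percent >= state["next"]:
            emit(f"{event.file_name}: {event.percent:.0f}% at {event.rate_mbps:.1f} MB/s")
            # Move the threshold past the current value so bursts collapse into one line.
            while state["next"] <= event.percent:
                state["next"] += step_percent

    return on_progress


if __name__ == "__main__":
    report = make_step_reporter(25.0)
    for pct in (0.0, 10.0, 30.0, 60.0, 100.0):
        report(SimpleNamespace(file_name="data.csv", percent=pct, rate_mbps=12.5))
```

The returned function can be passed directly as on_progress=make_step_reporter(10.0) in an upload() call.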

Error types

Exception                   When raised
DaemonError(code, message)  Daemon returned a coded error (see code attribute)
DaemonTimeoutError          Request or transfer exceeded the configured timeout
ConnectionError             Daemon disconnected unexpectedly
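
Timeouts and disconnects are often transient, so idempotent reads such as get_metadata() are reasonable candidates for a retry. The helper below is a generic sketch, not part of the SDK: call_with_retry is a hypothetical name, the exception classes are passed in by the caller, and the import path shown in the usage comment is an assumption. Retrying upload() this way is not advised without knowing whether a partial transfer could create a duplicate asset.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(fn: Callable[[], T],
                    retry_on: Tuple[Type[BaseException], ...],
                    attempts: int = 3,
                    base_delay_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn(), retrying on the given exception types with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error unchanged
            sleep(base_delay_s * (2 ** attempt))  # 1x, 2x, 4x, ...
    raise AssertionError("unreachable")


# Usage with the SDK (import path is an assumption):
#   from dataerai import DaemonTimeoutError
#   meta = call_with_retry(lambda: client.get_metadata(asset_id),
#                          (DaemonTimeoutError, ConnectionError))
```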
