Python SDK for the Dataerai transfer daemon

Dataerai Python SDK

Python client for the Dataerai transfer daemon.
Supports authenticated uploads, downloads, and metadata operations. The daemon is started automatically if it is not already running.

Requirements

  • Python ≥ 3.10
  • The dataerai binary must be installed and on PATH (or pass binary_path explicitly)
  • The user must be logged in via dataerai auth login before calling auth_status() / upload() / download()
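
The first two requirements can be checked up front, before a client is constructed. The following is a minimal sketch using only the standard library; the helper names resolve_dataerai_binary and python_is_supported are hypothetical and not part of the SDK.

```python
import shutil
import sys


def resolve_dataerai_binary(binary_path=None, name="dataerai"):
    """Return a path to the dataerai binary, or raise FileNotFoundError.

    Hypothetical helper: an explicit binary_path wins, otherwise PATH is searched.
    """
    if binary_path is not None:
        return binary_path
    found = shutil.which(name)  # None when the executable is not on PATH
    if found is None:
        raise FileNotFoundError(
            f"{name!r} not found on PATH; pass binary_path explicitly"
        )
    return found


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter meets the documented minimum (Python 3.10)."""
    return tuple(version_info[:2]) >= (3, 10)
```

The resolved path can then be passed as binary_path when constructing DataeraiClient.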

Installation

pip install dataerai-sdk
# or, from source:
pip install -e sdk/python/

Quick start

from dataerai import DataeraiClient

with DataeraiClient(binary_path="/usr/local/bin/dataerai") as client:
    # Check auth
    status = client.auth_status()
    print(f"Logged in as {status.user_email}, token expires {status.expires_at}")

    # Create a project to own your data (also provisions its root collection)
    project = client.create_project("My dataset project", description="Demo")
    print(f"Created project {project.project_id}")

    # Upload a file into it
    result = client.upload(
        "/path/to/data.csv",
        title="My dataset",
        owner_type="project",
        owner_id=project.project_id,
        on_progress=lambda p: print(f"  {p.percent:.0f}%  {p.rate_mbps:.1f} MB/s"),
    )
    print(f"Uploaded  asset_id={result.asset_id}  content_id={result.content_id}")

    # Download it back
    dl = client.download(result.asset_id, dest_dir="/tmp/downloads")
    for f in dl.files:
        print(f"  {f.local_path}  ({f.size:,} bytes)")

    # Read / update metadata
    meta = client.get_metadata(result.asset_id)
    updated = client.set_metadata(result.asset_id, title="My dataset v2", tags=["csv", "demo"])

    # Link a derived result to the source asset that produced it
    relationship = client.create_relationship(
        result.asset_id,
        "source-asset-id",
        relationship_type="derived_from",
        analysis_mode="non_destructive",
        qualifiers={"tool": "pycroscopy"},
    )
    related_id = relationship.related_asset["id"] if relationship.related_asset else "source-asset-id"
    print(f"Linked via {relationship.type} to {related_id}")

API reference

DataeraiClient(*, socket_path, binary_path, auto_start, start_timeout_s, request_timeout_s)

Parameter          Default                                                     Description
socket_path        $DATAERAI_SOCKET or /run/user/<uid>/dataerai-transfer.sock  Unix socket path
binary_path        None                                                        Path to the dataerai binary (required for auto_start)
auto_start         True                                                        Spawn the daemon if the socket does not exist
start_timeout_s    10.0                                                        Seconds to wait for the daemon socket to appear
request_timeout_s  30.0                                                        Per-request timeout in seconds
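
To see which socket the client would look for, the documented socket_path default can be reproduced as below. This is a sketch of the rule stated in the table, not the SDK's own code; the function name default_socket_path is hypothetical, and treating an empty DATAERAI_SOCKET as unset is an assumption.

```python
import os


def default_socket_path(environ=None, uid=None):
    """Mirror the documented default: $DATAERAI_SOCKET, else the per-user runtime path."""
    env = os.environ if environ is None else environ
    override = env.get("DATAERAI_SOCKET")
    if override:  # assumption: an empty value counts as unset
        return override
    if uid is None:
        uid = os.getuid()  # Unix only, consistent with a Unix socket transport
    return f"/run/user/{uid}/dataerai-transfer.sock"
```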

Use as a context manager (with DataeraiClient(...) as client:) for automatic cleanup, or call client.connect() / client.close() manually.
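
When a with block is inconvenient, the manual connect() / close() pair can be wrapped in try/finally so the connection is released even if the work raises. A minimal sketch; run_with_client is a hypothetical helper, and it only relies on the client exposing connect() and close() as described above.

```python
def run_with_client(client, work):
    """Connect, run work(client), and always close afterwards.

    If connect() itself raises, close() is not called.
    """
    client.connect()
    try:
        return work(client)
    finally:
        client.close()
```

For example, run_with_client(client, lambda c: c.auth_status()) returns whatever auth_status() returns.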

Methods

Method                                                                                          Returns         Description
auth_status()                                                                                   AuthStatus      Logged-in user and token expiry
create_project(name, *, description="")                                                         Project         Create a project (+ root collection, owner membership, default allocation); requires write scope
upload(local_path, *, title, owner_type, owner_id, ...)                                         UploadResult    Upload a file; blocks until complete
download(asset_id, dest_dir, ...)                                                               DownloadResult  Download latest asset content; blocks until complete
get_metadata(asset_id)                                                                          AssetMetadata   Retrieve asset metadata
set_metadata(asset_id, **fields)                                                                AssetMetadata   Update metadata fields
create_relationship(from_asset_id, to_asset_id, rel_type=None, *, relationship_type=None, ...)  Relationship    Create a directed provenance link between two assets

create_relationship() arguments

Creates a directed provenance edge from_asset_id → to_asset_id. You need write access to the source and read access to the target. rel_type is a free-form verb describing the source's role, e.g. "analysis_of" or "acquired_with". Notebook integrations can pass the same value with the keyword-only alias relationship_type.

Argument           Type         Description
from_asset_id      str          Source (dependent) asset; the edge starts here (required)
to_asset_id        str          Target (origin) asset; the edge points here (required)
rel_type           str | None   Free-form relationship type, ≤255 chars (required unless relationship_type is provided)
relationship_type  str | None   Keyword-only alias for rel_type
analysis_mode      str | None   non_destructive, altering, destructive, in_situ, ex_situ, invasive, non_invasive
qualifier_note     str | None   Free-text note on the relationship
qualifier_time     str | None   ISO-8601 timestamp
qualifiers         dict | None  JSON-serializable extra qualifiers

# Link a processed result back to the raw data it came from.
client.create_relationship(analysis.asset_id, raw.asset_id, "analysis_of",
                           analysis_mode="non_destructive")
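
The constraints in the table can be checked locally before a request is sent. The sketch below is a hypothetical pre-flight helper, not part of the SDK. It assumes that analysis_mode must be one of the listed values and that passing conflicting rel_type and relationship_type values is an error; the daemon's actual behaviour in those cases is not documented here. Note that datetime.fromisoformat accepts only a subset of ISO-8601 on Python 3.10.

```python
from datetime import datetime, timezone

# Values listed for analysis_mode in the argument table.
ANALYSIS_MODES = frozenset({
    "non_destructive", "altering", "destructive",
    "in_situ", "ex_situ", "invasive", "non_invasive",
})


def check_relationship_args(rel_type=None, *, relationship_type=None,
                            analysis_mode=None, qualifier_time=None):
    """Validate arguments locally and return the resolved relationship type."""
    if (rel_type is not None and relationship_type is not None
            and rel_type != relationship_type):
        raise ValueError("rel_type and relationship_type disagree")  # assumed rule
    resolved = rel_type if rel_type is not None else relationship_type
    if not resolved:
        raise ValueError("rel_type (or relationship_type) is required")
    if len(resolved) > 255:
        raise ValueError("relationship type must be at most 255 characters")
    if analysis_mode is not None and analysis_mode not in ANALYSIS_MODES:
        raise ValueError(f"unknown analysis_mode: {analysis_mode!r}")
    if qualifier_time is not None:
        datetime.fromisoformat(qualifier_time)  # raises ValueError if malformed
    return resolved


def utc_now_iso():
    """Current UTC time as an ISO-8601 string, suitable for qualifier_time."""
    return datetime.now(timezone.utc).isoformat()
```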

upload() keyword arguments

Argument            Type                                    Description
title               str                                     Asset title (required)
owner_type          str                                     "project" or "user" (required)
owner_id            str                                     Owner entity ID (required)
description         str | None                              Free-text description
alias               str | None                              Short identifier
tags                list[str] | None                        Tag list
metadata            dict | None                             Arbitrary key-value metadata
collection_id       str | None                              Collection to add the asset to
chunk_size_mb       int | None                              Override default 64 MiB chunk size
on_progress         Callable[[ProgressEvent], None] | None  Progress callback
transfer_timeout_s  float                                   Max seconds to wait for completion (default 3600)
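
As a rough guide when choosing chunk_size_mb, the number of chunks for a given file size can be estimated as below. This is only arithmetic under stated assumptions: that chunk_size_mb is measured in MiB (as the 64 MiB default suggests) and that a file is split into fixed-size chunks with one final partial chunk. The helper name expected_chunk_count is hypothetical.

```python
MIB = 1024 * 1024  # bytes per MiB


def expected_chunk_count(file_size_bytes, chunk_size_mb=64):
    """Estimate how many chunks a file of the given size would be split into."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if chunk_size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    chunk_bytes = chunk_size_mb * MIB
    # Ceiling division without floating point.
    return -(-file_size_bytes // chunk_bytes)
```

Under these assumptions a 200 MiB file yields 4 chunks at the default size and 13 chunks with chunk_size_mb=16.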

Progress events

@dataclass
class ProgressEvent:
    transfer_id: str
    bytes_done: int
    bytes_total: int
    chunk_index: int
    chunk_count: int
    rate_bps: float
    file_index: int
    file_name: str

    @property
    def percent(self) -> float: ...   # 0–100

    @property
    def rate_mbps(self) -> float: ...
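
Progress callbacks may fire often, so an on_progress handler that prints on every event can be noisy. The sketch below builds a throttled callback that reports only when a file crosses a new percentage step. It reads only attributes listed on ProgressEvent above; make_progress_printer is a hypothetical helper, and tracking progress per file_index is an assumption about multi-file transfers.

```python
def make_progress_printer(step_percent=10.0, write=print):
    """Return an on_progress callback that reports once per percentage step."""
    last_bucket = {}  # file_index -> last reported step for that file

    def on_progress(event):
        bucket = int(event.percent // step_percent)
        if bucket > last_bucket.get(event.file_index, -1):
            last_bucket[event.file_index] = bucket
            write(
                f"{event.file_name}: {event.percent:.0f}% "
                f"({event.bytes_done:,}/{event.bytes_total:,} bytes, "
                f"{event.rate_mbps:.1f} MB/s)"
            )

    return on_progress
```

It can be passed directly, for example on_progress=make_progress_printer(step_percent=25).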

Error types

Exception                   When raised
DaemonError(code, message)  Daemon returned a coded error (see code attribute)
DaemonTimeoutError          Request or transfer exceeded the configured timeout
ConnectionError             Daemon disconnected unexpectedly
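
Timeouts and dropped connections are often transient, so callers may want to retry. Below is a minimal sketch of a retry wrapper with exponential backoff. call_with_retry is a hypothetical helper; because the import location of the SDK's exception classes is not shown above, the exception types to retry on are passed in by the caller. Whether repeating an upload is safe is not stated here, so this is best suited to read-only calls such as download() or get_metadata().

```python
import time


def call_with_retry(fn, *, retry_on, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call fn(), retrying on the given exception type(s) with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
```

A call might look like call_with_retry(lambda: client.get_metadata(asset_id), retry_on=DaemonTimeoutError), once DaemonTimeoutError has been imported from wherever the SDK exposes it.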
