Python SDK for the Dataerai transfer daemon
Dataerai Python SDK
Python client for the Dataerai transfer daemon.
Supports authenticated uploads, downloads, and metadata operations.
The daemon is started automatically if it is not already running.
Requirements
- Python ≥ 3.10
- The `dataerai` binary must be installed and on `PATH` (or pass `binary_path` explicitly)
- The user must be logged in via `dataerai auth login` before calling `auth_status()` / `upload()` / `download()`
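The binary requirement can be checked up front. The following is a minimal sketch that uses only the standard library; the helper name `find_dataerai_binary` is hypothetical and is not part of the SDK.

```python
import shutil
from typing import Optional


def find_dataerai_binary(name: str = "dataerai") -> Optional[str]:
    """Return the absolute path of the binary if it is on PATH, else None.

    Hypothetical helper for illustration; the SDK may locate the binary differently.
    """
    return shutil.which(name)


binary = find_dataerai_binary()
if binary is None:
    print("dataerai not found on PATH; pass binary_path explicitly")
else:
    print(f"Found dataerai at {binary}")
```

If a path is found, it could be passed to the client as `binary_path`.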
Installation
pip install dataerai-sdk
# or, from source:
pip install -e sdk/python/
Quick start
from dataerai import DataeraiClient

with DataeraiClient(binary_path="/usr/local/bin/dataerai") as client:
    # Check auth
    status = client.auth_status()
    print(f"Logged in as {status.user_email}, token expires {status.expires_at}")

    # Create a project to own your data (also provisions its root collection)
    project = client.create_project("My dataset project", description="Demo")
    print(f"Created project {project.project_id}")

    # Upload a file into it
    result = client.upload(
        "/path/to/data.csv",
        title="My dataset",
        owner_type="project",
        owner_id=project.project_id,
        on_progress=lambda p: print(f" {p.percent:.0f}% {p.rate_mbps:.1f} MB/s"),
    )
    print(f"Uploaded asset_id={result.asset_id} content_id={result.content_id}")

    # Download it back
    dl = client.download(result.asset_id, dest_dir="/tmp/downloads")
    for f in dl.files:
        print(f" {f.local_path} ({f.size:,} bytes)")

    # Read / update metadata
    meta = client.get_metadata(result.asset_id)
    updated = client.set_metadata(result.asset_id, title="My dataset v2", tags=["csv", "demo"])

    # Link a derived result to the source asset that produced it
    relationship = client.create_relationship(
        result.asset_id,
        "source-asset-id",
        relationship_type="derived_from",
        analysis_mode="non_destructive",
        qualifiers={"tool": "pycroscopy"},
    )
    related_id = relationship.related_asset["id"] if relationship.related_asset else "source-asset-id"
    print(f"Linked via {relationship.type} to {related_id}")
API reference
DataeraiClient(*, socket_path, binary_path, auto_start, start_timeout_s, request_timeout_s)
| Parameter | Default | Description |
|---|---|---|
| `socket_path` | `$DATAERAI_SOCKET` or `/run/user/<uid>/dataerai-transfer.sock` | Unix socket path |
| `binary_path` | `None` | Path to the `dataerai` binary (required for `auto_start`) |
| `auto_start` | `True` | Spawn the daemon if the socket does not exist |
| `start_timeout_s` | `10.0` | Seconds to wait for the daemon socket to appear |
| `request_timeout_s` | `30.0` | Per-request timeout in seconds |
Use as a context manager (`with DataeraiClient(...) as client:`) for automatic cleanup, or call `client.connect()` / `client.close()` manually.
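To see which socket path the documented default would resolve to on a given machine, the rule can be sketched as below. This mirrors the documented default only; the function name `default_socket_path` is hypothetical, and treating an empty `DATAERAI_SOCKET` as unset is an assumption. The SDK's own resolution logic may differ.

```python
import os


def default_socket_path() -> str:
    """Mirror the documented default: $DATAERAI_SOCKET, else the per-user runtime path.

    Hypothetical helper; an empty environment variable is treated as unset (assumption).
    """
    override = os.environ.get("DATAERAI_SOCKET")
    if override:
        return override
    # os.getuid() is available on Unix, which is where Unix sockets apply.
    return f"/run/user/{os.getuid()}/dataerai-transfer.sock"


print(default_socket_path())
```

The result could be passed explicitly as `socket_path` when the default location is not suitable.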
Methods
| Method | Returns | Description |
|---|---|---|
| `auth_status()` | `AuthStatus` | Logged-in user and token expiry |
| `create_project(name, *, description="")` | `Project` | Create a project (+ root collection, owner membership, default allocation); requires write scope |
| `upload(local_path, *, title, owner_type, owner_id, ...)` | `UploadResult` | Upload a file; blocks until complete |
| `download(asset_id, dest_dir, ...)` | `DownloadResult` | Download latest asset content; blocks until complete |
| `get_metadata(asset_id)` | `AssetMetadata` | Retrieve asset metadata |
| `set_metadata(asset_id, **fields)` | `AssetMetadata` | Update metadata fields |
| `create_relationship(from_asset_id, to_asset_id, rel_type=None, *, relationship_type=None, ...)` | `Relationship` | Create a directed provenance link between two assets |
create_relationship() arguments
Authors a directed provenance edge `from_asset_id` → `to_asset_id`. You need
write access to the source and read access to the target. `rel_type` is a
free-form verb describing the source's role, e.g. `"analysis_of"` or
`"acquired_with"`. Notebook integrations can pass the same value with the
keyword-only alias `relationship_type`.
| Argument | Type | Description |
|---|---|---|
| `from_asset_id` | `str` | Source (dependent) asset; the edge starts here (required) |
| `to_asset_id` | `str` | Target (origin) asset; the edge points here (required) |
| `rel_type` | `str \| None` | Free-form relationship type, ≤255 chars (required unless `relationship_type` is provided) |
| `relationship_type` | `str \| None` | Keyword-only alias for `rel_type` |
| `analysis_mode` | `str \| None` | `non_destructive`, `altering`, `destructive`, `in_situ`, `ex_situ`, `invasive`, `non_invasive` |
| `qualifier_note` | `str \| None` | Free-text note on the relationship |
| `qualifier_time` | `str \| None` | ISO-8601 timestamp |
| `qualifiers` | `dict \| None` | JSON-serializable extra qualifiers |
# Link a processed result back to the raw data it came from.
# `analysis` and `raw` are assumed to be results of earlier upload() calls.
client.create_relationship(analysis.asset_id, raw.asset_id, "analysis_of",
                           analysis_mode="non_destructive")
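The constraints in the table above (type length, allowed `analysis_mode` values, ISO-8601 time, JSON-serializable qualifiers) can be checked before a request is sent. The sketch below is a hypothetical client-side pre-check, not part of the SDK; the helper name `build_relationship_kwargs` and its `when` parameter are assumptions made for illustration, and the daemon presumably performs its own validation.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Allowed values as listed in the arguments table.
ANALYSIS_MODES = {
    "non_destructive", "altering", "destructive",
    "in_situ", "ex_situ", "invasive", "non_invasive",
}


def build_relationship_kwargs(
    rel_type: str,
    analysis_mode: Optional[str] = None,
    when: Optional[datetime] = None,
    qualifiers: Optional[dict] = None,
) -> dict:
    """Hypothetical helper: validate documented constraints and return keyword arguments."""
    if not rel_type or len(rel_type) > 255:
        raise ValueError("rel_type must be 1-255 characters")
    if analysis_mode is not None and analysis_mode not in ANALYSIS_MODES:
        raise ValueError(f"unknown analysis_mode: {analysis_mode!r}")

    kwargs = {"relationship_type": rel_type}
    if analysis_mode is not None:
        kwargs["analysis_mode"] = analysis_mode
    if when is not None:
        # datetime.isoformat() produces an ISO-8601 string.
        kwargs["qualifier_time"] = when.isoformat()
    if qualifiers is not None:
        json.dumps(qualifiers)  # raises TypeError if not JSON-serializable
        kwargs["qualifiers"] = qualifiers
    return kwargs


example = build_relationship_kwargs(
    "analysis_of",
    analysis_mode="non_destructive",
    when=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(example)
```

The returned dictionary could then be unpacked into the call, for example `client.create_relationship(analysis.asset_id, raw.asset_id, **example)`.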
upload() keyword arguments
| Argument | Type | Description |
|---|---|---|
| `title` | `str` | Asset title (required) |
| `owner_type` | `str` | `"project"` or `"user"` (required) |
| `owner_id` | `str` | Owner entity ID (required) |
| `description` | `str \| None` | Free-text description |
| `alias` | `str \| None` | Short identifier |
| `tags` | `list[str] \| None` | Tag list |
| `metadata` | `dict \| None` | Arbitrary key-value metadata |
| `collection_id` | `str \| None` | Collection to add the asset to |
| `chunk_size_mb` | `int \| None` | Override default 64 MiB chunk size |
| `on_progress` | `Callable[[ProgressEvent], None] \| None` | Progress callback |
| `transfer_timeout_s` | `float` | Max seconds to wait for completion (default 3600) |
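When choosing a `chunk_size_mb` value, it can help to estimate how many chunks a file would be split into. The sketch below assumes fixed-size chunks measured in MiB with a shorter final chunk; the daemon's actual chunking strategy is not described here, and `expected_chunk_count` is a hypothetical helper.

```python
DEFAULT_CHUNK_SIZE_MB = 64  # documented default chunk size (MiB)


def expected_chunk_count(file_size_bytes: int, chunk_size_mb: int = DEFAULT_CHUNK_SIZE_MB) -> int:
    """Estimate the chunk count, assuming fixed-size chunks and a shorter last chunk."""
    if file_size_bytes < 0:
        raise ValueError("file_size_bytes must be non-negative")
    if chunk_size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    chunk_bytes = chunk_size_mb * 1024 * 1024
    # Integer ceiling division avoids floating-point rounding on large files.
    return (file_size_bytes + chunk_bytes - 1) // chunk_bytes


one_gib = 1024 ** 3
print(expected_chunk_count(one_gib))                    # 16 chunks at 64 MiB
print(expected_chunk_count(one_gib + 1))                # 17: one extra byte needs another chunk
print(expected_chunk_count(one_gib, chunk_size_mb=16))  # 64 chunks at 16 MiB
```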
Progress events
from dataclasses import dataclass

@dataclass
class ProgressEvent:
    transfer_id: str
    bytes_done: int
    bytes_total: int
    chunk_index: int
    chunk_count: int
    rate_bps: float
    file_index: int
    file_name: str

    @property
    def percent(self) -> float: ...  # 0–100

    @property
    def rate_mbps(self) -> float: ...
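A callback that prints on every event can be noisy for large transfers. The sketch below shows one way to throttle output to fixed percentage steps. It relies only on the `percent`, `file_name`, `bytes_done`, and `bytes_total` attributes listed above; `make_step_logger` is a hypothetical helper, and the events in the demonstration are stand-ins rather than real SDK objects. It does not reset between files, so multi-file transfers may need extra handling.

```python
from types import SimpleNamespace
from typing import Callable


def make_step_logger(step: float = 10.0, emit: Callable[[str], None] = print):
    """Return an on_progress callback that reports at most once per `step` percent."""
    last_bucket = -1

    def on_progress(event) -> None:
        nonlocal last_bucket
        bucket = int(event.percent // step)
        if bucket > last_bucket:
            last_bucket = bucket
            emit(
                f"{event.file_name}: {event.percent:.0f}% "
                f"({event.bytes_done:,}/{event.bytes_total:,} bytes)"
            )

    return on_progress


# Demonstration with stand-in events (real events come from the SDK).
lines = []
callback = make_step_logger(step=50, emit=lines.append)
for done in (0, 10, 50, 60, 100):
    callback(SimpleNamespace(file_name="data.csv", percent=float(done),
                             bytes_done=done, bytes_total=100))
print("\n".join(lines))
```

The returned function could be passed as `on_progress=make_step_logger(step=10)` in an `upload()` call.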
Error types
| Exception | When raised |
|---|---|
| `DaemonError(code, message)` | Daemon returned a coded error (see `code` attribute) |
| `DaemonTimeoutError` | Request or transfer exceeded the configured timeout |
| `ConnectionError` | Daemon disconnected unexpectedly |
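Transient failures such as timeouts are sometimes handled with a bounded retry. The sketch below is a generic helper, not an SDK feature: `call_with_retries` is hypothetical, the exception types to retry are passed in by the caller, and the demonstration uses the built-in `TimeoutError` as a stand-in. Whether a given operation is safe to repeat (an upload, for example) is not stated here, so retries are best limited to read-only calls unless that is confirmed.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    delay_s: float = 1.0,
) -> T:
    """Call fn(), retrying on the given exception types with a fixed delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay_s)
    raise AssertionError("unreachable")


# Demonstration with a stand-in function that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


outcome = call_with_retries(flaky, retry_on=(TimeoutError,), delay_s=0.0)
print(outcome, calls["n"])
```

With the SDK, the same helper might wrap a read-only call such as `lambda: client.get_metadata(asset_id)` with `retry_on=(DaemonTimeoutError,)`, assuming that exception is imported from the package.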