# AmSC Resource Orchestration Client Toolkit
## Description

The American Science Cloud Infrastructure Services Resource Orchestration Toolkit (AmSC-ISRO-Toolkit, abbreviated AmSCROT) provides infrastructure orchestration for AmSC use.
## Installation

```bash
pip install amscrot-py
```
## Operation Instructions

### Credentials

AmSCROT reads provider credentials from `~/.amscrot/credentials.yml`. Each top-level key corresponds to a service type or profile name:
```yaml
esnet-iri:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://iri.es.net/api/v1

nersc-iri:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://api.iri.nersc.gov/api/v1

amsc-iro:
  client_type: AMSC_IRO
  api_key: <token>
  api_endpoint: https://...
```
An example credentials file is provided in `credentials-template.yml`.
### Core Concepts

| Class | Role |
|---|---|
| `Client` | Top-level entry point; owns sessions and service clients |
| `ServiceClient` | Provider-specific driver (Kube/Kueue, ESnet IRI, NERSC IRI, AMSC-IRO) |
| `Session` | Named unit of work; groups jobs and persists state to disk |
| `Job` | A single compute task bound to a `ServiceClient` |
| `JobSpec` | Declares executable, arguments, resources, and provider attributes |
| `DiscoveryResult` | Typed result from `service_client.discover()` |
### Basic Usage

AmSCROT offers three ways to set up service clients: automatic discovery, automatic generation, and manual definition. Each requires credentials in `~/.amscrot/credentials.yml`.
#### Option A: Automatic Endpoint Discovery (recommended)

```python
from amscrot.client.client import Client

client = Client(discover_endpoints=True)
```
With `discover_endpoints=True`, the Client queries the AmSC IRO facility registry and automatically creates an `IriServiceClient` for each discovered facility. Credentials are matched by comparing each facility's `api_endpoint` against your `credentials.yml` profiles that have `client_type: AMSC_IRI`:
```yaml
# ~/.amscrot/credentials.yml
nersc-iri:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://api.iri.nersc.gov

alcf-iri:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://api.alcf.anl.gov
```
Token resolution order per discovered facility:

1. **Credential profile match**: if a profile with `client_type: AMSC_IRI` has a matching `api_endpoint`, that profile is used.
2. **`AMSC_TOKEN` env var**: if no profile matches, the `AMSC_TOKEN` environment variable is used as a fallback.
3. **Skip**: if neither is available, the facility is skipped with a warning.
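A minimal sketch of the fallback path, assuming you rely on `AMSC_TOKEN` rather than per-facility profiles:

```python
import os

from amscrot.client.client import Client

# Assumption: no credentials.yml profile matches the discovered facilities,
# so discovery falls back to the AMSC_TOKEN environment variable for each one.
os.environ["AMSC_TOKEN"] = "<token>"

client = Client(discover_endpoints=True)
print("Discovered:", [sc.name for sc in client.service_clients])
```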
Service clients are named using a slugified version of the facility name (e.g., "Argonne Leadership Computing Facility" → "argonne-leadership-computing-facility"):
```python
# Access auto-discovered clients by name
nersc = client.get_service_client("national-energy-research-scientific-computing-center")
alcf = client.get_service_client("argonne-leadership-computing-facility")

# List all discovered clients
print("Discovered:", [sc.name for sc in client.service_clients])
```
#### Option B: Auto-Create from Credentials File

```python
from amscrot.client.client import Client

client = Client(create_service_clients=True)
```
With `create_service_clients=True`, the Client reads `~/.amscrot/credentials.yml` and creates a `ServiceClient` for every entry that has a `client_type` field. Each credential profile becomes a service client named after its YAML key:
```yaml
# Creates two service clients: "nersc-iri" and "esnet-iri-east"
nersc-iri:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://api.iri.nersc.gov

esnet-iri-east:
  client_type: AMSC_IRI
  api_key: <token>
  api_endpoint: https://iri-dev.ppg.es.net
```

```python
nersc = client.get_service_client("nersc-iri")
east = client.get_service_client("esnet-iri-east")
```
This mode does not contact any external registry; it works entirely from your local credentials file. Entries without `client_type` are skipped with a warning.
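A quick way to confirm what was created, reusing the `service_clients` listing shown earlier:

```python
from amscrot.client.client import Client

client = Client(create_service_clients=True)

# One client per credentials.yml entry that has a client_type;
# names come straight from the YAML keys.
for sc in client.service_clients:
    print(sc.name)  # e.g. "nersc-iri", "esnet-iri-east"
```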
#### Option C: Manual ServiceClient Definition

Use a credential profile from `credentials.yml`:

```python
from amscrot.client.client import Client
from amscrot.serviceclient import ServiceClient
from amscrot.util.constants import Constants

client = Client()

svc = ServiceClient.create(
    type=Constants.ServiceType.AMSC_IRI,
    name="nersc-compute",
    profile="nersc-iri"  # matches credentials.yml section
)
client.add_service_client(svc)
```
Or define credentials entirely in code, with no `credentials.yml` needed:

```python
client = Client()

# Add credentials programmatically
client.add_credential(
    profile="my-nersc",
    client_type="AMSC_IRI",
    api_key="<token>",
    api_endpoint="https://api.iri.nersc.gov"
)

svc = ServiceClient.create(
    type=Constants.ServiceType.AMSC_IRI,
    name="nersc-compute",
    profile="my-nersc"
)
client.add_service_client(svc)
```
With manual setup, you control exactly which service clients exist, their names, and which credential profiles they use. This is useful for testing, notebooks, or when credentials come from environment variables or a secrets manager.
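For example, a sketch that sources the token from an environment variable (the variable name `NERSC_API_KEY` is hypothetical):

```python
import os

# Sketch: pull the API key from the environment instead of credentials.yml.
# NERSC_API_KEY is a hypothetical variable name; substitute whatever your
# secrets tooling provides.
client.add_credential(
    profile="env-nersc",
    client_type="AMSC_IRI",
    api_key=os.environ["NERSC_API_KEY"],
    api_endpoint="https://api.iri.nersc.gov"
)

svc = ServiceClient.create(
    type=Constants.ServiceType.AMSC_IRI,
    name="nersc-compute",
    profile="env-nersc"
)
client.add_service_client(svc)
```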
#### When to Use Each

| | `discover_endpoints` | `create_service_clients` | Manual `ServiceClient.create()` |
|---|---|---|---|
| Setup | One line | One line | Explicit per-client |
| Source | IRO facility registry (remote) | `credentials.yml` (local) | Code |
| Naming | Auto-slugified from registry | YAML key names | You choose |
| Credentials | Matched by `api_endpoint` | Direct from each entry | Explicit `profile=` |
| Best for | Multi-site, dynamic environments | Stable multi-facility setups | Testing, custom configs |
| Requires | IRO registry reachable | `client_type` in credentials | Only the credential profile |
#### 2. Discover Available Resources

```python
result = svc.discover()  # returns DiscoveryResult

# Iterate typed resources
for item in result.by_type("compute"):
    print(item.data["id"], item.data["name"])

# Normalized Facility objects (provider-agnostic)
for facility in result.facilities:
    print(facility.name, [c.cores for c in (facility.compute or [])])
```
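The IDs surfaced here feed the `resource_id` job attribute used in the next step. A sketch, assuming at least one compute resource was discovered:

```python
# Sketch: take the first discovered compute resource's ID for use as the
# "resource_id" job attribute when defining a JobSpec.
compute_items = list(result.by_type("compute"))
resource_id = compute_items[0].data["id"] if compute_items else None
```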
#### 3. Define a Job

```python
from amscrot.client.job import Job, JobSpec, JobType, JobServiceType

spec = JobSpec(
    executable="python",
    arguments=["-c", "print('hello')"],
    resources={
        "node_count": 1,
        "process_count": 1,
        "processes_per_node": 1,
        "cpu_cores_per_process": 1,
        "exclusive_node_use": False,
        "memory": 268435456
    },
    attributes={
        "container": {"image": "python:3.12-slim"},  # provider image
        "resource_id": "<compute-resource-id>"  # from discovery
    }
)

job = Job(
    name="my-job",
    type=JobType.COMPUTE,
    service_type=JobServiceType.BATCH,
    service_client=svc,
    job_spec=spec
)
```
#### 4. Create a Session and Submit

```python
session = client.create_session("my-session")
session.add_job(job)

# Validate (raises PlanError on failure)
session.plan(verbose=True)

# Submit all jobs
session.apply()
```
#### 5. Wait for Completion

```python
from amscrot.client.job import JobState

results = session.wait(
    timeout=300,
    interval=5,
    verbose=True,
)

for job_name, status in results.items():
    print(f"{job_name}: {status.state} message={status.message}")
```
`session.wait()` polls until all jobs reach a terminal state (`COMPLETED`, `FAILED`, or `CANCELED`). Pass `jobs=[job1, job2]` to wait on a subset.
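A sketch of waiting on a single job and branching on its terminal state, assuming the `JobState` members match the names above:

```python
# Sketch: wait on one job only, then check its terminal state
results = session.wait(jobs=[job], timeout=600, interval=10)
status = results["my-job"]
if status.state == JobState.COMPLETED:
    print("job finished cleanly")
else:
    print(f"job ended in {status.state}: {status.message}")
```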
#### 6. Clean Up

```python
session.destroy()  # cancels running jobs and removes session state
```
Sessions are persisted to `~/.amscrot/sessions/<session-name>/` so they survive process restarts. An existing session is restored automatically on `client.create_session(name)`.
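For example, a later process can re-attach to the same session by name (a sketch, assuming the same credentials file is in place):

```python
# Sketch: a fresh process re-attaches to the persisted session by name.
# Assumes the same credentials file (and therefore the same service
# clients) are available in the new process.
client = Client(create_service_clients=True)
session = client.create_session("my-session")  # restored from disk
results = session.wait(timeout=300)
```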
#### 7. Fetch Output Files (IRI providers)

After a job completes, stdout/stderr can be downloaded from the remote filesystem:

```python
# Include stdout/stderr paths in the job spec attributes
spec = JobSpec(
    executable="python",
    arguments=["-c", "print('done')"],
    attributes={
        "resource_id": "<compute-resource-id>",
        "directory": "/path/to/workdir",  # remote working directory
        "stdout_path": "/path/to/workdir/out.log",
        "stderr_path": "/path/to/workdir/err.log",
    }
)

# After session.wait() returns COMPLETED:
fetched = session.fetch_output_files(jobs=[job])
# fetched == {"my-job": {"stdout": "/local/.amscrot/sessions/my-session/files/my-job/stdout.log",
#             "stderr": "/local/.amscrot/sessions/my-session/files/my-job/stderr.log"}}
```
Files are written to `~/.amscrot/sessions/<session-name>/files/<job-name>/` by default. Pass `output_path=` to override.
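For example, to redirect downloads to a directory of your choosing:

```python
# Sketch: override the default download directory with output_path=
fetched = session.fetch_output_files(jobs=[job], output_path="/tmp/my-job-output")
print(fetched["my-job"]["stdout"], fetched["my-job"]["stderr"])
```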
#### 8. Direct Filesystem Access (IRI providers)

ESnet IRI and NERSC IRI service clients expose an `IriFilesystem` interface for direct file operations independent of job submission:

```python
fs = svc.filesystem  # IriFilesystem instance (None if client unavailable)

# Upload a local file to remote storage
fs.upload(storage_resource_id, local_path="/tmp/input.txt", remote_path="/scratch/input.txt")

# Download a remote file
fs.download(storage_resource_id, remote_path="/scratch/out.log", local_path="/tmp/out.log")

# List a remote directory
entries = fs.list(storage_resource_id, remote_path="/scratch/")

# Compute checksum of a remote file
checksum = fs.checksum(storage_resource_id, remote_path="/scratch/data.tar")

# Create a tar archive of a remote directory
fs.compress(storage_resource_id, remote_path="/scratch/results/", archive_path="/scratch/results.tar.gz")
```
`storage_resource_id` is the UUID of a storage resource from `svc.discover()`. For most IRI deployments, the home storage resource is auto-resolved when calling `session.fetch_output_files()`.
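A sketch of resolving that UUID from discovery, under the assumption that storage resources are surfaced under a `"storage"` type key:

```python
# Sketch: resolve a storage resource UUID from discovery.
# The "storage" type key is an assumption; inspect your provider's
# DiscoveryResult to confirm the actual type name.
result = svc.discover()
storage_items = list(result.by_type("storage"))
storage_resource_id = storage_items[0].data["id"] if storage_items else None
```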
### Kubernetes / Kueue Jobs

```python
spec = JobSpec(
    executable="sleep",
    arguments=["30"],
    resources={"requests": {"cpu": "1", "memory": "1Gi"}},
    attributes={
        "container": {"image": "busybox"},
        "namespace": "default",
        "labels": {"kueue.x-k8s.io/queue-name": "compute-queue"},
        "completions": 1,
        "restartPolicy": "Never"
    }
)
```
See `scripts/kube/setup-kueue.sh` to install Kueue and create the required ResourceFlavor, ClusterQueue, LocalQueue, and PriorityClass resources on your cluster.
### Jupyter Notebook Examples

Interactive notebooks are provided under `examples/notebooks/client/`.
| Notebook | Description |
|---|---|
| `amsc_hello_world` | Start here. Walks through installation, credential setup, creating a Client/Session, submitting a job, monitoring with `session.wait()`, and cleanup with `session.destroy()`. |
| `amsc_gpt2_training_job` | Submits a GPT-2 training job to a remote IRI compute resource, polls for completion, and fetches log output. |
| `amsc_iri_multisite` | Demonstrates multi-site job submission across ESnet IRI East and West endpoints using a single session. |
| `amsc_iro_net_xfer` | Uses the AMSC-IRO backend to orchestrate networked data-transfer jobs with L2 network metadata. |
### Apache Airflow Support

The `airflow/` subdirectory provides custom Airflow operators for submitting and monitoring IRI compute jobs as part of larger data pipelines.

#### Operators

- `IriJobSubmitOperator`: plans, submits, and waits for an IRI job. Accepts `service_type`, `profile`, `executable`, `resources`, and `attributes`. Pushes `job_id` to XCom on completion.
- `IriFetchOutputOperator`: downloads stdout/stderr from a completed job to a local directory.
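A minimal sketch of wiring the two operators into a DAG. The import path `amscrot.airflow.operators` and the `local_dir` parameter are assumptions; see `airflow/README.md` for the real signatures.

```python
from datetime import datetime

from airflow import DAG

# Assumed import path; check airflow/README.md for the real module layout.
from amscrot.airflow.operators import IriFetchOutputOperator, IriJobSubmitOperator

with DAG("amscrot_iri_sketch", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    submit = IriJobSubmitOperator(
        task_id="submit_job",
        service_type="AMSC_IRI",
        profile="nersc-iri",
        executable="python",
        resources={"node_count": 1, "process_count": 1},
        attributes={"resource_id": "<compute-resource-id>"},
    )

    # local_dir is a hypothetical parameter name for the download target.
    fetch = IriFetchOutputOperator(
        task_id="fetch_output",
        local_dir="/tmp/airflow-output",
    )

    submit >> fetch
```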
#### Quick Start

```bash
cd airflow/
bash setup_airflow.sh --start  # installs deps, initialises DB, launches standalone Airflow
```
Open http://localhost:8080, configure credentials in `~/.amscrot/credentials.yml`, then trigger the demo DAG (`esnet_iri_example` or `gpt2_training_job`) from the UI or via the REST API:
```bash
curl -X POST http://localhost:8080/api/v2/dags/esnet_iri_example/dagRuns \
  -H "Content-Type: application/json" \
  -u "admin:<password>" -d '{}'
```
See `airflow/README.md` for the full operator reference and configuration options.