
Project description

scitex-hpc


Generic SLURM dispatch for the SciTeX ecosystem — srun / sbatch / sync / poll_job / fetch_result with sane defaults for spartan/sapphire and override knobs for any other cluster.

Login nodes never run compute — every command is wrapped in srun or sbatch via a login-shell SSH so the SLURM module loads correctly.
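The login-shell wrapping described above can be sketched as follows (a minimal illustration; the helper names and exact quoting are assumptions, not the library's internals):

```python
import shlex
import subprocess

def login_shell_command(host: str, slurm_cmd: list[str]) -> list[str]:
    """Build an ssh argv that runs slurm_cmd inside a login shell.

    bash -l sources the cluster's profile scripts (e.g. module setup)
    before srun/sbatch executes, so the SLURM tools are on PATH.
    """
    remote = "bash -l -c " + shlex.quote(shlex.join(slurm_cmd))
    return ["ssh", host, remote]

def run_on_login_node(host: str, slurm_cmd: list[str]) -> int:
    # Returns the remote command's exit code.
    return subprocess.run(login_shell_command(host, slurm_cmd)).returncode
```

Because the remote command is collapsed into a single quoted string, the SLURM invocation survives the extra shell layer intact.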

Install

pip install scitex-hpc

Usage

from scitex_hpc import JobConfig, srun, sbatch, sync, poll_job, fetch_result

cfg = JobConfig(
    project="scitex-dsp",
    command="pip install -e '.[dev]' -q && python -m pytest tests/ -n 16",
    host="spartan",
    partition="sapphire",
    cpus=16,
    time="00:30:00",
    mem="64G",
)

# 1. Sync local sources to the cluster.
sync(cfg)

# 2a. Blocking interactive run via srun.
exit_code = srun(cfg)

# 2b. Async batch submission via sbatch.
job_id = sbatch(cfg)
print(poll_job(cfg, job_id))   # {'state': 'COMPLETED', 'exit_code': '0:0', 'elapsed': '00:01:23'}
fetch_result(cfg, job_id)      # downloads the .out file
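For batch submissions, the poll_job call above lends itself to a simple wait loop. A sketch (the terminal-state set and the injected poll callable are assumptions for illustration):

```python
import time

TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT"}  # assumed set

def wait_for_job(poll, cfg, job_id, interval=15.0, timeout=3600.0):
    """Poll until the job reaches a terminal SLURM state or time out.

    `poll` is a callable with poll_job's signature, injected so the
    loop stays testable without a cluster connection.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        info = poll(cfg, job_id)      # e.g. {'state': 'COMPLETED', ...}
        if info["state"] in TERMINAL_STATES:
            return info
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} not terminal after {timeout}s")
```

Usage would be wait_for_job(poll_job, cfg, job_id) followed by fetch_result(cfg, job_id).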

Reservations (book once, exec many)

For workflows where queue wait dominates iteration time — multi-agent fleets, distributed test runners, jupyter-on-HPC — book a node once and run many short commands inside its allocation:

from scitex_hpc import JobConfig, Reservation

# Book a 7-day allocation
res = Reservation.book(
    JobConfig(
        project="dev-pool",
        host="spartan",
        partition="cascade",
        cpus=8, mem="32G", time="7-0",
    ),
    persistent=True,        # walltime auto-resubmit (Phase 2)
)

# Run many commands inside the SAME allocation — no queue wait
res.exec("hostname")                          # → "spartan-bm022.hpc..."
res.exec(["python", "-m", "unittest", "discover"])
res.exec("tmux new -d -s helper claude --dangerously-skip-permissions")

# Open an interactive shell on the compute node
res.attach(cmd="bash")

# Or look up later by friendly name (state lives in ~/.scitex/hpc/leases/)
res = Reservation.get("dev-pool")
res.release()                                 # scancel + clear state

Equivalent CLI:

scitex-hpc reservations book dev-pool --host spartan --cpus 8 --mem 32G --time 7-0 --persistent
scitex-hpc reservations list
scitex-hpc reservations exec dev-pool 'hostname'
scitex-hpc reservations attach dev-pool
scitex-hpc reservations release dev-pool

Compatible with bastion-only HPC policies. No daemons, no tunnels, no crontab @reboot. Every exec() is a fresh ssh round-trip. SSH ControlMaster pooling on the calling host amortizes the handshake cost.
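ControlMaster pooling is plain OpenSSH client configuration on the calling host; a typical snippet (host name and timeout values illustrative) looks like:

```
Host spartan
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h-%p
    ControlPersist 10m
```

The first ssh opens the master connection; subsequent exec() round-trips multiplex over it until the idle window expires.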

Defaults & overrides

Every JobConfig field has a SCITEX_HPC_* env-var override:

Field        Default   Env override
-----------  --------  ----------------------
host         spartan   SCITEX_HPC_HOST
partition    sapphire  SCITEX_HPC_PARTITION
cpus         16        SCITEX_HPC_CPUS
time         00:20:00  SCITEX_HPC_TIME
mem          128G      SCITEX_HPC_MEM
remote_base  ~/proj    SCITEX_HPC_REMOTE_BASE

Resolution priority: explicit JobConfig field → env var → built-in default.
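That priority chain can be expressed as a small resolver (a hypothetical helper for illustration, not the package's actual code):

```python
import os

# Built-in defaults from the table above.
DEFAULTS = {"host": "spartan", "partition": "sapphire", "cpus": "16",
            "time": "00:20:00", "mem": "128G", "remote_base": "~/proj"}

def resolve(field, explicit=None):
    """Explicit JobConfig field -> SCITEX_HPC_* env var -> built-in default."""
    if explicit is not None:
        return explicit
    env = os.environ.get(f"SCITEX_HPC_{field.upper()}")
    if env is not None:
        return env
    return DEFAULTS[field]
```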

Status

Standalone module of the SciTeX ecosystem. The public API also surfaces as scitex.hpc (via the umbrella package's sys.modules alias), so any consumer can write from scitex.hpc import srun.
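The sys.modules aliasing mentioned above works like this (generic sketch; the scitex_demo and fake_hpc names are invented for illustration):

```python
import sys

def alias_submodule(parent: str, name: str, module) -> None:
    """Expose `module` as `parent.name` by registering it in sys.modules,
    so `from parent.name import x` resolves even though no real
    subpackage directory exists on disk."""
    sys.modules[f"{parent}.{name}"] = module
```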

License

AGPL-3.0-only.

Download files


Source Distribution

scitex_hpc-0.3.0.tar.gz (32.5 kB)


Built Distribution


scitex_hpc-0.3.0-py3-none-any.whl (28.2 kB)


File details

Details for the file scitex_hpc-0.3.0.tar.gz.

File metadata

  • Download URL: scitex_hpc-0.3.0.tar.gz
  • Upload date:
  • Size: 32.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scitex_hpc-0.3.0.tar.gz:
  SHA256       ad4e2ee50ee44e94b7690425eada81634684539294d546663d99d4a512f66169
  MD5          c6886a62c7b6d3772d3f751308b4f96a
  BLAKE2b-256  93e61b750378f54407f462ce7f4e0f2d57957f63ac69a193ac424f59332a6963


Provenance

The following attestation bundles were made for scitex_hpc-0.3.0.tar.gz:

Publisher: publish-pypi.yml on ywatanabe1989/scitex-hpc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file scitex_hpc-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: scitex_hpc-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 28.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scitex_hpc-0.3.0-py3-none-any.whl:
  SHA256       63a78ef3b5dd1ca1d23a5557bb17e1ee48d56f3cb634f7bb0483e36299124ff5
  MD5          dde9f6496108057e435c3eaf936126e6
  BLAKE2b-256  6b8dfb19373a8c6759028de2eb3ce1c5c3a50674735c1c5d3365b190e4cb3721


Provenance

The following attestation bundles were made for scitex_hpc-0.3.0-py3-none-any.whl:

Publisher: publish-pypi.yml on ywatanabe1989/scitex-hpc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
