Quapp HPC library — Slurm integration for Quapp Platform

quapp-hpc

Python library for Quapp HPC functions: the bridge between the Quapp FaaS platform and a Slurm HPC cluster.

Architecture

ksvc (Docker)
├── index.py                  FastAPI server
├── quapp_hpc/
│   ├── factory/
│   │   └── hpc_handler_factory.py   Entry point for users
│   ├── component/backend/
│   │   └── hpc_invocation.py        Orchestrates job lifecycle
│   └── model/
│       ├── provider/slurm_provider.py   Auth headers, base URL
│       └── device/slurm_device.py       Submit → Poll → S3 download
└── function/
    └── handler.py            User-defined processing() + post_processing()

Execution flow

index.py receives an HTTP POST
    → HpcHandlerFactory.create_handler(event, processing_fn, post_processing_fn)
    → InvocationHandler.handle()
    → HpcInvocation.submit_job()
        1. processing_fn(invocation_input)  → bash script string
        2. SlurmDevice._create_job(script)  → POST Slurm REST API → slurm_job_id
        3. SlurmDevice._get_job_result()    → poll every 30s → COMPLETED/FAILED
        4. SlurmDevice._download_s3_result()→ boto3 get s3://$S3_BUCKET/$JOB_UUID/output.json
        5. post_processing_fn(s3_result)    → final response
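
The two user-defined callbacks in function/handler.py can be sketched as follows. This is a minimal illustration, not the library's reference handler: it assumes processing() receives invocation_input as a dict and returns the bash script as a string, and that post_processing() receives the parsed output.json as a dict. The shape of the returned response is hypothetical.

```python
import shlex


def processing(invocation_input: dict) -> str:
    """Build the bash script string that is handed to Slurm.

    Assumes the "job" section follows the invocation_input schema
    (type "script" or "command", optional "environment" mapping).
    """
    job = invocation_input["job"]
    body = job["script"] if job["type"] == "script" else job["command"]

    # Export user-supplied variables, quoting values for the shell.
    exports = [
        f"export {key}={shlex.quote(str(value))}"
        for key, value in job.get("environment", {}).items()
    ]

    # The job is expected to leave its result in /tmp/output.json and
    # upload it to the location the platform downloads from.
    upload = "aws s3 cp /tmp/output.json s3://$S3_BUCKET/$JOB_UUID/output.json"

    lines = ["#!/bin/bash", "set -euo pipefail", *exports, body, upload]
    return "\n".join(lines) + "\n"


def post_processing(s3_result: dict) -> dict:
    """Wrap the downloaded result; this response shape is an assumption."""
    return {"status": "DONE", "result": s3_result}
```

Quoting the environment values with shlex.quote keeps values that contain spaces or shell metacharacters from being reinterpreted when the script runs.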

Environment variables (from the K8s Secret slurm-credentials)

| Variable | Example | Description |
|---|---|---|
| SLURM_API_URL | http://10.1.0.15:6820 | Slurm REST API base URL |
| SLURM_JWT | eyJ... | JWT token for Slurm auth |
| SLURM_USERNAME | quapp-svc | Slurm username |
| SLURM_ACCOUNT | quapp | Slurm account/allocation |
| S3_BUCKET | quapp-slurm-output-dev | S3 bucket for job output |
| AWS_REGION | ap-southeast-1 | AWS region |
| SLURM_POLL_SEC | 30 | Polling interval (seconds) |
| SLURM_TIMEOUT_SEC | 21600 | Maximum wait time (seconds; default 6 h) |
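
As an illustration of how these variables might be consumed, the sketch below reads them into a small settings object. SlurmSettings and load_settings are hypothetical names, not part of quapp_hpc. Treating the first six variables as required and using 30 as the default for SLURM_POLL_SEC are assumptions; only the 6 h default for SLURM_TIMEOUT_SEC is stated above.

```python
import os
from dataclasses import dataclass

# Assumed to be mandatory because the K8s Secret defines all of them.
REQUIRED = (
    "SLURM_API_URL", "SLURM_JWT", "SLURM_USERNAME",
    "SLURM_ACCOUNT", "S3_BUCKET", "AWS_REGION",
)


@dataclass(frozen=True)
class SlurmSettings:
    api_url: str
    jwt: str
    username: str
    account: str
    s3_bucket: str
    aws_region: str
    poll_sec: int = 30         # assumed default
    timeout_sec: int = 21600   # 6 h, the documented default


def load_settings(env=None) -> SlurmSettings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return SlurmSettings(
        api_url=env["SLURM_API_URL"].rstrip("/"),
        jwt=env["SLURM_JWT"],
        username=env["SLURM_USERNAME"],
        account=env["SLURM_ACCOUNT"],
        s3_bucket=env["S3_BUCKET"],
        aws_region=env["AWS_REGION"],
        poll_sec=int(env.get("SLURM_POLL_SEC", "30")),
        timeout_sec=int(env.get("SLURM_TIMEOUT_SEC", "21600")),
    )
```

Failing fast on a missing variable surfaces a misconfigured Secret at startup rather than at the first job submission.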

invocation_input schema

See ../qapp-sdk-templates/slurm-hpc/README.md for details.

Summary:

{
  "resources": { "partition", "nodes", "cpus_per_task", "gpus", "memory_gb", "time_limit" },
  "container": { "type": "sif"|"docker"|"none", "image": "..." },
  "job":       { "type": "script"|"command", "script"|"command": "...", "environment": {}, "input_s3_paths": [] }
}
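
To make the summary concrete, here is a sample payload with a small sanity check. All values (partition name, time_limit format, command) are illustrative placeholders, and validate_invocation_input is a hypothetical helper; the authoritative rules are in the template README referenced above. The check assumes all three sections are required.

```python
ALLOWED_CONTAINER_TYPES = {"sif", "docker", "none"}
ALLOWED_JOB_TYPES = {"script", "command"}


def validate_invocation_input(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload looks sane."""
    errors = [
        f"missing section: {section}"
        for section in ("resources", "container", "job")
        if not isinstance(data.get(section), dict)
    ]
    if errors:
        return errors

    if data["container"].get("type") not in ALLOWED_CONTAINER_TYPES:
        errors.append("container.type must be one of sif, docker, none")

    job = data["job"]
    job_type = job.get("type")
    if job_type not in ALLOWED_JOB_TYPES:
        errors.append("job.type must be script or command")
    elif not job.get(job_type):
        # job.type names the key that carries the payload.
        errors.append(f"job.{job_type} must be a non-empty string")
    return errors


# Illustrative payload; field values are placeholders.
example_input = {
    "resources": {
        "partition": "gpu", "nodes": 1, "cpus_per_task": 4,
        "gpus": 1, "memory_gb": 16, "time_limit": "01:00:00",
    },
    "container": {"type": "none", "image": ""},
    "job": {
        "type": "command",
        "command": "python train.py",
        "environment": {},
        "input_s3_paths": [],
    },
}
```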

Slurm REST API

  • Version: v0.0.40
  • Submit: POST {SLURM_API_URL}/slurm/v0.0.40/job/submit
  • Status: GET {SLURM_API_URL}/slurm/v0.0.40/job/{job_id}
  • Auth headers: X-SLURM-USER-NAME, X-SLURM-USER-TOKEN
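
The endpoints and auth headers listed above can be assembled as in this sketch. The helper names are hypothetical and no HTTP request is sent; the functions only show how the URL and header values are composed from SLURM_API_URL, SLURM_USERNAME, and SLURM_JWT.

```python
API_VERSION = "v0.0.40"


def slurm_headers(username: str, jwt: str) -> dict:
    """Auth headers expected by the Slurm REST API."""
    return {
        "X-SLURM-USER-NAME": username,
        "X-SLURM-USER-TOKEN": jwt,
    }


def submit_url(base_url: str) -> str:
    """URL for POSTing a new job."""
    return f"{base_url.rstrip('/')}/slurm/{API_VERSION}/job/submit"


def status_url(base_url: str, job_id: int) -> str:
    """URL for GETting the state of an existing job."""
    return f"{base_url.rstrip('/')}/slurm/{API_VERSION}/job/{job_id}"
```

For example, status_url("http://10.1.0.15:6820", 42) yields http://10.1.0.15:6820/slurm/v0.0.40/job/42.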

Job state mapping

| Slurm state | Quapp state |
|---|---|
| PENDING, CONFIGURING, RUNNING, COMPLETING | RUNNING |
| COMPLETED | DONE |
| FAILED, CANCELLED, TIMEOUT, NODE_FAIL, PREEMPTED | ERROR |
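
The mapping, together with the polling behaviour controlled by SLURM_POLL_SEC and SLURM_TIMEOUT_SEC, can be sketched as below. This is not the code in slurm_device.py: to_quapp_state and wait_for_job are hypothetical names, get_slurm_state stands in for the status request, and raising on an unlisted state is an assumption, since the table does not say how such states are handled.

```python
import time

RUNNING_STATES = {"PENDING", "CONFIGURING", "RUNNING", "COMPLETING"}
ERROR_STATES = {"FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL", "PREEMPTED"}


def to_quapp_state(slurm_state: str) -> str:
    """Translate a Slurm job state into the Quapp state from the table."""
    state = slurm_state.upper()
    if state == "COMPLETED":
        return "DONE"
    if state in RUNNING_STATES:
        return "RUNNING"
    if state in ERROR_STATES:
        return "ERROR"
    # Assumption: states outside the table are treated as a hard error.
    raise ValueError(f"unmapped Slurm state: {slurm_state!r}")


def wait_for_job(get_slurm_state, poll_sec=30, timeout_sec=21600,
                 sleep=time.sleep, clock=time.monotonic) -> str:
    """Poll until the job leaves RUNNING or the timeout elapses.

    get_slurm_state is any zero-argument callable returning the current
    Slurm state string; sleep and clock are injectable for testing.
    """
    deadline = clock() + timeout_sec
    while True:
        state = to_quapp_state(get_slurm_state())
        if state != "RUNNING":
            return state
        if clock() >= deadline:
            raise TimeoutError(f"job still running after {timeout_sec}s")
        sleep(poll_sec)
```

Passing sleep and clock as parameters lets the loop be exercised in tests without waiting for real time to pass.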

S3 output pattern

The job script must upload its result:

aws s3 cp /tmp/output.json s3://$S3_BUCKET/$JOB_UUID/output.json

$JOB_UUID and $S3_BUCKET are injected automatically by SlurmDevice._create_job() through the Slurm environment array.
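
The injection and the matching download location might look like the following sketch. The helper names are hypothetical, and representing the Slurm environment array as a list of KEY=VALUE strings is an assumption about the request body, not something confirmed by this README.

```python
def job_environment(job_uuid: str, s3_bucket: str, extra=None) -> list[str]:
    """Build the environment array, adding JOB_UUID and S3_BUCKET.

    Assumes entries are "KEY=VALUE" strings. Platform values are written
    last so a user-supplied variable cannot override them.
    """
    env = dict(extra or {})
    env["JOB_UUID"] = job_uuid
    env["S3_BUCKET"] = s3_bucket
    return [f"{key}={value}" for key, value in env.items()]


def output_location(job_uuid: str, s3_bucket: str) -> tuple[str, str]:
    """Return (bucket, key) for the object the job script uploads."""
    return s3_bucket, f"{job_uuid}/output.json"
```

The pair returned by output_location matches the Bucket and Key arguments of the boto3 S3 client's get_object call, which is one way the result could be fetched once the job reports COMPLETED.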

K8s Secret

# infrastructure/quapp-job-scheduler/k8s/cts/slurm-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: slurm-credentials
  namespace: quapp-functions-dev
stringData:
  SLURM_JWT: "<generate: sudo scontrol token username=quapp-svc lifespan=2592000>"
  SLURM_API_URL: "http://10.1.0.15:6820"
  SLURM_USERNAME: "quapp-svc"
  SLURM_ACCOUNT: "quapp"
  S3_BUCKET: "quapp-slurm-output-dev"
  AWS_REGION: "ap-southeast-1"

DB seed required

Run this script before deploying:

infrastructure/quapp-functions-backend/docs/db/seed_slurm_hpc.sql
