hpc-as-api

OpenAI-compatible API gateway for HPC clusters via Globus Compute.

hpc-as-api exposes any vLLM-served model running on an HPC cluster (SLURM, PBS, etc.) as a standard OpenAI-compatible REST API. It handles authentication, rate limiting, payload size management, and real-time token streaming — so your existing OpenAI clients work without modification.

import asyncio

from hpc_as_api.compute import GlobusComputeClient

client = GlobusComputeClient(
    endpoint_id="8d978809-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    models={
        "qwen25-vl-72b": {
            "hf_name": "Qwen/Qwen2.5-VL-72B-Instruct-AWQ",
            "url": "http://ghi2-002:8000",
            "context_reserve_output": 4096,
        }
    },
)


async def main():
    # submit_inference is awaited, so it has to run inside an event loop
    result = await client.submit_inference(
        messages=[{"role": "user", "content": "Explain quantum entanglement."}],
        model="qwen25-vl-72b",
    )
    print(result)


asyncio.run(main())

Why

HPC clusters run some of the largest open-source LLMs (72B+ parameters) on GPU hardware that typical cloud users can't afford. But HPC infrastructure has no standard API surface: each cluster has its own SLURM scripts, SSH tunnels, and authentication systems. hpc-as-api provides a uniform OpenAI-compatible interface over any vLLM-served model, using Globus Compute for authentication and job dispatch, so no inbound ports need to be opened on the HPC side.

Architecture

Your App / OpenAI Client
        │  POST /v1/chat/completions
        ▼
  hpc-as-api (FastAPI)
        │  Globus Compute (AMQP — no HPC firewall holes)
        ▼
  HPC Cluster (SLURM)
        │  vLLM HTTP API (internal LAN)
        ▼
  GPU Compute Node
        │  tokens flow via WebSocket relay (streamrelay)
        ▼
  hpc-as-api → SSE stream → Your App

Key design points:

  • No open ports on HPC: Globus Compute is outbound-only from the cluster
  • Real-time streaming: Tokens stream back through the streamrelay WebSocket relay
  • E2E encryption: Optional AES-256-GCM encryption between HPC and consumer (relay sees only ciphertext)
  • OpenAI-compatible: Drop-in for any client using the OpenAI SDK
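
The final hop in the diagram, where relayed tokens are re-emitted to the caller as a server-sent event (SSE) stream, can be sketched roughly as follows. This only illustrates the OpenAI streaming chunk format that compatible clients expect; it is not taken from the hpc-as-api source. The function name `sse_events` and the default `completion_id` are hypothetical.

```python
import json
from typing import Iterable, Iterator


def sse_events(
    tokens: Iterable[str],
    model: str,
    completion_id: str = "chatcmpl-example",  # hypothetical placeholder ID
) -> Iterator[str]:
    """Yield OpenAI-style chat.completion.chunk objects framed as SSE events."""

    def frame(delta: dict, finish_reason) -> str:
        chunk = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {"index": 0, "delta": delta, "finish_reason": finish_reason}
            ],
        }
        # Each SSE event is a "data:" line followed by a blank line
        return f"data: {json.dumps(chunk)}\n\n"

    for token in tokens:
        yield frame({"content": token}, None)

    # Closing chunk with an empty delta, then the stream terminator
    yield frame({}, "stop")
    yield "data: [DONE]\n\n"


events = list(sse_events(["Hel", "lo"], model="qwen25-vl-72b"))
```

In a FastAPI app, a generator like this would typically be wrapped in a `StreamingResponse` with media type `text/event-stream`.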

Installation

# Base package (no Globus SDK)
pip install hpc-as-api

# With Globus Compute support
pip install "hpc-as-api[globus]"

Quickstart: Run as a service

Set environment variables and start:

export GLOBUS_COMPUTE_ENDPOINT_ID="your-endpoint-uuid"
export HPC_MODELS='{"qwen25-vl-72b": {"hf_name": "Qwen/Qwen2.5-VL-72B-Instruct-AWQ", "url": "http://ghi2-002:8000", "context_reserve_output": 4096}}'
export RELAY_URL="wss://relay.example.com"
export RELAY_SECRET="your-relay-secret"

uvicorn hpc_as_api.app:app --host 0.0.0.0 --port 8001

The gateway is now reachable at http://localhost:8001/v1/chat/completions with the standard OpenAI API schema.
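
As a rough illustration of what a request to that endpoint looks like, the sketch below builds a chat completion request with only the standard library. The helper name `build_chat_request` is hypothetical. Sending credentials as an `Authorization: Bearer` header is an assumption based on how OpenAI clients normally authenticate; check the gateway's auth configuration before relying on it.

```python
import json
import urllib.request


def build_chat_request(base_url, model, messages, bearer_token=None, stream=False):
    """Build (but do not send) a POST request in the OpenAI chat schema."""
    payload = {"model": model, "messages": messages, "stream": stream}
    headers = {"Content-Type": "application/json"}
    if bearer_token:
        # Assumption: the gateway reads credentials from a Bearer header
        headers["Authorization"] = f"Bearer {bearer_token}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8001",
    "qwen25-vl-72b",
    [{"role": "user", "content": "Explain quantum entanglement."}],
)

# To send it against a running gateway:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Clients built on the OpenAI SDK should be able to do the same by setting their base URL to `http://localhost:8001/v1`.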

Embed in an existing FastAPI app

from fastapi import FastAPI
from hpc_as_api.app import router

app = FastAPI()
app.include_router(router, prefix="/hpc")

Configuration reference

| Variable | Default | Description |
|---|---|---|
| GLOBUS_COMPUTE_ENDPOINT_ID | | Globus endpoint UUID for the HPC cluster |
| HPC_MODELS | {} | JSON dict: model name → HPC config |
| RELAY_URL | | WebSocket relay URL for token streaming |
| RELAY_SECRET | | Shared secret for relay auth |
| RELAY_ENCRYPTION_KEY | | AES-256 hex key for E2E encryption |
| USE_GLOBUS_COMPUTE | true | false to route directly via SSH tunnel |
| LAKESHORE_VLLM_ENDPOINT | http://localhost:8000 | Direct vLLM URL (SSH mode) |
| HPC_PROXY_HOST | 0.0.0.0 | Bind host |
| HPC_PROXY_PORT | 8001 | Bind port |
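
To make the defaults concrete, here is a minimal sketch of reading these variables from the environment. It is not the package's actual loader: the function name `load_settings`, the returned key names, and the boolean parsing rule are all assumptions. The length check relies only on the fact that an AES-256 key is 32 bytes, which is 64 hex characters.

```python
import json
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the documented variables, falling back to the documented defaults."""
    key_hex = env.get("RELAY_ENCRYPTION_KEY")
    if key_hex is not None and len(bytes.fromhex(key_hex)) != 32:
        raise ValueError("RELAY_ENCRYPTION_KEY must be 64 hex characters (32 bytes)")

    return {
        "endpoint_id": env.get("GLOBUS_COMPUTE_ENDPOINT_ID"),
        "models": json.loads(env.get("HPC_MODELS", "{}")),
        "relay_url": env.get("RELAY_URL"),
        "relay_secret": env.get("RELAY_SECRET"),
        "relay_encryption_key": key_hex,
        # Assumption: only the literal string "true" (any case) enables Globus mode
        "use_globus_compute": env.get("USE_GLOBUS_COMPUTE", "true").strip().lower() == "true",
        "vllm_endpoint": env.get("LAKESHORE_VLLM_ENDPOINT", "http://localhost:8000"),
        "host": env.get("HPC_PROXY_HOST", "0.0.0.0"),
        "port": int(env.get("HPC_PROXY_PORT", "8001")),
    }


defaults = load_settings({})
```

A suitable key for `RELAY_ENCRYPTION_KEY` can be generated with `python -c "import secrets; print(secrets.token_hex(32))"`.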

HPC_MODELS schema

{
  "my-model-name": {
    "hf_name": "org/ModelName",
    "url": "http://compute-node:8000",
    "context_reserve_output": 4096
  }
}
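
A malformed `HPC_MODELS` value is easier to catch before the service starts. The sketch below checks a JSON string against the schema shown above. It assumes all three fields are required and have the types shown in the example; the README does not state this, and the name `validate_models` is hypothetical.

```python
import json

# Assumption: every model entry needs these fields with these types
REQUIRED_FIELDS = {"hf_name": str, "url": str, "context_reserve_output": int}


def validate_models(raw: str) -> dict:
    """Parse an HPC_MODELS JSON string and check each entry's fields."""
    models = json.loads(raw)
    if not isinstance(models, dict):
        raise ValueError("HPC_MODELS must be a JSON object")

    for name, config in models.items():
        if not isinstance(config, dict):
            raise ValueError(f"{name}: config must be a JSON object")
        for field, expected in REQUIRED_FIELDS.items():
            if field not in config:
                raise ValueError(f"{name}: missing field {field!r}")
            value = config[field]
            # bool is a subclass of int in Python, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, expected):
                raise ValueError(f"{name}: {field!r} must be {expected.__name__}")
    return models


models = validate_models(
    '{"my-model-name": {"hf_name": "org/ModelName", '
    '"url": "http://compute-node:8000", "context_reserve_output": 4096}}'
)
```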

Authentication

The gateway supports two auth modes (configured in hpc_as_api/auth.py):

  • Globus token: Bearer token from Globus Auth, validated via introspection
  • API key: Static key from HPC_API_KEYS env var (comma-separated)
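
For the API key mode, the check amounts to splitting the comma-separated `HPC_API_KEYS` value and comparing the presented key against each entry. The sketch below shows one way to do that with a constant-time comparison. It is not the code in hpc_as_api/auth.py, and both function names are hypothetical.

```python
import hmac


def parse_api_keys(raw: str) -> list[str]:
    """Split a comma-separated HPC_API_KEYS value, ignoring blanks and spaces."""
    return [key.strip() for key in raw.split(",") if key.strip()]


def is_valid_api_key(presented: str, allowed: list[str]) -> bool:
    """Return True if the presented key matches any allowed key."""
    presented_bytes = presented.encode("utf-8")
    valid = False
    for key in allowed:
        # compare_digest avoids leaking match position through timing
        if hmac.compare_digest(presented_bytes, key.encode("utf-8")):
            valid = True
    return valid


allowed_keys = parse_api_keys("key-one, key-two,,")
```

The loop deliberately checks every key rather than returning early, so the running time does not depend on which key matched.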

Development

git clone https://github.com/uicacer/hpc-as-api
cd hpc-as-api
uv sync --extra dev
uv run pytest

Related

  • streamrelay — WebSocket relay for real-time token streaming from Globus Compute
  • STREAM — Full tiered LLM routing system that uses hpc-as-api

License

Apache 2.0 — see LICENSE.

Citation

If you use hpc-as-api in research, please cite:

@software{nassar2025hpcgateway,
  author = {Nassar, Anas},
  title  = {hpc-as-api: OpenAI-compatible API gateway for HPC clusters via Globus Compute},
  year   = {2025},
  url    = {https://github.com/uicacer/hpc-as-api}
}
