Official Python WebSocket client for Triton Client Manager.
tcm-client (Python SDK for Triton Client Manager)
This package provides a small, official Python SDK for talking to the Triton Client Manager WebSocket API.
It wraps the /ws endpoint with a high-level client (TcmWebSocketClient)
and helpers like quickstart_queue_stats so you can integrate without
vendoring code from the server repository.
Installation
Install from TestPyPI (preferred index for this SDK at the moment):
    python3 -m venv .venv
    source .venv/bin/activate
    python -m pip install --upgrade pip
    python -m pip install \
      --index-url https://test.pypi.org/simple/ \
      --extra-index-url https://pypi.org/simple \
      tcm-client
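To confirm that the install landed in the active virtual environment, a quick check with the standard library can help. The sketch below is not part of the SDK; the helper name `installed_version` is hypothetical, and it only assumes that the distribution name is `tcm-client`, as used in the install command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by tcm-client.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version("tcm-client"))
```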
Quickstart
    import asyncio

    from tcm_client import AuthContext, TcmWebSocketClient


    async def main() -> None:
        uri = "ws://127.0.0.1:8000/ws"
        ctx = AuthContext(
            uuid="sdk-quickstart-client",
            token="opaque-or-jwt-token",
            sub="user-sdk",
            tenant_id="tenant-sdk",
            roles=["inference", "management"],
        )
        async with TcmWebSocketClient(uri, ctx) as client:
            await client.auth()
            info = await client.info_queue_stats()
            print(info)


    if __name__ == "__main__":
        asyncio.run(main())
API overview
TcmWebSocketClient provides a small set of focused methods that mirror the
main WebSocket flows:
| Method | Description |
|---|---|
| `auth()` | Sends the initial auth message (with token + client block when provided) and expects an auth.ok response. |
| `info_queue_stats()` | Sends an info message with action: "queue_stats" and returns the full info_response. |
| `management_creation(action="creation", **kwargs)` | Sends a management message; you pass the action and any OpenStack/Docker/MinIO fields via kwargs. |
| `inference_http(vm_id, container_id, model_name, inputs)` | Sends an HTTP inference request routed to a specific Triton server (vm_id + container_id) with typed inputs entries. |
The low-level JSON contracts for these flows are documented in
docs/WEBSOCKET_API.md and docs/API_CONTRACTS.md.
Examples
Management – creation flow
    import asyncio

    from tcm_client import AuthContext, TcmWebSocketClient


    async def main() -> None:
        uri = "ws://127.0.0.1:8000/ws"
        ctx = AuthContext(
            uuid="sdk-management-client",
            token="opaque-or-jwt-token",
            sub="user-management",
            tenant_id="tenant-mgmt",
            roles=["management"],
        )
        async with TcmWebSocketClient(uri, ctx) as client:
            await client.auth()
            resp = await client.management_creation(
                action="creation",
                openstack={
                    "vm_name": "demo-vm",
                    "image": "ubuntu-22.04",
                    "flavor": "m1.medium",
                },
                docker={
                    "image": "nvcr.io/nvidia/tritonserver:23.08-py3",
                    "command": "tritonserver --model-repository=/models",
                },
                minio={
                    "bucket": "models",
                    "prefix": "example-model/",
                },
            )
            payload = resp.get("payload", {})
            if payload.get("status") is True:
                print("Management creation OK:", payload.get("data"))
            else:
                print("Management creation FAILED:", payload.get("data"))


    if __name__ == "__main__":
        asyncio.run(main())
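The status check at the end of the example above tends to repeat across management calls, so it can be factored into a small helper. This is a minimal sketch, not part of the SDK: `ManagementError` and `unwrap_management` are hypothetical names, and the only assumption about the response is the shape the example already relies on (a `payload` mapping whose `status` is `True` on success, with details under `data`).

```python
from typing import Any, Mapping


class ManagementError(RuntimeError):
    """Hypothetical exception type for illustration; not defined by tcm-client."""


def unwrap_management(resp: Mapping[str, Any]) -> Any:
    """Return payload["data"] on success, raise ManagementError otherwise.

    Assumes the response shape used in the example: payload.status is True
    on success and payload.data carries the result or the failure details.
    """
    payload = resp.get("payload") or {}
    if payload.get("status") is True:
        return payload.get("data")
    raise ManagementError(f"management request failed: {payload.get('data')!r}")


# Usage with a hand-written response dict standing in for a real server reply.
print(unwrap_management({"payload": {"status": True, "data": {"vm_name": "demo-vm"}}}))
```

Raising on failure keeps the calling code linear, and the caller can still catch the exception where a retry or cleanup makes sense.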
Management – deletion flow (flat payload)
Deletion accepts both nested (openstack.vm_id, docker.container_id) and flat fields. This example shows the flat form:
    import asyncio

    from tcm_client import AuthContext, TcmWebSocketClient


    async def main() -> None:
        uri = "ws://127.0.0.1:8000/ws"
        ctx = AuthContext(
            uuid="sdk-deletion-client",
            token="opaque-or-jwt-token",
            sub="user-deletion",
            tenant_id="tenant-mgmt",
            roles=["management"],
        )
        async with TcmWebSocketClient(uri, ctx) as client:
            await client.auth()
            resp = await client.management_creation(
                action="deletion",
                vm_id="openstack-vm-uuid",
                container_id="docker-container-id",
                vm_ip="10.0.0.10",
            )
            payload = resp.get("payload", {})
            if payload.get("status") is True:
                print("Management deletion OK:", payload.get("data"))
            else:
                print("Management deletion FAILED:", payload.get("data"))


    if __name__ == "__main__":
        asyncio.run(main())
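If your integration already holds the identifiers in the nested form, it may be convenient to convert them to the flat form before calling the client. The sketch below is illustrative only: `flatten_deletion_fields` is a hypothetical helper, and it handles just the two nested keys named above (`openstack.vm_id` and `docker.container_id`); any other top-level fields are passed through unchanged.

```python
from typing import Any, Dict, Mapping


def flatten_deletion_fields(fields: Mapping[str, Any]) -> Dict[str, Any]:
    """Convert nested deletion fields to the flat form.

    Hypothetical helper, not part of tcm-client. Only openstack.vm_id and
    docker.container_id are lifted; explicit flat values take precedence.
    """
    # Start from every field that is already flat.
    flat = {k: v for k, v in fields.items() if k not in ("openstack", "docker")}

    vm_id = (fields.get("openstack") or {}).get("vm_id")
    container_id = (fields.get("docker") or {}).get("container_id")

    # setdefault keeps a flat value if the caller supplied both forms.
    if vm_id is not None:
        flat.setdefault("vm_id", vm_id)
    if container_id is not None:
        flat.setdefault("container_id", container_id)
    return flat


nested = {
    "openstack": {"vm_id": "openstack-vm-uuid"},
    "docker": {"container_id": "docker-container-id"},
    "vm_ip": "10.0.0.10",
}
print(flatten_deletion_fields(nested))
```

The resulting dict could then be expanded into the call, for example `management_creation(action="deletion", **flat)`.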
Inference – HTTP flow
    import asyncio

    from tcm_client import AuthContext, TcmWebSocketClient


    async def main() -> None:
        uri = "ws://127.0.0.1:8000/ws"
        ctx = AuthContext(
            uuid="sdk-inference-client",
            token="opaque-or-jwt-token",
            sub="user-inference",
            tenant_id="tenant-inf",
            roles=["inference"],
        )
        async with TcmWebSocketClient(uri, ctx) as client:
            await client.auth()
            inputs = [
                {"name": "input_0", "type": "TYPE_FP32", "dims": 4, "value": [1.0, 2.0, 3.0, 4.0]},
            ]
            resp = await client.inference_http(
                vm_id="openstack-vm-uuid",
                container_id="docker-container-id",
                model_name="example-model",
                inputs=inputs,
            )
            payload = resp.get("payload", {})
            status = payload.get("status")
            if status == "COMPLETED":
                print("Inference result:", payload.get("data"))
            else:
                # Typical error handling path: log message, maybe retry or reconnect.
                print("Inference FAILED:", payload.get("data"))


    if __name__ == "__main__":
        asyncio.run(main())
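Building the `inputs` entries by hand is easy to get wrong when `dims` and `value` drift apart. The following sketch shows one way to keep them consistent for the one-dimensional FP32 case used above. `make_fp32_input` is a hypothetical helper, and it assumes, based only on the example, that `dims` is the element count of a flat `value` list; check docs/WEBSOCKET_API.md for the authoritative contract.

```python
from typing import Any, Dict, Iterable


def make_fp32_input(name: str, values: Iterable[float]) -> Dict[str, Any]:
    """Build one inputs entry for a 1-D FP32 tensor.

    Hypothetical helper, not part of tcm-client. Assumes dims equals the
    number of elements, mirroring the example entry in this document.
    """
    data = [float(v) for v in values]
    if not data:
        raise ValueError("an input entry needs at least one value")
    return {"name": name, "type": "TYPE_FP32", "dims": len(data), "value": data}


inputs = [make_fp32_input("input_0", [1, 2, 3, 4])]
print(inputs)
```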
Error handling and reconnect guidance
- The server can respond with {"type": "error", "payload": {"message": "..."}} on protocol and policy failures.
- TcmWebSocketClient raises RuntimeError when a response doesn't match the expected flow (for example, auth() not returning auth.ok).
Recommended pattern:
    try:
        async with TcmWebSocketClient(uri, ctx) as client:
            await client.auth()
            info = await client.info_queue_stats()
    except Exception as exc:
        # Log and reconnect with backoff in your integration.
        print("SDK call failed:", exc)
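The "reconnect with backoff" step mentioned in the comment above is left to the integration. One possible shape for it is a generic retry wrapper with exponential backoff and jitter, sketched below. Nothing here comes from the SDK: `retry_with_backoff` and its parameters are hypothetical, and the `flaky` coroutine merely stands in for a function that opens a fresh TcmWebSocketClient, authenticates, and performs a call.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    operation: Callable[[], Awaitable[T]],
    *,
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
) -> T:
    """Run operation(), retrying on any Exception with exponential backoff.

    Hypothetical helper, not part of tcm-client. The last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            # Delay doubles each attempt, is capped, and gets random jitter.
            delay = min(max_delay, base_delay * (2 ** attempt))
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Stand-in for a real SDK call: fails twice, then succeeds.
calls = {"count": 0}


async def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("simulated connection failure")
    return "ok"


result = asyncio.run(retry_with_backoff(flaky, attempts=5, base_delay=0.001))
print(result, "after", calls["count"], "attempts")
```

In a real integration, the operation passed in would typically open a new connection on each attempt, since a failed WebSocket session is usually not reusable. Retrying on every Exception is a simplification; policy errors such as a rejected token are unlikely to succeed on retry and are better surfaced immediately.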
For full API contract details (message format, types, and examples), see the documentation in the main repository:
- WebSocket contract: docs/WEBSOCKET_API.md
- Architecture and runtime: docs/ARCHITECTURE.md
File details
Details for the file tcm_client-0.1.0.tar.gz.
File metadata
- Download URL: tcm_client-0.1.0.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 056f871541cdd1af9d3a5c756a4c9902fc0b0f8c2670f9c86cee010c027f2216 |
| MD5 | a17f2c204ffb164c5ab7a6ac19b6d4d3 |
| BLAKE2b-256 | 30888846a7db8366ecae2ea15cfd3813d794ff8fde122680492c482dfb332b06 |
File details
Details for the file tcm_client-0.1.0-py3-none-any.whl.
File metadata
- Download URL: tcm_client-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cd406c5f78dda439e73e329ff9140ccab7358aa3dad12fd339799ffd279daa08 |
| MD5 | eec0030870f1e37e4964dbfb74960d7b |
| BLAKE2b-256 | 86d148b3bf05ccbddfc5041ecd89541490feb613603c8e216f6f060479639cca |