Skip to main content

Python Kubernetes DRA (Dynamic Resource Allocation) plugins

Project description

pydra

pydra is a high-performance, resilient Python framework for building Kubernetes Dynamic Resource Allocation (DRA) hardware drivers.

By handling the intricate, low-level Kubernetes gRPC node plumbing natively in Python, pydra eliminates the need for hardware vendors to maintain complex Go codebases or fragile Cgo wrappers just to expose their chips to the cluster control plane.

Overview

Traditional Kubernetes device plugins require Go. However, the AI hardware ecosystem—encompassing PJRT, OpenXLA, PyTorch, JAX, and vendor monitoring tools—is natively Python-centric. pydra bridges this gap, allowing infrastructure engineers to write production-grade, topology-aware scheduling drivers utilizing the exact same Python SDKs running the AI workloads.

Architecture: Microkernel Design

pydra enforces a strict separation between Kubernetes protocol mechanics and raw silicon management.

[ Kubernetes Kubelet ]
             |
             | (gRPC over Unix Domain Socket)
             v
+-------------------------------------------------------+
|               pydra-core (The Library)                |
|                                                       |
|  - UDS gRPC Server Engine    - Unix Signal Handling   |
|  - Kubelet Plugin Registry   - Retries & Backoffs     |
|  - CDI Spec Validator        - Robust Error Boundary  |
+-------------------------------------------------------+
            |
            | (Python Abstract Base Class / Inheritance)
            v
+-------------------------------------------------------+
|            Hardware Drivers (Independent)             |
|                                                       |
|   pydra-tpu          pydra-nvidia         pydra-amd   |
|  (Imports JAX/SDK)  (Imports NVML)      (Imports SMI) |
+-------------------------------------------------------+

1. pydra-core

The engine of the framework. It operates completely agnostic of specific hardware types.

  • Resilient UDS Server: Manages connection lifecycles, socket cleanups on termination, and maps incoming Kubelet DRA requests into structured Python primitives.
  • Exception Shielding: If a hardware vendor's underlying C-library throws a segmentation fault or an unhandled exception during allocation, pydra-core catches it, emits a high-fidelity diagnostic trace, and reports a clear TerminalError back to the Kubelet to prevent hung pods.
  • CDI Generator: Provides a fluid API to assemble and validate Container Device Interface (CDI) v1.1.0 specs before writing them to the node.

2. pydra-plugins

Lean, independent packages that inherit from the core.

  • Deep Telemetry: Queries the physical hardware directly via native SDKs (libtpu.sdk, pynvml, etc.) to expose HBM memory capacity, link errors, and real-time topology layout back to the scheduler via ResourceSlices.
  • Custom Slicing Logic: Translates generic user scheduling requests into exact hardware configurations (e.g., configuring an NVIDIA MIG profile or partitioning a TPU v5e mesh topology).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kubernetes_pydra-0.1.0.tar.gz (16.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kubernetes_pydra-0.1.0-py3-none-any.whl (20.2 kB view details)

Uploaded Python 3

File details

Details for the file kubernetes_pydra-0.1.0.tar.gz.

File metadata

  • Download URL: kubernetes_pydra-0.1.0.tar.gz
  • Upload date:
  • Size: 16.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kubernetes_pydra-0.1.0.tar.gz
Algorithm Hash digest
SHA256 082d672f3999bf19814f4da0d5183db39c07ae2bf38ed9958a72825d230e6a88
MD5 a4530b43d680cc93459791d9f5225e76
BLAKE2b-256 06742c46e0e50938fbea7e715c2d4fda18e5f50ed7937ca9c348653ba08fad8e

See more details on using hashes here.

Provenance

The following attestation bundles were made for kubernetes_pydra-0.1.0.tar.gz:

Publisher: publish.yaml on aojea/pydra

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kubernetes_pydra-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for kubernetes_pydra-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 14f07f2b0a89939bc801e2efdbee7f86d2c7224d14a1fa31768206fba96feb73
MD5 c0f152234ce708e008f3864dd8ec366e
BLAKE2b-256 53c672fae92fe67394c31e2de4dc8a004a80e1b3d785cdb623e988bc2c22f057

See more details on using hashes here.

Provenance

The following attestation bundles were made for kubernetes_pydra-0.1.0-py3-none-any.whl:

Publisher: publish.yaml on aojea/pydra

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page