A lightweight GPU runtime for Python with NVRTC JIT compilation and NumPy-like API

Maintainers

m96-chan

PyGPUkit — Lightweight GPU Runtime for Python

A minimal, modular GPU runtime with NVRTC JIT compilation, GPU scheduling, and a clean NumPy-like API.

🚀 Overview

PyGPUkit is a lightweight GPU runtime for Python that provides:

NVRTC-based JIT kernel compilation
A NumPy-like GPUArray type
A Kubernetes-inspired GPU scheduler (bandwidth and memory guarantees)
An extensible operator set (add/mul/matmul, custom kernels)
Minimal dependencies and an embeddable runtime

PyGPUkit aims to be the “micro-runtime for GPU computing”: small, fast, and ideal for research, inference tooling, DSP, and real-time systems.

✨ Features

⚡ Lightweight — no PyTorch/CuPy overhead
🧩 Modular — runtime / memory / scheduler / JIT / ops
📦 GPUArray with NumPy interop
🛠 NVRTC JIT for CUDA kernels
🎼 Advanced Scheduler with memory & bandwidth guarantees
🔌 Optional Triton backend (planned)
🧪 Test-friendly runtime

🔧 Installation

From PyPI:

pip install pygpukit

From source:

git clone https://github.com/m96-chan/PyGPUkit
cd PyGPUkit
pip install -e .

Requirements:

Python 3.9+
CUDA 11+
NVRTC available
NVIDIA GPU

🧭 Project Goals

Provide the smallest usable GPU runtime for Python
Expose GPU scheduling (bandwidth, memory, partitioning)
Make writing custom GPU kernels easy
Serve as a building block for inference engines, DSP systems, and real-time workloads

📚 Usage Examples

Allocate Arrays

import pygpukit as gp

x = gp.zeros((1024, 1024), dtype="float32")
y = gp.ones((1024, 1024), dtype="float32")

Basic Operations

z = gp.add(x, y)
w = gp.matmul(x, y)

CPU ↔ GPU Transfer

arr = z.to_numpy()
garr = gp.from_numpy(arr)

Custom NVRTC Kernel

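# CUDA C source for the kernel, held in a Python string so it can be passed to NVRTC.
src = r"""
extern "C" __global__
void scale(float* x, float factor, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) x[idx] *= factor;
}
"""

# Compile the source and look up the "scale" entry point.
kernel = gp.jit(src, func="scale")
# Multiply every element of x by 0.5 in place.
kernel(x, factor=0.5, n=x.size)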
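
The `if (idx < n)` guard in the kernel is needed because a launch grid is normally rounded up to a whole number of thread blocks. The source does not say how PyGPUkit picks launch dimensions, so the following is only a sketch of the usual ceiling-division calculation; `launch_config` and its default block size of 256 are hypothetical and not part of the PyGPUkit API.

```python
def launch_config(n: int, block_size: int = 256) -> tuple[int, int]:
    """Return (grid_size, block_size) so that one thread covers each of n elements."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 1 <= block_size <= 1024:
        # Current CUDA devices allow at most 1024 threads per block.
        raise ValueError("block_size must be in 1..1024")
    grid_size = (n + block_size - 1) // block_size  # ceiling division
    return grid_size, block_size


# A 1024 x 1024 array has 1,048,576 elements.
grid, block = launch_config(1024 * 1024)
print(grid, block)  # 4096 256
```

When `n` is not a multiple of the block size, the last block contains threads whose index is past the end of the array, which is exactly what the bounds check in `scale` filters out.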

🎼 Scheduler — Kubernetes‑Inspired GPU Orchestration

PyGPUkit includes an experimental scheduler that treats a single GPU as a multi-tenant compute node, similar to how Kubernetes schedules workloads onto the nodes of a cluster. The goal is to provide resource isolation, guarantees, and fair sharing across multiple GPU tasks.

Core Capabilities

1. GPU Memory Reservation

Tasks may request a guaranteed block of GPU memory.

Hard guarantees → task is rejected if memory cannot be allocated
Soft guarantees → best‑effort allocation
Overcommit strategies (evict to host when pressure is high)
Reclaim policies (LRU GPUArray eviction)

Example:

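# `scheduler` is a scheduler instance and `fn` is the callable to run on the GPU.
task = scheduler.submit(
    fn,
    memory="512MB",  # requested GPU memory reservation
)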
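
To make the difference between hard and soft guarantees and LRU reclaim concrete, here is a small pure-Python model. It is a sketch under stated assumptions, not PyGPUkit's allocator: `parse_memory`, `MemoryPool`, and the choice of binary units ("MB" = 1024 * 1024 bytes) are all hypothetical, and eviction to host is modelled simply as dropping the reservation.

```python
from collections import OrderedDict

# Assumption: size suffixes are binary units.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_memory(spec: str) -> int:
    """Convert a string such as "512MB" into a byte count."""
    spec = spec.strip().upper()
    for suffix, factor in UNITS.items():
        if spec.endswith(suffix):
            return int(float(spec[: -len(suffix)]) * factor)
    return int(spec)  # no suffix: plain byte count


class MemoryPool:
    """Toy reservation pool with hard/soft requests and LRU eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.reserved = OrderedDict()  # task name -> bytes, least recently used first

    @property
    def free(self) -> int:
        return self.capacity - sum(self.reserved.values())

    def reserve(self, task: str, spec: str, hard: bool = True) -> int:
        """Return the bytes granted; a hard request that does not fit raises."""
        want = parse_memory(spec)
        if want > self.free:
            if hard:
                raise MemoryError(f"{task}: cannot guarantee {spec}")
            want = self.free  # soft guarantee: best effort, grant what is left
        self.reserved[task] = want
        return want

    def touch(self, task: str) -> None:
        """Mark a task's memory as recently used."""
        self.reserved.move_to_end(task)

    def evict_lru(self) -> str:
        """Release the least recently used reservation and return its task name."""
        task, _ = self.reserved.popitem(last=False)
        return task


pool = MemoryPool(parse_memory("1GB"))
print(pool.reserve("a", "512MB"))              # 536870912
print(pool.reserve("b", "768MB", hard=False))  # 536870912 (only 512MB was left)
pool.touch("a")
print(pool.evict_lru())                        # b
```

A hard request for even 1MB made before the eviction would raise `MemoryError`, which corresponds to the "rejected" outcome described above.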

2. GPU Bandwidth Guarantees / Throttling

Tasks may request a specific share of GPU compute time (called "bandwidth" here), given as a fraction between 0 and 1.

Bandwidth control is implemented via:

Stream priority
Kernel pacing (launch intervals)
Micro‑slicing large kernels
Cooperative time‑quantized scheduling
Persistent dispatcher kernels (planned)

Example:

task = scheduler.submit(
    fn,
    bandwidth=0.20,   # 20% GPU compute share
)
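
Kernel pacing and micro-slicing can both be reduced to simple arithmetic. The sketch below shows one way to do it and makes no claim about PyGPUkit's internals: `pacing_delay` and `micro_slices` are hypothetical helpers, and the model assumes the task's kernels run back to back with an idle gap after each one.

```python
def pacing_delay(kernel_ms: float, share: float) -> float:
    """Idle time (ms) to insert after a kernel so the task averages `share` of GPU time."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    # busy / (busy + idle) == share  =>  idle == busy * (1 - share) / share
    return kernel_ms * (1.0 - share) / share


def micro_slices(n: int, max_chunk: int) -> list[tuple[int, int]]:
    """Split n elements into (offset, length) chunks of at most max_chunk elements."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    return [(start, min(max_chunk, n - start)) for start in range(0, n, max_chunk)]


# A kernel that runs for 2 ms at a 20% share needs roughly 8 ms of idle time after it.
print(pacing_delay(2.0, 0.20))
print(micro_slices(10, 4))  # [(0, 4), (4, 4), (8, 2)]
```

Micro-slicing matters because pacing can only act between launches: a single long kernel cannot be throttled once it is running, so it is split into shorter launches that each cover one chunk.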

3. Logical GPU Partitioning

PyGPUkit implements software‑defined GPU slicing, similar in spirit to Kubernetes device plugin resource partitioning.

Slices may define:

Memory quota
Bandwidth share
Stream priority band
Isolation level

Useful for:

Multi‑tenant inference servers
Real‑time audio/DSP workloads
Background/foreground GPU task separation
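
A slice definition could be represented as a small record plus a check that the slices fit on the device together. This is an illustrative sketch only: the `GPUSlice` class, its field names, and the convention that a lower `priority_band` means higher priority are assumptions, not PyGPUkit's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPUSlice:
    name: str
    memory_bytes: int        # memory quota
    bandwidth: float         # share of GPU compute time, 0..1
    priority_band: int = 0   # assumption: lower number = higher stream priority
    isolation: str = "soft"  # isolation level label


def validate_slices(slices: list[GPUSlice], total_memory: int) -> None:
    """Raise ValueError if the slices oversubscribe memory or compute time."""
    if sum(s.memory_bytes for s in slices) > total_memory:
        raise ValueError("memory quotas exceed device memory")
    if sum(s.bandwidth for s in slices) > 1.0 + 1e-9:
        raise ValueError("bandwidth shares exceed 100%")


GB = 1024**3
slices = [
    GPUSlice("realtime-audio", memory_bytes=1 * GB, bandwidth=0.30, priority_band=0),
    GPUSlice("inference", memory_bytes=4 * GB, bandwidth=0.50, priority_band=1),
    GPUSlice("background", memory_bytes=2 * GB, bandwidth=0.20, priority_band=2),
]
validate_slices(slices, total_memory=8 * GB)  # passes: 7 GB and 100% in total
```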

4. Scheduling Policies

The scheduler supports multiple policies:

Guaranteed — exclusive reservation, strict QoS
Burstable — partial guarantees, opportunistic bandwidth
BestEffort — uses leftover GPU cycles
Priority scheduling
Deadline scheduling (planned)
Weighted fair sharing

Example:

task = scheduler.submit(
    fn,
    policy="guaranteed",
    memory="1GB",
    bandwidth=0.10,
)
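
Weighted fair sharing can be illustrated by splitting whatever compute share is left after the guaranteed reservations in proportion to per-task weights. The function below is a minimal sketch of that idea; `fair_shares` and its dictionary-based inputs are hypothetical and do not describe how PyGPUkit computes shares.

```python
def fair_shares(guaranteed: dict[str, float], weights: dict[str, float]) -> dict[str, float]:
    """Combine fixed reservations with a weighted split of the leftover share.

    guaranteed: task name -> reserved fraction of GPU compute time
    weights:    task name -> weight used to divide what remains
    """
    leftover = 1.0 - sum(guaranteed.values())
    if leftover < 0:
        raise ValueError("guaranteed shares exceed 100%")
    total_weight = sum(weights.values())
    shares = dict(guaranteed)
    for task, weight in weights.items():
        extra = leftover * weight / total_weight if total_weight else 0.0
        shares[task] = shares.get(task, 0.0) + extra
    return shares


# One guaranteed task holds 50%; two other tasks split the rest 3:1.
print(fair_shares({"rt": 0.5}, {"a": 3, "b": 1}))
# {'rt': 0.5, 'a': 0.375, 'b': 0.125}
```

A task that appears in both dictionaries behaves like a Burstable task in this model: it keeps its reservation and also receives a weighted part of the leftover.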

5. Admission Control

Before executing a task, the scheduler performs:

Resource validation
Quota check
QoS matching
Scheduling feasibility

Each task ends up in one of three states:

admitted
queued
rejected
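
The admission steps can be sketched as a single decision function that returns one of the three states. This is a simplified model, not the scheduler's actual logic: `Request` and `admit` are hypothetical names, and the rule that a guaranteed task is rejected rather than queued when it does not fit is an assumption based on the hard-guarantee behaviour described earlier.

```python
from dataclasses import dataclass


@dataclass
class Request:
    memory_bytes: int
    bandwidth: float            # share of GPU compute time, 0..1
    policy: str = "besteffort"  # "guaranteed", "burstable" or "besteffort"


def admit(req: Request, total_mem: int, free_mem: int, free_bw: float) -> str:
    """Return "admitted", "queued" or "rejected" for a request."""
    # Resource validation: malformed requests can never run.
    if req.memory_bytes < 0 or not 0.0 <= req.bandwidth <= 1.0:
        return "rejected"
    # Quota check: a request larger than the whole device can never run either.
    if req.memory_bytes > total_mem:
        return "rejected"
    # Feasibility: does the request fit into what is free right now?
    if req.memory_bytes <= free_mem and req.bandwidth <= free_bw:
        return "admitted"
    # Assumption: guaranteed tasks fail fast, all other tasks wait for resources.
    return "rejected" if req.policy == "guaranteed" else "queued"


GB = 1024**3
print(admit(Request(1 * GB, 0.10, "guaranteed"), 8 * GB, 4 * GB, 0.5))  # admitted
print(admit(Request(6 * GB, 0.10, "burstable"), 8 * GB, 4 * GB, 0.5))   # queued
print(admit(Request(6 * GB, 0.10, "guaranteed"), 8 * GB, 4 * GB, 0.5))  # rejected
```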

6. Monitoring & Introspection

PyGPUkit exposes live metrics:

Memory usage per task
SM occupancy and GPU utilization
Throttling / pacing logs
Queue position / execution state
Reclaim/eviction count

Example:

stats = scheduler.stats(task_id)

7. Soft Isolation Model

Although this is not OS‑level isolation, each GPU task is given:

Dedicated stream groups
Guaranteed memory pools
Kernel pacing to enforce bandwidth
Optional sandboxed GPUArray region

This provides practical multi‑tenant safety without relying on NVIDIA MIG (Multi-Instance GPU) or MPS (Multi-Process Service).

🏗 Proposed Directory Structure

PyGPUkit/
  core/         # NVRTC wrapper, device info
  memory/       # GPUArray, allocators
  scheduler/    # orchestration, partitioning, throttling
  ops/          # built-in kernels
  jit/          # JIT compiler + cache
  python/       # high-level Python API
  examples/
  tests/

🧪 Roadmap

v0.1 (MVP)

GPUArray
NVRTC JIT
add/mul/matmul ops
Basic stream manager
Packaging + wheels

v0.2

Scheduler (memory + bandwidth guarantees)
Kernel cache
NumPy interop
Benchmarks

v0.3

Triton optional backend
Advanced ops (softmax, layernorm)
Inference‑oriented plugin system

🤝 Contributing

Contributions and discussions are welcome!
Please open Issues for feature requests, bugs, or design proposals.

📄 License

MIT License

⭐ Acknowledgements

Inspired by:

CUDA Runtime
NVRTC
PyCUDA
CuPy
Triton

PyGPUkit aims to fill the gap for a tiny, embeddable GPU runtime for Python.

Release history

0.2.19 (Jan 1, 2026)
0.2.18 (Dec 30, 2025)
0.2.17 (Dec 28, 2025)
0.2.16 (Dec 28, 2025)
0.2.15 (Dec 26, 2025)
0.2.14 (Dec 23, 2025)
0.2.13 (Dec 23, 2025)
0.2.12 (Dec 22, 2025)
0.2.11 (Dec 22, 2025)
0.2.10 (Dec 18, 2025)
0.2.9 (Dec 16, 2025)
0.2.8 (Dec 15, 2025)
0.2.7 (Dec 15, 2025)
0.2.6 (Dec 15, 2025)
0.2.5 (Dec 15, 2025)
0.2.4 (Dec 14, 2025)
0.2.3 (Dec 14, 2025)
0.2.2 (Dec 13, 2025)
0.2.0 (Dec 12, 2025)
0.1.3 (Dec 12, 2025)
0.1.1 (Dec 12, 2025), this version
0.1.0 (Dec 12, 2025)

Download files

Source Distribution

pygpukit-0.1.1.tar.gz (28.0 kB)

Uploaded Dec 12, 2025 Source

Built Distribution

pygpukit-0.1.1-py3-none-any.whl (18.5 kB)

Uploaded Dec 12, 2025 Python 3

File details

Details for the file pygpukit-0.1.1.tar.gz.

File metadata

Download URL: pygpukit-0.1.1.tar.gz
Upload date: Dec 12, 2025
Size: 28.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pygpukit-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`ecee561dd82aa45a836d012861aa3428815ff3cb3ebefc7346a6260a9bc44fdd`
MD5	`497592a246439fb9e311330d8b870c00`
BLAKE2b-256	`021110394d41ed9d20aea8027b45af7ce21fe257bb34877d55fc3f054e974ed7`

Provenance

The following attestation bundles were made for pygpukit-0.1.1.tar.gz:

Publisher: release.yml on m96-chan/PyGPUkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pygpukit-0.1.1.tar.gz
- Subject digest: ecee561dd82aa45a836d012861aa3428815ff3cb3ebefc7346a6260a9bc44fdd
- Sigstore transparency entry: 760659253
- Sigstore integration time: Dec 12, 2025
Source repository:
- Permalink: m96-chan/PyGPUkit@7700fa75c3895eca79c8e931d17a52261a35f864
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/m96-chan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7700fa75c3895eca79c8e931d17a52261a35f864
- Trigger Event: push

File details

Details for the file pygpukit-0.1.1-py3-none-any.whl.

File metadata

Download URL: pygpukit-0.1.1-py3-none-any.whl
Upload date: Dec 12, 2025
Size: 18.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pygpukit-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`40456ca27654fb123b30b853d4b962ff19d1910190acca16f2705736712a9938`
MD5	`5c26390d2fec9d2a9c17d4522b7ee338`
BLAKE2b-256	`1ebd37cdb7edd049922aa5dec84f2929c2dc33329eebd9e94ba29750aaca3fe8`

Provenance

The following attestation bundles were made for pygpukit-0.1.1-py3-none-any.whl:

Publisher: release.yml on m96-chan/PyGPUkit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pygpukit-0.1.1-py3-none-any.whl
- Subject digest: 40456ca27654fb123b30b853d4b962ff19d1910190acca16f2705736712a9938
- Sigstore transparency entry: 760659254
- Sigstore integration time: Dec 12, 2025
Source repository:
- Permalink: m96-chan/PyGPUkit@7700fa75c3895eca79c8e931d17a52261a35f864
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/m96-chan
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7700fa75c3895eca79c8e931d17a52261a35f864
- Trigger Event: push
