PTX Inject and Stack PTX for Python
mm-ptx (Python)
PTX Inject and Stack PTX with Python bindings.
This package ships two small, header-only C libraries plus Python wrappers:
- PTX Inject: find marked sites in PTX and inject your own PTX at those sites.
- Stack PTX: generate PTX stubs you can inject at those sites.
PTX Inject: what you write
Mark a site in CUDA with macros:
#include <ptx_inject.h>

extern "C"
__global__
void kernel(float* out) {
    float x = 5.0f;
    float y = 3.0f;
    float z = 0.0f;
    PTX_INJECT("func",
        PTX_IN (F32, x, x),
        PTX_MOD(F32, y, y),
        PTX_OUT(F32, z, z)
    );
    out[0] = z;
}
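The annotated kernel has to be lowered to PTX before any injection site can be found. A minimal sketch of driving nvcc from Python follows. It assumes nvcc is on the PATH and that `include_dir` points at a directory containing ptx_inject.h. The helper names, the include path, and the `sm_80` default are illustrative assumptions, not part of mm-ptx.

```python
import subprocess
from pathlib import Path
from typing import List


def nvcc_ptx_command(source: str, output: str, include_dir: str,
                     arch: str = "sm_80") -> List[str]:
    """Build an nvcc command line that stops after PTX generation."""
    return [
        "nvcc",
        "--ptx",            # emit PTX instead of an executable
        f"-arch={arch}",    # target architecture (assumed default; match your GPU)
        "-I", include_dir,  # directory holding ptx_inject.h (hypothetical path)
        source,
        "-o", output,
    ]


def compile_to_ptx(source: str, output: str, include_dir: str) -> str:
    """Run nvcc and return the generated PTX text (requires the CUDA toolkit)."""
    subprocess.run(nvcc_ptx_command(source, output, include_dir), check=True)
    return Path(output).read_text()
```

Calling `compile_to_ptx("kernel.cu", "kernel.ptx", "path/to/include")` would return text suitable for the `annotated_ptx` variable used in the Python snippet.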
Compile the CUDA to PTX (nvcc or cuda.core), then build and inject a stub in Python:
from mm_ptx.ptx_inject import PTXInject
annotated_ptx = "..." # PTX from nvcc/cuda.core
inject = PTXInject(annotated_ptx)
func = inject["func"]
stub = (
    f"\tadd.ftz.f32 %{func['y'].reg}, %{func['x'].reg}, %{func['y'].reg};\n"
    f"\tadd.ftz.f32 %{func['z'].reg}, %{func['x'].reg}, %{func['y'].reg};"
)
final_ptx = inject.render_ptx({"func": stub})
The result is equivalent to writing the following CUDA kernel directly, but without the CUDA-to-PTX compilation overhead:
extern "C"
__global__
void kernel(float* out) {
    float x = 5.0f;
    float y = 3.0f;
    float z = 0.0f;
    y = x + y;
    z = x + y;
    out[0] = z;
}
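When the same stub shape is injected at several sites, it can help to build the string in one place. The sketch below is plain Python with no mm-ptx imports. `add_stub` is a hypothetical helper, and the register names passed to it are placeholders standing in for the `.reg` values read from an injection site.

```python
def add_stub(x_reg: str, y_reg: str, z_reg: str) -> str:
    """Return a two-instruction PTX stub computing y = x + y, then z = x + y.

    Register names are given without the leading '%', matching how the
    README snippet formats `func[...].reg`.
    """
    lines = [
        f"\tadd.ftz.f32 %{y_reg}, %{x_reg}, %{y_reg};",
        f"\tadd.ftz.f32 %{z_reg}, %{x_reg}, %{y_reg};",
    ]
    return "\n".join(lines)


# Placeholder register names; real names come from the parsed injection site.
stub = add_stub("_x0", "_x1", "_x2")
print(stub)
```

With the real bindings, the arguments would be `func['x'].reg`, `func['y'].reg`, and `func['z'].reg`, and the returned string would be passed to `inject.render_ptx({"func": stub})` as in the snippet above.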
Stack PTX: stack-based instruction compiler
If you do not want to hand-write PTX, you can use Stack PTX to generate the stub:
from mm_ptx.stack_ptx import RegisterRegistry
# Type and instruction definitions from examples/stack_ptx_default_types.py
from stack_ptx_default_types import Stack, PtxInstruction, compiler

# `inject` and `func` come from the PTX Inject snippet above.
# Set up naming associations
registry = RegisterRegistry()
registry.add(func["x"].reg, Stack.f32, name="x")
registry.add(func["y"].reg, Stack.f32, name="y")
registry.add(func["z"].reg, Stack.f32, name="z")
registry.freeze()

# Instructions to run
instructions = [
    registry.x,                  # Push 'x'
    registry.y,                  # Push 'y'
    PtxInstruction.add_ftz_f32,  # Pop 'x', Pop 'y', Push ('x' + 'y')
    registry.x,                  # Push 'x'
    PtxInstruction.add_ftz_f32,  # Pop 'x', Pop ('x' + 'y'), Push ('x' + ('x' + 'y'))
]

# Create the PTX stub
ptx_stub = compiler.compile(
    registry=registry,
    instructions=instructions,
    requests=[registry.z],
    ...
)

# Inject the PTX stub into the PTX Inject site(s)
final_ptx = inject.render_ptx({"func": ptx_stub})
Printing ptx_stub gives:
{
    .reg .f32 %_a<2>;
    add.ftz.f32 %_a0, %_x0, %_x2;
    add.ftz.f32 %_a1, %_x2, %_a0;
    mov.f32 %_x1, %_a1;
}
The result is equivalent to writing the following CUDA kernel directly, but without the CUDA-to-PTX compilation overhead:
extern "C"
__global__
void kernel(float* out) {
    float x = 5.0f;
    float y = 3.0f;
    float z = 0.0f;
    z = x + (x + y);
    out[0] = z;
}
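To make the stack model concrete, here is a toy stack-to-PTX translator in plain Python. It is a sketch of the general idea only and is not how Stack PTX is implemented. The function name, the string-based instruction encoding, and the mapping of `x`, `y`, `z` to `%_x2`, `%_x0`, `%_x1` are assumptions, the last one inferred from the printed stub above.

```python
from typing import Dict, List


def compile_stack(instructions: List[str], inputs: Dict[str, str], request: str) -> str:
    """Translate a postfix instruction list into a PTX block (toy model)."""
    stack: List[str] = []
    body: List[str] = []
    temps = 0
    for ins in instructions:
        if ins in inputs:
            # A name pushes its register onto the stack.
            stack.append(inputs[ins])
        elif ins == "add.ftz.f32":
            # A binary op pops two operands (top of stack first) and
            # pushes a fresh temporary holding the result.
            first = stack.pop()
            second = stack.pop()
            dst = f"%_a{temps}"
            temps += 1
            body.append(f"{ins} {dst}, {first}, {second};")
            stack.append(dst)
        else:
            raise ValueError(f"unknown instruction: {ins}")
    # The requested output register receives whatever is left on top.
    body.append(f"mov.f32 {request}, {stack.pop()};")
    header = f".reg .f32 %_a<{temps}>;"
    return "\n".join(["{", header, *body, "}"])


program = ["x", "y", "add.ftz.f32", "x", "add.ftz.f32"]
print(compile_stack(program, {"x": "%_x2", "y": "%_x0"}, "%_x1"))
```

Under the assumed register mapping, this prints the same declaration and three instructions as the stub shown above, which illustrates why the final value is `x + (x + y)`.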
Stack PTX instruction descriptions
Instruction definitions are supplied by the user and are not part of the core Stack PTX system. This lets you tailor the set of described instructions to your needs.
- Minimal example of PTX instruction and type definitions: examples/stack_ptx_default_types.py
- More extensive example: examples/stack_ptx_extended_types.py
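One way to picture user-supplied instruction definitions is as a table that maps each instruction to its PTX mnemonic and operand types. The sketch below is a hypothetical illustration in plain Python. `InstructionSpec`, `SPECS`, and `result_types` are invented for this example and do not reflect the format used in the example files listed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class InstructionSpec:
    mnemonic: str                # PTX text emitted for the instruction
    arg_types: Tuple[str, ...]   # types popped from the stack, top first
    ret_type: str                # type pushed back onto the stack


# Hypothetical user-defined instruction table.
SPECS: Dict[str, InstructionSpec] = {
    "add_ftz_f32": InstructionSpec("add.ftz.f32", ("f32", "f32"), "f32"),
    "mul_ftz_f32": InstructionSpec("mul.ftz.f32", ("f32", "f32"), "f32"),
    "cvt_rn_f32_s32": InstructionSpec("cvt.rn.f32.s32", ("s32",), "f32"),
}


def result_types(program: List[str], stack: List[str]) -> List[str]:
    """Simulate `program` over a stack of type names; raise on a mismatch."""
    stack = list(stack)  # work on a copy so the caller's list is untouched
    for name in program:
        spec = SPECS[name]
        for expected in spec.arg_types:
            if not stack:
                raise TypeError(f"{name}: stack underflow")
            actual = stack.pop()
            if actual != expected:
                raise TypeError(f"{name}: expected {expected}, got {actual}")
        stack.append(spec.ret_type)
    return stack


# An s32 on top of an f32: convert the s32, then add the two f32 values.
print(result_types(["cvt_rn_f32_s32", "add_ftz_f32"], ["f32", "s32"]))
```

Keeping the definitions as data means that adding an instruction is a one-line change to the table rather than a change to the compiler.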
Install
pip install mm-ptx
Requires Python 3.9+.
Tests
python -m pip install -e .
python -m unittest discover -s tests
CUDA integration tests are skipped by default. To run them (requires cuda.core, cuda.bindings, and a CUDA-capable GPU):
MM_PTX_RUN_CUDA_TESTS=1 python -m unittest discover -s tests
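For readers adding their own GPU-dependent tests, an opt-in gate of this kind can be written with the standard library's `unittest.skipUnless`. The sketch below shows one way to do it. The helper and test class names are hypothetical, and the project's own tests may implement the gate differently.

```python
import os
import unittest
from typing import Mapping


def cuda_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in variable is set to "1"."""
    return env.get("MM_PTX_RUN_CUDA_TESTS") == "1"


@unittest.skipUnless(cuda_tests_enabled(), "set MM_PTX_RUN_CUDA_TESTS=1 to run CUDA tests")
class CudaIntegrationTest(unittest.TestCase):
    def test_runs_only_when_enabled(self):
        # Placeholder body: a real test would compile and launch a kernel here.
        self.assertTrue(cuda_tests_enabled())
```

Without the variable set, the test runner reports the class as skipped instead of failing on machines that have no GPU.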
Examples
More details
For the C/C++ headers and deeper implementation notes, see the mm-ptx repo:
- https://github.com/MetaMachines/mm-ptx/blob/master/README.md
- https://github.com/MetaMachines/mm-ptx/blob/master/PTX_INJECT.md
- https://github.com/MetaMachines/mm-ptx/blob/master/STACK_PTX.md
License
MIT. See LICENSE.
Citation
If you use this software in your work, please cite it using the following BibTeX entry (generated from CITATION.cff):
@software{Durham_mm-ptx_2025,
  author = {Durham, Charlie},
  title = {mm-ptx: PTX Inject and Stack PTX for Python},
  version = {1.0.1},
  date-released = {2025-10-19},
  url = {https://github.com/MetaMachines/mm-ptx-py}
}