PyTorch compiler and WebGPU runtime

torch-webgpu

PyTorch compiler and WebGPU runtime, capable of running LLMs like Llama 3.2 3B

Installation

pip install torch-webgpu

Documentation

https://torch-webgpu.maczan.pl

Supported platforms

  • Linux (x86_64)
  • macOS (Apple Silicon)
  • Windows (x86_64)

Use

In Python:

from torch_webgpu import webgpu_backend

You can now use @torch.compile(backend=webgpu_backend), device="webgpu" and .to("webgpu") to compile and run PyTorch on real WebGPU!
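A minimal sketch of how these pieces might fit together, assuming that importing torch_webgpu makes the "webgpu" device available and that torch.cos is among the supported ops (the repository has a tests/ops/test_cos.py). The helper names webgpu_stack_available and compiled_cos_demo are made up for this illustration and are not part of torch-webgpu. The sketch skips itself when torch or torch_webgpu is not installed.

```python
import importlib.util


def webgpu_stack_available() -> bool:
    """Return True only if both torch and torch_webgpu can be found."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("torch", "torch_webgpu")
    )


def compiled_cos_demo():
    """Compile a tiny function with the WebGPU backend and run it once.

    Returns the result as a CPU tensor, or None if the packages are missing.
    """
    if not webgpu_stack_available():
        return None

    import torch
    from torch_webgpu import webgpu_backend

    # Assumption: torch.cos is supported by the backend.
    @torch.compile(backend=webgpu_backend)
    def cos_fn(x):
        return torch.cos(x)

    # Only float32 is supported, so the dtype is set explicitly.
    x = torch.linspace(0.0, 1.0, steps=8, dtype=torch.float32).to("webgpu")
    return cos_fn(x).to("cpu")


result = compiled_cos_demo()
print("skipped: torch / torch_webgpu not installed" if result is None else result)
```

The result is moved back to the CPU before printing so that the example does not depend on how WebGPU tensors are displayed.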

FAQ

Why?

WebGPU promises to run almost everywhere, on nearly any hardware, and is becoming well supported in web browsers. This project is a bridge between the PyTorch world and the WebGPU world.

There is "web" in "WebGPU", so does it mean that I can run PyTorch in a browser now?

Not yet. This project is a step towards that, and the next step is to run PyTorch inside a browser itself. I am actively researching how to do it. If this topic excites you too, contact me on Twitter or open an issue in this GitHub repo.

How serious are you about this project? Is it meant as research or a PoC, or are you going to make it production quality?

Once we hit version 1.0.0, torch-webgpu will be a production-ready PyTorch backend. WebGPU is an exciting, emerging technology. As of Nov 2025 all major browsers support WebGPU. I think that it's highly important to build a bridge between PyTorch and WebGPU.

Will you upstream the WebGPU backend to PyTorch or keep it out-of-tree forever?

We'll see. Ideally I'd like it to become part of PyTorch core, but the project needs to reach very high quality before we can ask the PyTorch maintainers about it.

Contributor policy

I have very little time and need to be picky about contributions, so please make sure the code you contribute is:

  • well thought out
  • covered with unit tests
  • fully understood by you, down to every line you wrote
  • as concise as possible - I can't handle PRs that are too big, sorry!

Use LLMs at your discretion, but provide an exhaustive explanation of what you built and why. Write it yourself to show that you really understand it.

I understand if that sounds too picky, but since I build this project after hours, I need to cut out any additional noise. Sorry and thanks for understanding!

I don't like X about this project

That's ok. The main goal here is to build a bridge (for the community) and to learn ML compilers in depth (for me). The project moves regularly, at its own pace. Things improve, cover more use cases, get more tests, and get rethought and rewritten. I prioritize the journey, insights and learning over raw development velocity. That's the tradeoff I choose.

I wish you moved faster

You can fund the project to give me more spare time to work on it. My email: github@maczan.pl

Did AI build it?

The project started on 26 Oct 2025. I have been coding it by hand and learning a lot about PyTorch internals and ML compilation in general. Once I got the project to the point where it could compile and run an MLP on WebGPU, on 10 Jan 2026 I started generating many of the missing ops using AI agents. In just 2 days, AI boosted the project from compiling and running MLPs to compiling and running LLMs ❤️

Open a GitHub issue if you have more questions. Thanks and let's build this bridge!

Ops support

Many important ops are implemented. If one is missing, feel free to open a PR or an issue. Thanks!

Device / to

  • CPU <-> WebGPU
  • CUDA <-> WebGPU
  • MPS <-> WebGPU
  • Intel Gaudi <-> WebGPU
  • XLA <-> WebGPU
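As a rough illustration of the CPU <-> WebGPU path listed above, the following sketch copies a float32 tensor to the "webgpu" device and back, then checks that the values survived the round trip. It assumes that importing torch_webgpu makes the "webgpu" device available. The function name cpu_webgpu_roundtrip_ok is hypothetical, and the check is skipped when torch or torch_webgpu is not installed.

```python
import importlib.util


def cpu_webgpu_roundtrip_ok():
    """Copy a tensor CPU -> WebGPU -> CPU and report whether it is unchanged.

    Returns None when torch or torch_webgpu is not installed.
    """
    for name in ("torch", "torch_webgpu"):
        if importlib.util.find_spec(name) is None:
            return None

    import torch
    # Imported as shown in the README; assumed to make "webgpu" usable.
    from torch_webgpu import webgpu_backend  # noqa: F401

    original = torch.arange(6, dtype=torch.float32).reshape(2, 3)
    on_webgpu = original.to("webgpu")   # CPU -> WebGPU
    back_on_cpu = on_webgpu.to("cpu")   # WebGPU -> CPU
    return torch.equal(original, back_on_cpu)


status = cpu_webgpu_roundtrip_ok()
print("skipped" if status is None else f"round trip preserved values: {status}")
```

The other pairs in the list (CUDA, MPS, Intel Gaudi, XLA) would presumably follow the same .to(...) pattern with a different source device, but each needs the matching hardware and PyTorch build.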

TODOs

  • performance hasn't been a priority yet
  • only float32 is supported
  • wgpu::Queue::Submit() is handled synchronously
  • some ops fall back to CPU
  • add more compiler optimizations
  • get high performance without platform-specific (CUDA, MPS, ROCm) kernels. Five ingredients should be enough to get there - PyTorch, Python, C++, WGSL shaders and a WebGPU runtime. Currently, torch-webgpu uses Google Dawn
  • implement missing ops

Resources

Note: This project is unrelated to webgpu-torch, which is a neat PyTorch reimplementation in TypeScript targeting WebGPU

Dev resources

Build from source (only for development)

  1. Clone this repo
  2. Build Dawn: ./scripts/build-dawn.sh (or set DAWN_PREFIX to your Dawn installation)
  3. Build: ./build.sh

C++ unit tests

  1. Remember to rebuild your code before testing - ./build.sh
  2. chmod +x build-ctests.sh run-ctests.sh
  3. Update build-ctests.sh with your paths
  4. rm -rf build/ctests
  5. ./build-ctests.sh
  6. ./run-ctests.sh

C++ benchmarks

  1. Remember to rebuild your code before benchmarking - ./build.sh - and optionally log in to your wandb.ai account
  2. chmod +x build-benchmark.sh run-benchmark.sh
  3. Update build-benchmark.sh with your paths
  4. rm -rf build/benchmarks
  5. ./build-benchmark.sh
  6. ./run-benchmark.sh

Python unit tests

  1. Remember to rebuild your code before testing - ./build.sh
  2. Run pytest tests to run all tests, or pytest tests/ops/test_cos.py to run a single test file (here, the cosine op tests)
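For contributors adding an op, a test might look roughly like the sketch below, which compares the WebGPU result of torch.cos with the CPU result. This is not the content of the project's tests/ops/test_cos.py, which is not shown here. The function name and tolerances are assumptions, and the check returns early when torch or torch_webgpu is not installed (under pytest, pytest.importorskip would be the more usual way to do that).

```python
import importlib.util


def test_cos_matches_cpu():
    """Check that cos on the "webgpu" device agrees with the CPU reference."""
    missing = [
        name
        for name in ("torch", "torch_webgpu")
        if importlib.util.find_spec(name) is None
    ]
    if missing:
        return  # nothing to check without the packages installed

    import torch
    # Imported as shown in the README; assumed to make "webgpu" usable.
    from torch_webgpu import webgpu_backend  # noqa: F401

    x = torch.linspace(-3.0, 3.0, steps=16, dtype=torch.float32)
    expected = torch.cos(x)                       # CPU reference
    actual = torch.cos(x.to("webgpu")).to("cpu")  # computed on WebGPU

    # Tolerances are a guess for float32; tighten or loosen as needed.
    torch.testing.assert_close(actual, expected, rtol=1e-4, atol=1e-4)


test_cos_matches_cpu()
```

Comparing against the CPU implementation keeps the test independent of any hand-written expected values.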

Cite

If you use this software, please cite it as below.

@software{Maczan_torch-webgpu_2025,
author = {Maczan, Jędrzej Paweł},
month = oct,
title = {{torch-webgpu - PyTorch compiler and WebGPU runtime}},
url = {https://github.com/jmaczan/torch-webgpu},
version = {1.0.0},
year = {2025}
}

Credits

Jędrzej Maczan, 2025 - ∞

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • torch_webgpu-0.0.1-cp312-cp312-win_amd64.whl (11.0 MB) - CPython 3.12, Windows x86-64
  • torch_webgpu-0.0.1-cp312-cp312-manylinux_2_35_x86_64.whl (22.6 MB) - CPython 3.12, manylinux (glibc 2.35+), x86-64
  • torch_webgpu-0.0.1-cp312-cp312-macosx_15_0_universal2.whl (3.3 MB) - CPython 3.12, macOS 15.0+, universal2 (ARM64, x86-64)
