magnetron

A compact machine learning runtime for developers who want to understand, control, and optimize the full stack.
Native C core, modern Python API, no runtime dependencies, no bloat.

Documentation »

Qwen3 Inference Example · Autoencoder Training Example · GPT-2 Inference Example


About

Magnetron is a machine learning runtime built from scratch in C, with a small modern Python interface for usability.
It implements its own tensor system, operator set, autograd engine, and execution model, without relying on large external frameworks.

The goal is simple:

Keep the stack small enough to understand and be hackable, but powerful enough to run real models.

This makes Magnetron useful in two situations:

  • when you want full control over execution and memory
  • when you want a clean base for experimentation or new ideas

Why Magnetron?

Magnetron is not trying to compete with PyTorch on ecosystem or feature count.

Instead, it optimizes for a different axis:

Magnetron                        | PyTorch
---------------------------------+-------------------------------
Small, inspectable core          | Large, layered system
Explicit execution               | Implicit / abstracted
Minimal dependencies             | Heavy runtime
Easy to modify kernels           | Harder to reason about backend
Good for research & systems work | Good for production & scale

If you want to:

  • understand how your model actually runs
  • experiment with kernels, memory layouts, or execution
  • port ML workloads to unusual hardware

Magnetron gives you a much shorter path.


Architecture Overview

Magnetron is built as a single, cohesive runtime, not a collection of loosely coupled libraries.

  • Tensor system
    Owns dtype, shape, strides, and memory. A full view system with a view solver supports complex slicing, reshaping, and broadcasting semantics similar to PyTorch while remaining explicit and predictable.

  • Execution model
    Eager execution with a dynamic autograd graph (reverse-mode), constructed per forward pass and traversed during backward.

  • Operator backend
    Central dispatch layer mapping high-level operations to architecture-specific kernel implementations.

  • CPU backend
    Multi-dispatch design with compile-time optimized kernels for a wide range of microarchitectures (Intel, AMD Zen1–Zen5, ARM).
    At runtime, CPUID-based detection automatically selects the best available kernel path.
    Supports multiple SIMD ISAs and extensions, including SSE (1–4), AVX, AVX2, FMA, AVX-512, AVX-512-BF16, AVX-512-FP16, F16C and ARM NEON, combined with multithreaded execution.

  • CUDA backend (in progress)
    The kernel layer is implemented; memory management, the execution pipeline, and integration are still being completed.

  • Serialization
    Native .mag format designed for zero-copy, memory-mapped loading, enabling fast startup and efficient large model handling.
    Conversion tools are provided to import weights from external formats.

  • Backend extensibility
    The architecture is intentionally clean and modular, making it straightforward to introduce new backends or target additional hardware platforms.
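
To make the tensor-system bullet more concrete, here is a minimal pure-Python sketch of how a shape, strides, and an offset can describe several views over one shared buffer. The `StridedView` class and its methods are hypothetical and are not Magnetron's API; the sketch only illustrates the general technique of strided views, including broadcasting through a stride of zero.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StridedView:
    """Hypothetical view: shape/strides/offset over a shared flat buffer."""
    storage: List[float]      # shared; view operations never copy it
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]  # step (in elements) per dimension
    offset: int = 0

    def at(self, *index: int) -> float:
        # Map a multi-dimensional index to a flat storage position.
        pos = self.offset + sum(i * s for i, s in zip(index, self.strides))
        return self.storage[pos]

    def transpose(self) -> "StridedView":
        # Reversing shape and strides reinterprets the same storage.
        return StridedView(self.storage, self.shape[::-1],
                           self.strides[::-1], self.offset)

    def row(self, r: int) -> "StridedView":
        # Selecting a row moves the offset and drops the first dimension.
        return StridedView(self.storage, self.shape[1:], self.strides[1:],
                           self.offset + r * self.strides[0])


data = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
m = StridedView(data, shape=(2, 3), strides=(3, 1))  # row-major 2x3
t = m.transpose()                                    # 3x2 view, no copy
b = StridedView(data, shape=(4, 3), strides=(0, 1))  # first row broadcast 4x

print(m.at(1, 2), t.at(2, 1), m.row(1).at(0), b.at(3, 2))
```

All four views read from the same `data` list; only the index arithmetic differs.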

The system is intentionally kept tight and explicit, so each layer is understandable, controllable, and replaceable without hidden complexity.
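
The execution model described above (a graph recorded during the forward pass and traversed in reverse during backward) can be sketched in a few lines of plain Python. The `Value` class below is a hypothetical scalar-only illustration of reverse-mode autograd in general, not Magnetron's implementation or API.

```python
class Value:
    """Hypothetical scalar node that records how it was computed."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = None  # pushes self.grad to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the graph recorded during the forward pass.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()


a, b = Value(2.0), Value(3.0)
loss = a * b + a   # the graph is built as the expression is evaluated
loss.backward()
print(a.grad, b.grad)
```

For `loss = a * b + a`, the gradients are `b + 1` with respect to `a` and `a` with respect to `b`, which is what the traversal accumulates.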


Highlights

  • Practical, not just educational
    Capable of running modern LLM inference (e.g. Qwen3 in BF16), not just toy models.

  • Small, controllable ML runtime
    Designed to stay inspectable end-to-end — no hidden execution layers or opaque backends.

  • True ownership of execution
    You can reason about memory layout, kernel dispatch, and graph behavior without abstraction barriers.

  • Hardware-aware by design
    Not a generic backend wrapper — kernels and execution are written with specific ISAs and microarchitectures in mind.

  • Zero-copy model loading
    Memory-mapped .mag format enables fast startup and efficient handling of large models.

  • Built for experimentation
    Easy to modify operators, add kernels, or prototype new execution strategies.

  • Minimal runtime surface
    Native extension with no required Python dependencies — easy to deploy and embed.
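
The zero-copy loading idea from the list above can be illustrated with the Python standard library alone. The file layout used here (an 8-byte element count followed by raw float32 values) is a made-up toy format for this sketch and is not the .mag format; it only shows how memory mapping lets a reader view the payload in place instead of reading it into a new buffer.

```python
import mmap
import os
import struct
import tempfile
from array import array

# Hypothetical toy layout (NOT the .mag format): an 8-byte little-endian
# element count followed by raw float32 values in native byte order.
weights = array("f", [0.5, -1.25, 2.0, 4.0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_weights.bin")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(weights)))
        f.write(weights.tobytes())

    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        (count,) = struct.unpack_from("<Q", mapped, 0)
        # Slicing and cast() create views onto the mapped pages; the payload
        # is not copied, and pages are read from disk only when touched.
        with memoryview(mapped) as raw, \
                raw[8:8 + count * 4].cast("f") as view:
            loaded = list(view)  # copied here only so it can be printed

print(count, loaded)
```

With a mapping like this, startup cost depends mostly on parsing the header rather than on the size of the weights, since the operating system pages data in lazily.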


Example Models

End-to-end demos live under examples/.

Path                        | Description
----------------------------+------------
examples/qwen3/             | Qwen3 transformer inference in bfloat16 with tokenizer integration, .mag weights, CLI chat, and HTTP/streaming API.
examples/gpt2/              | GPT-2 causal language model inference with KV cache, token streaming, and configurable generation.
examples/ae/                | Convolutional autoencoder with training loop and reconstruction visualization.
examples/linear_regression/ | Simple 1D regression with SGD and loss tracking.
examples/xor/               | Minimal MLP demonstrating autograd and optimization.

Operator Cheat Sheet

Magnetron provides a compact but expressive operator set covering:

  • elementwise operations (add, mul, div, ...)
  • reductions (sum, mean, ...)
  • tensor transformations (view, reshape, permute, ...)
  • neural building blocks (matmul, softmax, layernorm, ...)
  • type casting and memory views

A full reference of operators, data types, and semantics is available here:

Magnetron Cheat Sheet
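
As a rough guide to what two of the neural building blocks named above compute, here is a plain-Python reference sketch of softmax and layer normalization over a single vector. The function names, the `eps` default, and the omission of learnable scale and shift are assumptions made for illustration; they describe the standard math, not Magnetron's kernels or signatures.

```python
import math
from typing import List


def softmax(x: List[float]) -> List[float]:
    # Subtract the maximum first so exp() cannot overflow.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def layernorm(x: List[float], eps: float = 1e-5) -> List[float]:
    # Normalize to zero mean and (approximately) unit variance, using the
    # biased variance and no learnable scale or shift.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


probs = softmax([1.0, 2.0, 3.0])
normed = layernorm([1.0, 2.0, 3.0])
print(probs)
print(normed)
```

A reference like this is handy for checking the output of an optimized kernel on small inputs.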


Installation

Magnetron is available on PyPI.

Make sure you are inside a Python virtual environment.

pip install magnetron

or with uv:

uv pip install magnetron

Local Development

Clone the repository and install locally:

git clone --recursive https://github.com/MarioSieg/magnetron
cd magnetron
uv pip install . -v

For C/C++ development, open the project root (containing CMakeLists.txt) in an IDE such as CLion.


Quick start

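from magnetron import Tensor, nn, optim

# XOR truth table: four 2-bit inputs and their expected outputs
x = Tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = Tensor([[0.0], [1.0], [1.0], [0.0]])

# Small MLP: 2 inputs -> 2 hidden units -> 1 output, with tanh activations
model = nn.Sequential(
    nn.Linear(2, 2),
    nn.Tanh(),
    nn.Linear(2, 1),
    nn.Tanh(),
)

optimizer = optim.SGD(model.parameters(), lr=1e-1)
criterion = nn.MSELoss()

for epoch in range(2000):
    y_hat = model(x)            # forward pass
    loss = criterion(y_hat, y)  # mean squared error against the targets
    loss.backward()             # backward pass through the autograd graph
    optimizer.step()            # SGD parameter update
    optimizer.zero_grad()       # reset gradients for the next iteration

    if epoch % 100 == 0:
        print(f"Epoch {epoch:4d} | Loss {loss.item():.6f}")

# Compare the trained model's predictions with the XOR targets
y_hat = model(x)
for i in range(x.shape[0]):
    print(f"Expected: {y[i].item():.1f}, Predicted: {y_hat[i].item():.4f}")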

Roadmap

  • 🚧 CUDA backend
    Finish the memory model and execution pipeline, then stabilize for production use.

  • 🚧 Multi-GPU execution
    Introduce scalable execution across multiple devices.

  • 🚧 New CPU architectures
    Support for LoongArch and RISC-V.

  • 🧪 JIT compilation
    Custom SSA-based IR with register allocation and target-specific instruction emission.


History

Magnetron started in 2024 as a personal project to understand how machine learning frameworks work internally: tensor storage, operator dispatch, autograd, and inference execution. What began as a learning project gradually evolved into a full runtime with its own tensor engine, native snapshot format, SIMD-specialized CPU backend, and support for running modern models such as Qwen3 in BF16. Today, Magnetron is developed both as a practical inference/runtime system and as a research platform for experimenting with new backends, execution strategies, and low-level ML systems ideas.


License

(c) 2026 Mario Sieg - mario.sieg.64@gmail.com
Distributed under the Apache License 2.0.
Developed in Berlin, Germany.

