PyOpenCL-based deep learning playground with autograd, kernels, and high-level APIs.
# netcl - PyOpenCL Deep Learning Playground

netcl is an experimental PyOpenCL-based deep learning framework. It combines low-level kernels (conv/matmul/elementwise) with a lightweight autograd engine and a compact high-level API (layers, trainer, optimizers).
## Installation

```shell
pip install netcl
```

Requirements: Python >= 3.10, NumPy, PyOpenCL, and an available OpenCL device (GPU or CPU).
## Quick Start (toy MLP, high-level)

```python
import numpy as np

from netcl.core.device import manager
from netcl.nn.layers import Sequential, Flatten, Linear, ReLU
from netcl.nn import functional
from netcl.optim import Adam
from netcl.trainer import Trainer
from netcl.data.dataloader import DataLoader

queue = manager.default(device="gpu").queue  # or device="cpu"

model = Sequential(
    Flatten(),
    Linear(queue, 28 * 28, 128),
    ReLU(),
    Linear(queue, 128, 10),
)
opt = Adam(model.parameters(), lr=1e-3)
trainer = Trainer(model, opt, device_queue=queue)

# Toy data
x = np.random.randn(256, 1, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=(256,))
loader = DataLoader(list(zip(x, y)), batch_size=32, shuffle=True, device_queue=queue)

trainer.fit(loader, epochs=1, loss_fn=functional.cross_entropy)
```
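For intuition, the `functional.cross_entropy` loss used above corresponds to the following plain-NumPy reference computation (a sketch; the exact reduction and numerical details inside netcl are assumptions here):

```python
import numpy as np

def cross_entropy_reference(logits: np.ndarray, targets: np.ndarray) -> float:
    """Mean softmax cross-entropy over a batch of integer class targets."""
    # Numerically stable log-softmax: subtract the row-wise max before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Negative log-likelihood of the true class, averaged over the batch.
    return float(-log_probs[np.arange(len(targets)), targets].mean())

logits = np.array([[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]])
targets = np.array([0, 2])
loss = cross_entropy_reference(logits, targets)
```

A uniform logit row contributes exactly `log(num_classes)` to the mean, which is a quick sanity check for any implementation.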
## Small CNN (layers + BatchNorm)

```python
import numpy as np

from netcl.core.device import manager
from netcl.nn import build_sequential, fast_bn_cnn_config, functional
from netcl.optim import AdamW
from netcl.trainer import Trainer
from netcl.data.dataloader import DataLoader

queue = manager.default(device="gpu").queue

model = build_sequential(queue, fast_bn_cnn_config(in_ch=3, num_classes=10))
opt = AdamW(model.parameters(), lr=1e-3, weight_decay=5e-4)
trainer = Trainer(model, opt, device_queue=queue)

x = np.random.randn(512, 3, 32, 32).astype(np.float32)
y = np.random.randint(0, 10, size=(512,))
loader = DataLoader(list(zip(x, y)), batch_size=64, shuffle=True, device_queue=queue)

trainer.fit(loader, epochs=1, loss_fn=functional.cross_entropy)
```
## Declarative Model Config

```python
from netcl.core.device import manager
from netcl.nn import build_sequential

queue = manager.default(device="gpu").queue

config = [
    {"type": "Conv2d", "args": {"in_channels": 3, "out_channels": 16, "kernel_size": 3, "stride": 1, "pad": 1}},
    {"type": "ReLU"},
    {"type": "MaxPool2d", "args": {"kernel_size": 2, "stride": 2}},
    {"type": "Flatten"},
    {"type": "Linear", "args": {"in_features": 16 * 16 * 16, "out_features": 10}},
]
model = build_sequential(queue, config)
```
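The `16 * 16 * 16` flatten size follows from standard conv/pool shape arithmetic for a 32x32 input; a quick check in plain Python (nothing netcl-specific):

```python
def conv_out(size: int, kernel: int, stride: int, pad: int) -> int:
    """Standard conv/pool output-size formula: (size + 2*pad - kernel) // stride + 1."""
    return (size + 2 * pad - kernel) // stride + 1

h = conv_out(32, kernel=3, stride=1, pad=1)   # 3x3 conv, pad 1: 32 -> 32
h = conv_out(h, kernel=2, stride=2, pad=0)    # 2x2 max-pool: 32 -> 16
flat_features = 16 * h * h                    # channels * H * W = 16 * 16 * 16 = 4096
```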
## Autograd (explicit Tape)

```python
import numpy as np

from netcl import autograd as ag
from netcl.core.device import manager
from netcl.core.tensor import Tensor

queue = manager.default(device="gpu").queue

x = Tensor.from_host(queue, np.random.randn(8, 4).astype(np.float32))
w = Tensor.from_host(queue, np.random.randn(4, 3).astype(np.float32))
x_node = ag.tensor(x, requires_grad=True)
w_node = ag.tensor(w, requires_grad=True)

# y = relu(x @ w)
logits = ag.relu(ag.matmul_op(x_node, w_node))
target = ag.tensor(Tensor.from_host(queue, np.zeros((8, 3), dtype=np.float32)))
loss = ag.mse_loss(logits, target)

# backward
ag.Tape().backward(loss)
```
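The backward pass the tape computes for this graph can be written out by hand. A plain-NumPy sketch, assuming `mse_loss` means the mean over all elements, verified against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
w = rng.standard_normal((4, 3))
t = np.zeros((8, 3))

def forward(weights):
    """loss = mse(relu(x @ weights), t), mean over all elements (assumption)."""
    return ((np.maximum(x @ weights, 0.0) - t) ** 2).mean()

pre = x @ w                          # pre-activation
y = np.maximum(pre, 0.0)             # relu
grad_y = 2.0 * (y - t) / y.size      # d(mse)/dy
grad_pre = grad_y * (pre > 0)        # relu backward: mask negative pre-activations
grad_w = x.T @ grad_pre              # matmul backward w.r.t. w

# Sanity-check one entry of grad_w against a central finite difference.
eps = 1e-6
w_plus, w_minus = w.copy(), w.copy()
w_plus[0, 0] += eps
w_minus[0, 0] -= eps
numeric = (forward(w_plus) - forward(w_minus)) / (2 * eps)
```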
## DataLoader + CPU Transforms

```python
import numpy as np

from netcl.core.device import manager
from netcl.data.dataloader import DataLoader
from netcl.data.filters import to_float, normalize

queue = manager.default(device="gpu").queue

x = np.random.randint(0, 255, size=(128, 3, 32, 32), dtype=np.uint8)
y = np.random.randint(0, 10, size=(128,))

transforms = [
    to_float(scale=255.0),
    normalize(mean=(0.5, 0.5, 0.5), std=(0.25, 0.25, 0.25)),
]
loader = DataLoader(list(zip(x, y)), batch_size=32, shuffle=True, transforms=transforms, device_queue=queue)
```
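In NumPy terms, the two transforms above amount to the following per-sample computation (a sketch; the exact `to_float`/`normalize` semantics, such as per-channel broadcasting over CHW images, are assumptions here):

```python
import numpy as np

def to_float(img: np.ndarray, scale: float = 255.0) -> np.ndarray:
    """uint8 image -> float32 in [0, 1]."""
    return img.astype(np.float32) / scale

def normalize(img: np.ndarray, mean, std) -> np.ndarray:
    """Per-channel standardization for a CHW image."""
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    return (img - mean) / std

img = np.full((3, 32, 32), 255, dtype=np.uint8)
out = normalize(to_float(img), mean=(0.5, 0.5, 0.5), std=(0.25, 0.25, 0.25))
# A fully white pixel maps to (1.0 - 0.5) / 0.25 = 2.0
```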
## Save & Load (Sequential)

Serialization supports `Sequential` models composed of: Conv2d, Linear, ReLU, LeakyReLU, Sigmoid, Tanh, Dropout, MaxPool2d, Flatten.

```python
from netcl.io.serialization import save_model, load_model

save_model(model, "model_artifact")
model2 = load_model("model_artifact")
```
## Key Features

- **Autograd**: tape-based, with backward support for matmul, conv2d, pooling, elementwise ops, norms, and common losses.
- **Layers**: `Linear`, `Conv2d`, `BatchNorm2d`, `Sequential`, plus `build_sequential` configs.
- **Optimizers**: `SGD`, `Momentum`, `Adam`, `AdamW`, `RMSProp`, schedulers, and `AMPGradScaler`.
- **Data**: `DataLoader` with prefetch, async device transfer, and CPU transforms.
- **Ops**: matmul, conv2d, depthwise/transposed conv, softmax, reductions, padding.
- **Serialization**: `save_model`/`load_model` for `Sequential` models.
## Notes

- Mixed precision is experimental; it is disabled on the CPU backend.
- Conv2d algorithms can be tuned via environment flags such as `NETCL_CONV_AUTOTUNE=1`.
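One way to set such a flag from Python rather than the shell (when exactly netcl reads it is an assumption; setting it before the framework builds any conv kernels is the safe choice):

```python
import os

# Set the flag before importing/initializing netcl so kernel selection
# can see it (assumption: it is read at kernel-build time).
os.environ["NETCL_CONV_AUTOTUNE"] = "1"
```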
## Download files
## File details

Details for the file `netcl-0.1.2.tar.gz`.

### File metadata

- Download URL: netcl-0.1.2.tar.gz
- Upload date:
- Size: 108.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `ab12b6e918eaab0d2d185a98b4e2133f13003adee72ff0ea89c80e7732dce206` |
| MD5 | `5d13479e814bc9dcf00b9f48f94da37c` |
| BLAKE2b-256 | `81f68f2c3885fa3f6dba64ee3b9ff08c3509645af64c6f6c4c61fd7983934e3e` |
## File details

Details for the file `netcl-0.1.2-py3-none-any.whl`.

### File metadata

- Download URL: netcl-0.1.2-py3-none-any.whl
- Upload date:
- Size: 140.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3b637560a5b3e6ea6b0073386b8b7f24452f75a177d7bb49468544d804af3049` |
| MD5 | `de8d24cdecb2ac3f56c60b3eac1d5ad3` |
| BLAKE2b-256 | `6c05702ee6dd40db3734fb7a746c9df7b4267bf765a9f34503a0311329aa702d` |