You like pytorch? You like micrograd? You love tinygrad! ❤️

For something in between a pytorch and a karpathy/micrograd

This may not be the best deep learning framework, but it is a deep learning framework.

The sub-1000-line core of it is in tinygrad/

Due to its extreme simplicity, it aims to be the easiest framework to add new accelerators to, with support for both inference and training. Support the simple basic ops, and you get SOTA vision (models/efficientnet.py) and language (models/transformer.py) models.

We are working on support for the Apple Neural Engine and the Google TPU in the accel/ folder. Eventually, we will build custom hardware for tinygrad, and it will be blindingly fast. Now, it is slow.

Installation

git clone https://github.com/geohot/tinygrad.git
cd tinygrad
python3 setup.py develop

Contributing

There's a lot of interest in tinygrad lately. Here are some guidelines for contributing:

  • Bugfixes are the best and always welcome! Like this one.
  • If you don't understand the code you are changing, don't change it!
  • All code golf PRs will be closed, but conceptual cleanups are great.
  • Features are welcome, though if you are adding a feature, you need to include tests.
  • Improving test coverage is great, with reliable, non-brittle tests.

Example

from tinygrad.tensor import Tensor

x = Tensor.eye(3, requires_grad=True)
y = Tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad)  # dz/dx
print(y.grad)  # dz/dy

Same example in torch

import torch

x = torch.eye(3, requires_grad=True)
y = torch.tensor([[2.0,0,-2.0]], requires_grad=True)
z = y.matmul(x).sum()
z.backward()

print(x.grad)  # dz/dx
print(y.grad)  # dz/dy
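
If you want to check what both snippets should print, the gradients are easy to work out by hand. Since z sums every entry of y·x, dz/dy[0][k] is the sum of row k of x, and dz/dx[k][j] is just y[0][k]. Here is a dependency-free sketch of that arithmetic; the helper names are illustrative and not part of tinygrad or torch:

```python
# Manual gradients for z = (y @ x).sum(), using plain Python lists.
x = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]  # 3x3 identity
y = [[2.0, 0.0, -2.0]]                                               # 1x3 row

def matmul(a, b):
    """Naive matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

z = sum(sum(row) for row in matmul(y, x))

# dz/dy[0][k] = sum_j x[k][j]   and   dz/dx[k][j] = y[0][k]
dz_dy = [[sum(x[k]) for k in range(3)]]
dz_dx = [[y[0][k] for _ in range(3)] for k in range(3)]

print(z)      # 0.0
print(dz_dy)  # [[1.0, 1.0, 1.0]]
print(dz_dx)  # [[2.0, 2.0, 2.0], [0.0, 0.0, 0.0], [-2.0, -2.0, -2.0]]
```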

Neural networks?

It turns out, a decent autograd tensor library is 90% of what you need for neural networks. Add an optimizer (SGD, RMSprop, and Adam implemented) from tinygrad.nn.optim, write some boilerplate minibatching code, and you have all you need.

Neural network example (from test/test_mnist.py)

from tinygrad.tensor import Tensor
import tinygrad.nn.optim as optim

class TinyBobNet:
  def __init__(self):
    self.l1 = Tensor.uniform(784, 128)
    self.l2 = Tensor.uniform(128, 10)

  def forward(self, x):
    return x.dot(self.l1).relu().dot(self.l2).logsoftmax()

model = TinyBobNet()
# named "optimizer" so it does not shadow the imported optim module
optimizer = optim.SGD([model.l1, model.l2], lr=0.001)

# ... and complete like pytorch, with (x,y) data

out = model.forward(x)
loss = out.mul(y).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
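
The snippet leaves the (x,y) data to you. One reading of loss = out.mul(y).mean() is that y is not a plain one-hot matrix: if each row holds -num_classes at the true class and 0 elsewhere, then multiplying by the log-probabilities and taking the mean over every entry gives the usual negative log-likelihood. The sketch below assumes that encoding (it is an assumption, and sample_batch, encode_targets, and mul_mean are hypothetical helpers, not tinygrad functions):

```python
import math
import random

NUM_CLASSES = 10

def sample_batch(num_examples, batch_size, rng):
    """Pick random example indices for one minibatch."""
    return [rng.randrange(num_examples) for _ in range(batch_size)]

def encode_targets(labels, num_classes=NUM_CLASSES):
    """One row per label: -num_classes at the true class, 0 elsewhere (assumed encoding)."""
    return [[-float(num_classes) if c == label else 0.0 for c in range(num_classes)]
            for label in labels]

def mul_mean(out, y):
    """Elementwise product followed by a mean over every entry."""
    flat = [o * t for row_o, row_t in zip(out, y) for o, t in zip(row_o, row_t)]
    return sum(flat) / len(flat)

rng = random.Random(0)
all_labels = [3, 7, 1, 0, 3]                      # toy dataset labels
batch = sample_batch(len(all_labels), 2, rng)     # indices of one minibatch
labels = [all_labels[i] for i in batch]

# Stand-in log-probabilities: a uniform prediction over 10 classes.
out = [[math.log(1.0 / NUM_CLASSES)] * NUM_CLASSES for _ in labels]
loss = mul_mean(out, encode_targets(labels))
print(loss)  # close to log(10), the NLL of a uniform guess
```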

GPU and Accelerator Support

tinygrad supports GPUs through PyOpenCL.

from tinygrad.tensor import Tensor
(Tensor.ones(4,4).gpu() + Tensor.ones(4,4).gpu()).cpu()

ANE Support?! (broken)

If all you want to do is ReLU, you are in luck! You can do very fast ReLU (at least 30 MEGAReLUs/sec confirmed).

Requires your Python to be signed with ane/lib/sign_python.sh to add the com.apple.ane.iokit-user-access entitlement, which also requires sudo nvram boot-args="amfi_get_out_of_my_way=1 ipc_control_port_options=0". Build the library with ane/lib/build.sh

In order to set boot-args and for the AMFI kext to respect that arg, run csrutil enable --without-kext --without-nvram in recovery mode.

from tinygrad.tensor import Tensor

a = Tensor([-2,-1,0,1,2]).ane()
b = a.relu()
print(b.cpu())

Warning: do not rely on the ANE port. It segfaults sometimes. So if you were doing something important with tinygrad and wanted to use the ANE, you might have a bad time.

hlops (in tensor.py)

hlops are syntactic sugar around mlops. They support most things torch does.
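
As an illustration of "syntactic sugar", here is how something like logsoftmax could be composed from Max, Sub, Exp, Sum, and Log style primitives. This is a sketch on a single row using plain Python lists; the op_* helpers are stand-ins invented for the example, and this is not necessarily how tensor.py writes it:

```python
import math

# Stand-ins for primitive ops, acting on one row stored as a flat list.
def op_max(v):
    return max(v)

def op_sum(v):
    return sum(v)

def op_exp(v):
    return [math.exp(a) for a in v]

def op_sub_scalar(v, s):
    # Sub after "expanding" the scalar s to the shape of v
    return [a - s for a in v]

def logsoftmax(v):
    """logsoftmax(v) = (v - m) - log(sum(exp(v - m))), with m = max(v) for stability."""
    shifted = op_sub_scalar(v, op_max(v))
    log_norm = math.log(op_sum(op_exp(shifted)))
    return op_sub_scalar(shifted, log_norm)

row = [1.0, 2.0, 3.0]
log_probs = logsoftmax(row)
print(sum(math.exp(p) for p in log_probs))  # approximately 1.0
```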

mlops

mlops are mid-level ops; there are 15 of them. They understand memory allocation and derivatives.

Relu, Log, Exp                          # unary ops
Sum, Max                                # reduce ops (with axis argument)
Add, Sub, Mul, Pow                      # binary ops (no broadcasting, use expand)
Reshape, Permute, Slice, Expand, Flip   # movement ops
Conv2D(NCHW)                            # processing op (Matmul is also Conv2D)

You no longer need to write mlops for a new accelerator.
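
The "Matmul is also Conv2D" comment can look odd at first. One way to see it: an (N x K) @ (K x M) product is a 1x1 convolution over N "images" with K channels, using M filters of shape K x 1 x 1. The following naive pure-Python sketch checks that; conv2d and matmul_via_conv are written for this example and are not tinygrad code:

```python
def conv2d(x, w):
    """Naive NCHW convolution, stride 1, no padding.
    x: [N][Cin][H][W], w: [Cout][Cin][KH][KW]."""
    n, cin, h, wd = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    cout, kh, kw = len(w), len(w[0][0]), len(w[0][0][0])
    out = []
    for b in range(n):
        out.append([[[sum(x[b][c][i + di][j + dj] * w[o][c][di][dj]
                          for c in range(cin)
                          for di in range(kh)
                          for dj in range(kw))
                      for j in range(wd - kw + 1)]
                     for i in range(h - kh + 1)]
                    for o in range(cout)])
    return out

def matmul_via_conv(a, b):
    """Rows of a become N images with K channels of size 1x1;
    columns of b become M filters of shape K x 1 x 1."""
    x = [[[[a[i][k]]] for k in range(len(b))] for i in range(len(a))]
    w = [[[[b[k][j]]] for k in range(len(b))] for j in range(len(b[0]))]
    y = conv2d(x, w)  # shape [N][M][1][1]
    return [[y[i][j][0][0] for j in range(len(b[0]))] for i in range(len(a))]

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_via_conv(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```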

Adding an accelerator (llops)

The autodiff stuff is all in mlops now, so you can focus on the raw operations.

Buffer                                                     # class of memory on this device
unary_op  (RELU, EXP, LOG, NEG, SIGN)                      # A -> A
reduce_op (SUM, MAX)                                       # A -> B (smaller size, B has 1 in shape)
binary_op (ADD, SUB, MUL, DIV, POW, CMPEQ)                 # A + B -> C (all the same size)
movement_op (RESHAPE, PERMUTE, PAD, SHRINK, EXPAND, FLIP)  # A -> B (different size)
processing_op (CONV)                                       # A + B -> C

When tinygrad moves to lazy evaluation, optimizations will happen here.
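
To give a feel for how small such a backend can be, here is a toy "device" whose memory is a Python list. Treat it as a sketch under assumptions: the ListBuffer class, the enum definitions, and the method signatures are invented for this example and may not match tinygrad's actual interface, and the movement and processing ops are left out.

```python
import math
import operator
from enum import Enum

UnaryOps = Enum("UnaryOps", "RELU EXP LOG NEG SIGN")
BinaryOps = Enum("BinaryOps", "ADD SUB MUL DIV POW CMPEQ")
ReduceOps = Enum("ReduceOps", "SUM MAX")

UNARY = {
    UnaryOps.RELU: lambda a: max(a, 0.0),
    UnaryOps.EXP: math.exp,
    UnaryOps.LOG: math.log,
    UnaryOps.NEG: operator.neg,
    UnaryOps.SIGN: lambda a: float((a > 0) - (a < 0)),
}
BINARY = {
    BinaryOps.ADD: operator.add,
    BinaryOps.SUB: operator.sub,
    BinaryOps.MUL: operator.mul,
    BinaryOps.DIV: operator.truediv,
    BinaryOps.POW: operator.pow,
    BinaryOps.CMPEQ: lambda a, b: float(a == b),
}
REDUCE = {ReduceOps.SUM: sum, ReduceOps.MAX: max}

class ListBuffer:
    """Hypothetical buffer class: 'device memory' is a flat list of floats."""
    def __init__(self, data):
        self.data = list(data)

    def unary_op(self, op):                      # A -> A
        return ListBuffer(UNARY[op](a) for a in self.data)

    def binary_op(self, op, other):              # A + B -> C (all the same size)
        assert len(self.data) == len(other.data), "no broadcasting: expand first"
        return ListBuffer(BINARY[op](a, b) for a, b in zip(self.data, other.data))

    def reduce_op(self, op):                     # A -> B (here: full reduction to 1 element)
        return ListBuffer([REDUCE[op](self.data)])

a = ListBuffer([-2.0, -1.0, 0.0, 1.0, 2.0])
relu = a.unary_op(UnaryOps.RELU)
total = a.binary_op(BinaryOps.ADD, relu).reduce_op(ReduceOps.SUM)
print(relu.data)   # [0.0, 0.0, 0.0, 1.0, 2.0]
print(total.data)  # [3.0]
```

Because the derivatives live one level up, nothing in a backend like this needs to know about gradients.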

ImageNet inference

Despite being tiny, tinygrad supports the full EfficientNet. Pass in a picture to discover what it is.

ipython3 examples/efficientnet.py https://media.istockphoto.com/photos/hen-picture-id831791190

Or, if you have a webcam and cv2 installed

ipython3 examples/efficientnet.py webcam

PROTIP: Set "GPU=1" environment variable if you want this to go faster.

PROPROTIP: Set "DEBUG=1" environment variable if you want to see why it's slow.

tinygrad supports Stable Diffusion!

Run TORCH=1 python3 examples/stable_diffusion.py

(or without torch: OPT=2 OPENCL=1 python3 examples/stable_diffusion.py)

"a horse sized cat eating a bagel"

tinygrad supports GANs

See examples/mnist_gan.py

tinygrad supports yolo

See examples/yolov3.py

The promise of small

tinygrad will always be below 1000 lines. If it isn't, we will revert commits until tinygrad becomes smaller.

Drawing Execution Graph

  • Nodes are Tensors
  • Black edge is a forward pass
  • Blue edge is a backward pass
  • Red edge is data the backward pass depends on
  • Purple edge is intermediates created in the forward

GRAPH=1 python3 test/test_mnist.py TestMNIST.test_sgd_onestep
# requires dot, outputs /tmp/net.svg

Running tests

python3 -m pytest
