Slope

Slope is a small automatic differentiation (AD) engine focused on machine learning (ML). Tensor semantics are similar to PyTorch, the functional API is similar to JAX, and the tensor operator code is heavily derived from tinygrad.

Example:

import slope

def f(x):
    y = x * 2.0
    return y.sum()

x = slope.tensor([1.,2.,3.])
gf_x = slope.grad(f)(x)
print(f"{gf_x=}")
gf_x=<Tensor: val=
[2. 2. 2.]
shape=(3,), dtype=float32, device='cpu:0'>

Install

git clone https://github.com/radenmuaz/slope
cd slope
pip install -e .

Or you can just copy src/slope to your projects.
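
As a quick sanity check that the install works (optional; this only uses the tensor API from the example above):

python -c "import slope; print(slope.tensor([1., 2., 3.]) * 2.0)"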

Features

  1. Forward-mode, reverse-mode, and higher-order AD.

  2. Just-in-time compilation, with interchangeable backends, supporting CPU, CUDA and Metal:

    • ONNX Runtime (ONNX graph)
    • IREE (StableHLO MLIR); this is the default backend
    • NumPy (Python code)
  3. Training and inference examples in the examples/ folder.

  4. Small (?)

    • Under 3000 lines of core code in slope/core.py (counted after formatting with black src --line-length 140)
  5. Operators and procedures system

    • 33 core operators defined in slope/operators.py
      • Unary: exp log sin sqrt invert cast stop_gradient
      • Binary: add mul sub div pow equal less greater maximum
      • Reduce: sum max
      • Shape: reshape expand permute slice pad flip cat
      • Init: full arange random_normal random_uniform
      • GeneralReduce: matmul conv gather_nd scatter_nd
    • Composite operator system with "procedures" in slope/procedures.py
      • Procedures are functions composed of calls to core operators, exposed with Tensor.procedure_name(*args) syntax.
      • Useful for definitions like:
        • x.cos(), where def cos(x): return (math.pi/2 - x).sin()
        • x.conv_transpose(w), where def conv_transpose(x, w): ... is a very long function.
    • An operator can be implemented directly as a code translation to the backend, or fall back to a procedure; e.g. there is a conv procedure in case a backend has no implementation for it. A minimal sketch of a procedure is shown after this list.
  6. Extensible
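
For illustration, here is a minimal sketch of a procedure built only from core operators, in the spirit of the cos definition above (how procedures get registered onto Tensor is not shown, and the exact mechanism in slope/procedures.py may differ; softplus is just an illustrative composite, not necessarily one of slope's procedures):

import math
import slope

def cos(x):
    # cos(x) = sin(pi/2 - x), so only the core sin operator is needed
    return (math.pi / 2 - x).sin()

def softplus(x):
    # log(1 + exp(x)), built from the core exp and log operators
    return (x.exp() + 1.0).log()

x = slope.tensor([0., 1., 2.])
print(cos(x))
print(softplus(x))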

Quickstart

There are many examples in the examples/ folder.

We start by running MNIST classifier training: examples/nn/mnist_mlp.py

python examples/nn/mnist_mlp.py
Starting training...
...
Train epoch: 2, batch: 299/300, loss: 12.51: 100%|██████████████████████████████████████████████████████████████████████| 300/300 [00:02<00:00, 139.36it/s]
Epoch 2 in 2.15 sec
Test set accuracy 0.97

By setting the SLOPE_BACKEND environment variable, we can change the backend to iree (the default), onnxruntime, or numpy. We can also set LOG_JIT=1 to print the generated backend code.

LOG_JIT=1 SLOPE_BACKEND=onnxruntime python examples/nn/mnist_mlp.py
...

---- train_step codegen:

<ir_version: 7, opset_import: ["" : 18, "slope":1]>
main (float[100, 784] x0, float[10, 100] x1, float[200, 28, 28] x2, float[200, 10] x3, int32[] x4, float[100, 784] x5, float[10, 100] x6, float[] x7) => (float[] y0, float[100, 784] y2, float[10, 100] y4, int32[] y5, float[100, 784] y1, float[10, 100] y3, float[] x7)
{
    
    z0_shape = Constant <value = int64[2] { 200, 784 } >()
    z0 = Reshape(x2, z0_shape)
...
    y4 = Sub(x1, z164)
    z165_fill_value = Constant < value = int32[1] { 1 }>()
    z165_squeeze_dim = Constant <value = int64[1] {0}> ()
    z165 = Squeeze (z165_fill_value, z165_squeeze_dim)
    
    y5 = Add(x4, z165)
}
...

===============

Train epoch: 0, batch: 58/300, loss: 71.23:  20%|██████████████▎                                                          | 59/300 [00:01<00:04, 55.45it/s]

Environment flags

Prepend these to the command to set them:

# prints the jitted code
LOG_JIT=1 

# set device
DEFAULT_DEVICE=cpu
DEFAULT_DEVICE=cuda
DEFAULT_DEVICE=metal

# set backend
SLOPE_BACKEND=iree # iree backend (default)
SLOPE_BACKEND=onnxruntime # onnxruntime backend
SLOPE_BACKEND=numpy # numpy backend (extremely SLOW)

Slope internals tutorial

Slope has familiar PyTorch-like syntax

Most of the things you know from PyTorch work in Slope, probably.

import slope
x = slope.ones(2, 5)
w = slope.arange(15, dtype=slope.float32).reshape(5,3)
b = slope.tensor([1., 2., 3.], dtype=slope.float32)
y = x @ w + b
print(y)

Every operation is compiled with slope.jit

Operation calls are jitted eagerly as individual programs. Try running this in a terminal:

LOG_JIT=1 python -c "import slope; print(slope.ones(3)*2)"
...
---- full_shape__lp__rp__fill_value_2_dt_0_dtype_<DType:float32>_device_<Device:'cpu:0'>_ codegen:

func.func @main () -> (tensor<f32>)
{
    %y0 = "stablehlo.constant"() { value = dense<2.0> : tensor<f32> } : () -> (tensor<f32>)
    "func.return"(%y0): (tensor<f32>) -> ()
}
... # plenty of outputs
....

To avoid this eager per-operation jitting, write the code in a function and wrap it with slope.jit, then call the function:

@slope.jit
def f(x):
    y = x * x
    return y
# Alternative way to jit
# f = slope.jit(f)

x = slope.full((1,), 2.)
y = f(x)
print(y)

To see the actual code:

jit_object = f.lower(x)
# slope Program intermediate representation
print(jit_object.program)
# backend code
print(jit_object.code)
def f(x0): # [1, f32] -> [1, f32]
    y0 = slope.mul(x0, x0) # ([1, f32], [1, f32]) -> [1, f32]
    return y0
func.func @main (%x0: tensor<1xf32>) -> (tensor<1xf32>)
{
    %y0 = "stablehlo.multiply"(%x0, %x0) : (tensor<1xf32>,tensor<1xf32>) -> (tensor<1xf32>)
    "func.return"(%y0): (tensor<1xf32>) -> ()
}

Derivatives and gradients

Slope has several AD functions, such as slope.jvp, slope.vjp, and slope.grad.
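
For forward mode and higher-order derivatives, here is a minimal sketch (assuming slope.jvp mirrors the jax.jvp signature and that grad calls can be nested; slope.vjp follows the same pattern as jax.vjp):

import slope

def f(x):
    return (x * x * x).sum()

x = slope.tensor([1., 2., 3.])
v = slope.ones(3)

# forward mode: push the tangent v through f
y, y_dot = slope.jvp(f, (x,), (v,))  # y_dot = sum(3 * x**2 * v)

# reverse mode
gf_x = slope.grad(f)(x)  # 3 * x**2

# higher-order AD by composing transforms
ggf_x = slope.grad(lambda x: slope.grad(f)(x).sum())(x)  # 6 * x

print(y_dot, gf_x, ggf_x)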

To do the usual backprop things:

def f(x, w):
    y = x @ w
    return y
def loss_fn(x, w, y):
    y_hat = f(x,w)
    return ((y_hat - y)**2).sum()
gloss_fn = slope.value_and_grad(loss_fn, argnums=(1,))

@slope.jit
def train_step(x, w, y, lr):
    loss, gw = gloss_fn(x, w, y)
    w = w - lr * gw
    return loss, w

N = 50
x = slope.randn(N, 2)
y = slope.randn(N, 1)
w = slope.randn(2, 1)
lr = slope.tensor([0.001])
for i in range(10):
    loss, w = train_step(x, w, y, lr)
    print(i, loss.numpy())
0 102.412125
1 88.60157
2 78.322815
3 70.644066
4 64.883766
5 60.54294
6 57.25553
7 54.75257
8 52.83598
9 51.359528

Contributing

Fork this repo and hack on it, and maybe open a PR; too many things need to be done (see Roadmap). Everything is still flaky and I am experimenting and making many API changes, so maybe later I will open a new GitHub repo.

Roadmap

  • Docs
  • Symbolic shape inference
  • Dynamic shape jit
  • Optimizer filter frozen params
  • vmap vjp and jvp to compute jacobian and hessian
  • iree backend currently has fixed seed random, implement threefry and JAX-like random
  • make things fast
  • llama (gpt) training
  • whisper inference
  • core tests, operators tests on all Trace types
