gustavgrad

[Badges: Tests | codecov | PyPI | Python Version | Code style: black]

An autograd library built on NumPy, inspired by Joel Grus's livecoding.

Installation

With pip

pip install gustavgrad

How to use the library

Tensor operations

The Tensor class is the cornerstone of gustavgrad. It behaves much like an ordinary numpy.ndarray.

Tensors can be added together,

>>> from gustavgrad import Tensor
>>> x = Tensor([1, 2, 3])
>>> x + x
Tensor(data=[2 4 6], requires_grad=False)

... subtracted from each other,

>>> x - x
Tensor(data=[0 0 0], requires_grad=False)

... they can be multiplied element-wise,

>>> x * x
Tensor(data=[1 4 9], requires_grad=False)

... and their dot product can be calculated with the matrix multiplication operator @.

>>> y = Tensor([[1], [2], [3]])
>>> x @ y
Tensor(data=[14], requires_grad=False)

Tensor operations also support broadcasting:

>>> x * 3
Tensor(data=[3 6 9], requires_grad=False)
>>> z = Tensor([[1, 2, 3], [4, 5, 6]])
>>> x * z
Tensor(data=
[[ 1  4  9]
 [ 4 10 18]], requires_grad=False)

Automatic backpropagation

But a Tensor is not just an ndarray; it also keeps track of its own gradient.

>>> speed = Tensor(1, requires_grad=True)
>>> time = Tensor(10, requires_grad=True)
>>> distance = speed * time
>>> distance
Tensor(data=10, requires_grad=True)

If a Tensor is created as the result of a Tensor operation involving a Tensor with requires_grad=True, the resulting Tensor will be able to backpropagate its own gradient to its ancestors.

>>> distance.backward()
>>> speed.grad
array(10.)
>>> time.grad
array(1.)

By calling the backward method on distance, the gradients of speed and time are updated automatically. We can see that increasing speed by 1 would increase distance by 10, while increasing time by 1 would only increase distance by 1.
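
Since distance = speed * time, the analytic derivatives are d(distance)/d(speed) = time = 10 and d(distance)/d(time) = speed = 1, which is exactly what backward computed. As a quick added check with plain NumPy (a sketch that only assumes .grad is an ordinary ndarray, as shown above):

>>> import numpy as np
>>> np.allclose(speed.grad, 10.0) and np.allclose(time.grad, 1.0)
True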

The Tensor class supports backpropagation over arbitrary compositions of Tensor operations.

>>> t1 = Tensor([[1, 2, 3], [4, 5, 6]], requires_grad=True)
>>> t2 = Tensor([[1], [2], [3]])
>>> t3 = t1 @ t2 + 1
>>> t4 = t3 * 7
>>> t5 = t4.sum()
>>> t5.backward()
>>> t1.grad
array([[ 7., 14., 21.],
       [ 7., 14., 21.]])
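
This matches the gradient computed by hand: t5 = (7 * (t1 @ t2 + 1)).sum(), so the gradient with respect to t1 is 7 times t2 transposed, broadcast over both rows of t1. A quick added check with plain NumPy (again only assuming .grad is an ordinary ndarray):

>>> import numpy as np
>>> np.allclose(t1.grad, 7 * np.array([[1, 2, 3], [1, 2, 3]]))
True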

The Module API

gustavgrad provides some tools to simplify setting up and training machine learning models. The Module API makes it easier to manage multiple related Tensors by registering them as Parameters. A Parameter is just a randomly initialized Tensor.
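
Conceptually, a Parameter behaves roughly like the hand-rolled Tensor below. This is only an illustrative sketch, not the library's actual implementation, and it assumes Tensor accepts a NumPy array just as it accepts lists and scalars in the examples above.

import numpy as np

from gustavgrad import Tensor

# Roughly what Parameter(2, 3) provides: a randomly initialized
# 2x3 Tensor that tracks its own gradient.
weights = Tensor(np.random.randn(2, 3), requires_grad=True)

With that in mind, here is a small multilayer perceptron built from Parameters: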

from gustavgrad import Tensor
from gustavgrad.function import tanh
from gustavgrad.module import Module, Parameter


class MultilayerPerceptron(Module):
    def __init__(self, input_size: int, output_size: int, hidden_size: int = 100) -> None:
        # The weights and biases are registered as Parameters of the Module
        self.layer1 = Parameter(input_size, hidden_size)
        self.bias1 = Parameter(hidden_size)
        self.layer2 = Parameter(hidden_size, output_size)
        self.bias2 = Parameter(output_size)

    def predict(self, x: Tensor) -> Tensor:
        # Two fully connected layers with a tanh activation in between
        x = x @ self.layer1 + self.bias1
        x = tanh(x)
        x = x @ self.layer2 + self.bias2
        return x

By subclassing Module, our MultilayerPerceptron class automatically gets some helper methods for managing its Parameters. Let's create a MultilayerPerceptron that tries to learn the XOR function:

xor_input = Tensor([[0, 0], [0, 1], [1, 0], [1, 1]])
xor_targets = Tensor([[0], [1], [1], [0]])
xor_mlp = MultilayerPerceptron(input_size=2, output_size=1, hidden_size=4)

We can use the model to make predictions on the xor_input Tensor.

>>> predictions = xor_mlp.predict(xor_input)
>>> predictions
Tensor(data=
[[-1.79888385]
 [-1.07965756]
 [ 0.34373135]
 [ 1.63366069]], requires_grad=True)

The predictions of the randomly initialized model aren't right, but we can improve the model by calculating the gradient of a loss function with respect to its Parameters.

from gustavgrad.loss import SquaredErrorLoss
se_loss = SquaredErrorLoss()
loss = se_loss.loss(xor_targets, predictions)
loss.backward()

loss is a Tensor, so we can call its backward method to do backpropagation through our xor_mlp. We can then adjust the weights of all Parameters in xor_mlp using gradient descent:

from gustavgrad.optim import SGD
optim = SGD(lr=0.01)
optim.step(xor_mlp)
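
Conceptually, the update that step applies to each Parameter is plain gradient descent. Below is a minimal sketch of that update rule, for illustration only rather than gustavgrad's actual code; it assumes each parameter exposes .data and .grad ndarrays, as the Tensor repr and the .grad examples above suggest.

def sgd_step(parameters, lr: float = 0.01) -> None:
    # Plain gradient descent: move each parameter a small step
    # against its gradient.
    for param in parameters:
        param.data = param.data - lr * param.grad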

After updating the weights, we can reset the gradients of all Parameters and make new predictions:

>>> xor_mlp.zero_grad()
>>> predictions = xor_mlp.predict(xor_input)
>>> predictions
Tensor(data=
[[-1.51682686]
 [-0.78583272]
 [ 0.55994602]
 [ 1.67962174]], requires_grad=True)
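
Training simply means repeating these steps until the predictions are close to the targets. Below is a sketch of a full training loop built only from the calls shown above; the choice of 1000 iterations is arbitrary, and the exact numbers will vary with the random initialization.

for _ in range(1000):
    # Reset accumulated gradients, make predictions, compute the loss,
    # backpropagate, and take one gradient-descent step.
    xor_mlp.zero_grad()
    predictions = xor_mlp.predict(xor_input)
    loss = se_loss.loss(xor_targets, predictions)
    loss.backward()
    optim.step(xor_mlp)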

See examples/xor.py for a complete script that learns the XOR function. The examples directory also contains a few other basic examples of how to use the library.
