
micrograd2023 was developed based on Andrej Karpathy's micrograd, with added documentation using nbdev for teaching purposes


micrograd2023

“What I cannot create, I do not understand.”

– Richard Feynman (1918-1988)

Citing this work


Do, H. P. (2026). micrograd2023: Automatic Differentiation Package for Education (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.18526836

What is micrograd2023?

I developed micrograd2023 based on Andrej Karpathy’s micrograd implementation, with added functionality for teaching purposes. It is built with Jeremy Howard’s and Hamel Husain’s nbdev, which enables package development, testing, documentation, and dissemination all in one place.

Literate Programming

flowchart LR
  A(Andrej's micrograd) --> C((Combination))
  B(Jeremy's nbdev) --> C
  C -->|Literate Programming| D(micrograd2023)

Overview of micrograd2023

  • A detailed demonstration of using micrograd2023 to train an MLP can be found in this MLP DEMO.

  • A demonstration of micrograd2023 for physics, auto-differentiating the popular cosine function, can be found in this Physics Cosine DEMO.

    • The micrograd2023 results are compared with the analytical solutions, PyTorch’s autograd, and JAX’s autograd.
    • Additionally, second-order derivatives are calculated using JAX’s autograd.
    • In general, JAX’s autograd can be used to calculate higher-order derivatives (see the JAX sketch after this list).
  • A demonstration of micrograd2023 for physics, auto-differentiating a popular exponential decay function, can be found in this Physics Exp. DEMO.

  • A demonstration of micrograd2023 for physics, auto-differentiating a damping function, can be found in this Physics Damp DEMO.

  • A demonstration of micrograd2023 for MRI, auto-differentiating a T2* decay model of data acquired with a multi-echo UTE sequence. The computed derivatives are then used to calculate the Fisher Information Matrix (FIM), which in turn allows calculation of the Cramér-Rao Lower Bound (CRLB) of an unbiased estimator of T2* (a sketch of this FIM/CRLB calculation follows this list). Details can be seen in the MRI T2* Decay DEMO.

  • A demonstration of micrograd2023 for MRI, auto-differentiating a T1 recovery model of data acquired with a myocardial MOLLI T1 mapping sequence. The computed derivatives are then used to calculate the Fisher Information Matrix (FIM), which in turn allows calculation of the Cramér-Rao Lower Bound (CRLB) of an unbiased estimator of T1. Details can be seen in the MRI T1 Recovery DEMO.
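As a concrete illustration of the higher-order differentiation mentioned above, here is a minimal JAX sketch; the function and sample point are illustrative assumptions, not code taken from the DEMO notebooks.

import jax
import jax.numpy as jnp

# f(x) = cos(x); analytically f'(x) = -sin(x) and f''(x) = -cos(x)
f = lambda x: jnp.cos(x)
df = jax.grad(f)     # first-order derivative via autograd
d2f = jax.grad(df)   # second-order derivative by differentiating the gradient again

x0 = 0.7  # arbitrary sample point
print(df(x0), -jnp.sin(x0))    # the two values should agree
print(d2f(x0), -jnp.cos(x0))   # the two values should agree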
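Similarly, here is a minimal sketch of how auto-differentiated model derivatives can feed the FIM and CRLB calculation for a mono-exponential T2* decay model S(TE) = S0 * exp(-TE / T2*). The echo times, parameter values, noise level, and the use of Value division and the .grad attribute follow the original micrograd design that micrograd2023 builds on; they are assumptions, not the exact code of the MRI DEMOs.

import numpy as np
from micrograd2023.engine import Value

# assumed mono-exponential decay model: S(TE) = S0 * exp(-TE / T2s)
TEs = [0.5, 1.0, 2.0, 4.0, 8.0]           # illustrative echo times (ms)
S0_val, T2s_val, sigma = 100.0, 3.0, 1.0  # illustrative parameters and noise std

J = []  # Jacobian rows: [dS/dS0, dS/dT2s] at each echo time
for TE in TEs:
    S0, T2s = Value(S0_val), Value(T2s_val)
    S = S0 * (-(Value(TE) / T2s)).exp()   # forward model for one echo
    S.backward()                          # derivatives w.r.t. S0 and T2s
    J.append([S0.grad, T2s.grad])
J = np.array(J)

# Fisher Information Matrix for i.i.d. Gaussian noise, and CRLB = diag(FIM^-1)
FIM = J.T @ J / sigma**2
CRLB = np.diag(np.linalg.inv(FIM))
print("CRLB(S0), CRLB(T2*):", CRLB)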

Added features

Compared to Andrej’s micrograd, micrograd2023 has many extensions such as:

  • Adding more extensive unit and integration tests.

  • Adding more methods to the Value object, such as tanh(), exp(), and log(). In principle, any method/function with a known derivative, or that can be broken into primitive operations, can be added to the Value object. Examples are sin(), sigmoid(), cos(), etc., which I left as exercises 😄 (a sketch of one possible approach follows this list).

  • Refactoring Andrej’s demo code to make it easier to demonstrate many fundamental concepts and/or best engineering practices when training neural networks. The concepts/best practices are listed below. Some were demonstrated, while the rest are left as exercises 😄.

    • Always implement the simplest and most intuitive solution as a baseline to compare against whatever fancier implementation we want to achieve

    • Data preparation - train, validation, and test sets are disjoint

    • Over-fitting

    • Gradient Descent vs. Stochastic Gradient Descent (SGD)

    • Develop and experiment with different optimizers, e.g., SGD, SGD with momentum, RMSProp, Adam, etc.

    • SGD with momentum

    • Non-optimal learning rate

    • How to find the optimal learning rate

    • Learning rate decay and learning rate schedule

    • Role of nonlinearity

    • Linear separable and non-separable data

    • Out-of-distribution shift

    • Under-fitting

    • The importance and trade-off between width and depth of the MLP

    • Over-fitting a single batch

    • Hyperparameter tuning and optimizing

    • Weights initialization

    • Inspect and visualize statistics of weights, gradients, gradient-to-data ratios, and update-to-data ratios

    • Forward and backward dynamics of shallow and deep, linear and non-linear Neural Networks

    • etc.
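As one illustration of the exercises above, a sin() method could follow the same pattern as the existing tanh() and exp() methods: compute the forward value, then register a _backward closure that applies the chain rule. The Value constructor signature (data, children, op label) and the _backward/grad attributes below are assumed from Andrej’s original micrograd, which micrograd2023 is based on, so the details may need adapting.

import math
from micrograd2023.engine import Value

def sin(self):
    # forward pass: out = sin(self)
    out = Value(math.sin(self.data), (self,), 'sin')

    def _backward():
        # chain rule: d(sin(x))/dx = cos(x)
        self.grad += math.cos(self.data) * out.grad

    out._backward = _backward
    return out

# attach the hypothetical method to the Value class
Value.sin = sin

x = Value(0.5)
y = x.sin()
y.backward()
print(x.grad, math.cos(0.5))  # the two values should agree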

Related Projects

Below are a few of my projects related to optimization and Deep Learning:

  • In 2019, I led the clinical validation of Canon Medical Systems’s Deep Learning-based denoising reconstruction technology. It was the first fully-integrated Deep Learning-based reconstruction technology to receive FDA 510(k) clearance, in 2020, for use in a clinical environment. See more details in this DLR-MRI project.

  • Diploma Research on Crystal Structure using Gradient-based Optimization SLIDES

  • Deep Convolutional Neural Network (DCNN) for MRI image segmentation with uncertainty quantification and a controllable trade-off between False Positives and False Negatives. Journal Paper PDF; Conference Talk SLIDES; Conference Talk VIDEO

  • Deep Learning-based Denoising for quantitative MRI. Conference Talk SLIDES

How to install

Simply run the following command in your terminal to install micrograd2023:

pip install micrograd2023

Examples of using micrograd2023

Below are some examples of using micrograd2023.

Automatic Differentiation

# import necessary objects and functions
from micrograd2023.engine import Value
from micrograd2023.nn import Neuron, Layer, MLP
from micrograd2023.utils import draw_dot
import random
# inputs xs, weights ws, and bias b
w1 = Value(1.1)
x1 = Value(0.5)
w2 = Value(0.12)
x2 = Value(1.7)
b = Value(0.34)

# pre-activation
s = w1*x1 + x2*w2 + b

# activation
y = s.tanh()

# automatic differentiation
y.backward()

# show the computation graph of the perceptron
draw_dot(y)

# added random seed for reproducibility
random.seed(1234)
n = Neuron(3)
x = [Value(0.15), Value(-0.21), Value(-0.91)]
y = n(x)
y.backward()
draw_dot(y)
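After backward(), the accumulated partial derivatives can be read from the leaf nodes. Assuming Value exposes data and grad attributes as in the original micrograd, the gradients from the hand-built perceptron above can be inspected directly:

# partial derivatives of y with respect to each input after y.backward()
print(w1.grad, x1.grad, w2.grad, x2.grad, b.grad)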

Multilayer Perceptron (MLP) Training

You can use micrograd2023 to train an MLP and learn fundamental concepts such as overfitting, optimal learning rate, etc., as shown in the training-loop sketch below the figures.

Good training

Overfitting
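A minimal sketch of such a training loop, using plain SGD with momentum, is shown below. The toy dataset, hyperparameters, and the parameters()/.data/.grad usage follow the original micrograd nn API that micrograd2023 builds on; treat them as assumptions rather than the exact DEMO code.

import random
from micrograd2023.nn import MLP

random.seed(1234)

# tiny toy dataset (illustrative): learn the sign of x0 + x1
xs = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(32)]
ys = [1.0 if x[0] + x[1] > 0 else -1.0 for x in xs]

model = MLP(2, [8, 8, 1])   # 2 inputs, two hidden layers, 1 output
lr, beta = 0.05, 0.9        # learning rate and momentum coefficient
velocity = [0.0 for _ in model.parameters()]

for step in range(100):
    # forward pass: mean squared error over the batch
    preds = [model(x) for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) * (1.0 / len(xs))

    # backward pass (zero stale gradients first)
    for p in model.parameters():
        p.grad = 0.0
    loss.backward()

    # SGD with momentum update
    for i, p in enumerate(model.parameters()):
        velocity[i] = beta * velocity[i] + p.grad
        p.data -= lr * velocity[i]

    if step % 20 == 0:
        print(step, loss.data)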

Running unit tests

To perform unit testing, use the terminal to navigate to the directory that contains the tests folder, then simply type python -m pytest in the terminal. Note that PyTorch is needed for the tests to run, since derivatives calculated using micrograd2023 are compared against those calculated using PyTorch as references.

python -m pytest
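For reference, a test in this style might compare a micrograd2023 gradient against PyTorch’s autograd. The test below is a hypothetical example written in the spirit of the existing tests, not a copy of them.

import torch
from micrograd2023.engine import Value

def test_tanh_backward_matches_pytorch():
    # micrograd2023 side
    x = Value(0.3)
    y = x.tanh()
    y.backward()

    # PyTorch reference
    xt = torch.tensor(0.3, requires_grad=True)
    torch.tanh(xt).backward()

    assert abs(x.grad - xt.grad.item()) < 1e-6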

References

micrograd2023, an automatic differentiation package, was developed based on Andrej Karpathy’s micrograd and was written using Jeremy Howard’s and Hamel Husain’s nbdev.

Citing this work


Do, H. P. (2026). micrograd2023: Automatic Differentiation Package for Education (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.18526836
