A tiny autograd engine and neural network library built from first principles
🔥 smoltorch
Inspired by Andrej Karpathy's micrograd, built for learning
🎯 What is smoltorch?
smoltorch is a minimalist deep learning library that implements automatic differentiation (autograd) and neural networks from scratch using only NumPy. It's designed to be:
- Educational: Understand how modern deep learning frameworks work under the hood
- Transparent: Every operation is visible and understandable
- Functional: Train real models on real datasets with competitive performance
- Minimal: ~500 lines of readable, well-documented Python code
Why "smoltorch"?
"Smol" + PyTorch. It's a tiny implementation that captures the essence of modern deep learning frameworks.
✨ Features
Core Engine
- ✅ Automatic differentiation with dynamic computational graphs
- ✅ NumPy-backed tensors for efficient numerical computing
- ✅ Broadcasting support with proper gradient handling
- ✅ Topological sorting for correct backpropagation
Operations
- Arithmetic: +, -, *, /, **
- Matrix operations: @ (matmul)
- Activations: ReLU, tanh, sigmoid
- Reductions: sum, mean
- Element-wise: log
Neural Networks
- Layers: Linear (fully connected)
- Models: Multi-layer perceptron (MLP)
- Loss functions: MSE, Binary Cross-Entropy
- Optimizers: SGD (Stochastic Gradient Descent)
📦 Installation
From PyPI (recommended)
uv add smoltorch
From source
git clone https://github.com/kashifulhaque/smoltorch.git
cd smoltorch
uv pip install -e .
Development installation
uv pip install -e ".[dev]"
🚀 Quick Start
Basic Tensor Operations
from smoltorch import Tensor
# Create tensors
x = Tensor([1.0, 2.0, 3.0])
y = Tensor([4.0, 5.0, 6.0])
# Operations
z = x + y # Element-wise addition
w = x * y # Element-wise multiplication
a = x @ y.T # Matrix multiplication
# Backward pass
a.backward()
print(x.grad) # Gradients computed automatically!
Training a Neural Network (Regression)
from smoltorch import Tensor, MLP, SGD
from sklearn.datasets import make_regression
import numpy as np
# Generate data
X, y = make_regression(n_samples=100, n_features=5, noise=10)
y = y.reshape(-1, 1)
# Create model
model = MLP([5, 16, 16, 1]) # 5 inputs -> 16 -> 16 -> 1 output
optimizer = SGD(model.parameters(), lr=0.001)
# Training loop
for epoch in range(100):
    # Forward pass
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)
    y_pred = model(X_tensor)

    # Compute loss (MSE)
    loss = ((y_pred - y_tensor) ** 2).mean()

    # Backward pass
    optimizer.zero_grad()
    loss.backward()

    # Update weights
    optimizer.step()

    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}")
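The loop above follows the usual forward / zero-grad / backward / step cycle on the full dataset. As a rough illustration of what the update step amounts to, here is a minimal pure-Python sketch that fits y = w*x + b with hand-derived MSE gradients. It does not use smoltorch, and the helper names `mse_and_grads` and `fit_line` are hypothetical.

```python
def mse_and_grads(w, b, xs, ys):
    """Return the MSE loss and its gradients w.r.t. w and b for y = w*x + b."""
    n = len(xs)
    loss = grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        err = (w * x + b) - y
        loss += err * err / n
        grad_w += 2.0 * err * x / n   # d(loss)/dw
        grad_b += 2.0 * err / n       # d(loss)/db
    return loss, grad_w, grad_b


def fit_line(xs, ys, lr=0.05, epochs=500):
    """Full-batch gradient descent using the same update rule SGD applies."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        loss, grad_w, grad_b = mse_and_grads(w, b, xs, ys)
        history.append(loss)
        # Update step: move each parameter against its gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, history


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b, history = fit_line(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, final loss={history[-1]:.6f}")
```

In a library with autograd, the two gradient lines are what `loss.backward()` would compute automatically, and the two update lines are what an optimizer's `step()` would perform.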
Binary Classification
from smoltorch import Tensor, MLP, SGD, binary_cross_entropy
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
# Load and preprocess data
data = load_breast_cancer()
X, y = data.data, data.target.reshape(-1, 1)
scaler = StandardScaler()
X = scaler.fit_transform(X)
# Create classifier with sigmoid output
class BinaryClassifier(MLP):
    def __call__(self, x):
        x = super().__call__(x)
        return x.sigmoid()  # Output probabilities
model = BinaryClassifier([30, 16, 8, 1])
optimizer = SGD(model.parameters(), lr=0.01)
# Training loop
for epoch in range(200):
    X_tensor = Tensor(X)
    y_tensor = Tensor(y)
    y_pred = model(X_tensor)
    loss = binary_cross_entropy(y_pred, y_tensor)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 20 == 0:
        # Accuracy on the training data (this snippet has no held-out split)
        accuracy = ((y_pred.data > 0.5) == y).mean()
        print(f"Epoch {epoch + 1}, Loss: {loss.data:.4f}, Acc: {accuracy:.4f}")

# Reported result: ~96% test accuracy on the breast cancer dataset! 🎉
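For reference, binary cross-entropy over probabilities p and 0/1 targets t is the mean of -(t*log(p) + (1-t)*log(1-p)). The sketch below computes it in plain Python. The name `bce_reference` and the `eps` clipping are assumptions made for illustration; smoltorch's own `binary_cross_entropy` may handle edge cases differently.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_reference(probs, targets, eps=1e-7):
    """Mean binary cross-entropy for paired probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(probs, targets):
        # Clip so that log(0) can never occur (an assumption of this sketch)
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


logits = [2.0, -1.0, 0.0]
targets = [1, 0, 1]
probs = [sigmoid(z) for z in logits]
loss = bce_reference(probs, targets)
print(f"loss = {loss:.4f}")
```

A prediction of exactly 0.5 costs log(2) regardless of the target, which makes a handy sanity check for any implementation.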
📊 Real-World Performance
smoltorch achieves competitive results on standard benchmarks:
| Dataset | Task | Test Result | Epochs |
|---|---|---|---|
| Breast Cancer | Binary Classification | Accuracy: 96.5% | 200 |
| Synthetic Regression | Regression | MSE: 95.7 | 100 |
🏗️ Architecture
Computational Graph
smoltorch builds a dynamic computational graph during the forward pass:
x = Tensor([2.0])
y = Tensor([3.0])
z = (x * y) + (x ** 2) # Graph: z -> [+] -> [*, **] -> [x, y]
z.backward() # Backpropagate through graph
print(x.grad) # dz/dx = y + 2x = 3 + 4 = 7.0
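The analytic gradient above can be cross-checked numerically. This is a minimal sketch using central differences in plain Python; it does not use smoltorch, and `numerical_grad` is a hypothetical helper.

```python
def f(x, y):
    """Same expression as the graph example: z = x*y + x^2."""
    return x * y + x ** 2


def numerical_grad(fn, args, index, h=1e-6):
    """Central-difference estimate of the partial derivative of fn w.r.t. args[index]."""
    plus = list(args)
    minus = list(args)
    plus[index] += h
    minus[index] -= h
    return (fn(*plus) - fn(*minus)) / (2 * h)


dz_dx = numerical_grad(f, (2.0, 3.0), 0)  # expect y + 2x = 7
dz_dy = numerical_grad(f, (2.0, 3.0), 1)  # expect x = 2
print(round(dz_dx, 4), round(dz_dy, 4))   # 7.0 2.0
```

Comparing autograd output against a finite-difference estimate like this is a common way to test a new backward rule.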
How Autograd Works
- Forward pass: Build computational graph with operations as nodes
- Topological sort: Order nodes for correct gradient flow
- Backward pass: Apply chain rule in reverse topological order
- Gradient accumulation: Sum gradients from multiple paths
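The four steps above can be sketched with a scalar-only engine in the style of micrograd. This is an illustrative sketch, not smoltorch's actual `Tensor` implementation; the class name `Value` and its private attributes are assumptions.

```python
class Value:
    """Scalar node in a computational graph (illustrative only)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # replaced by the op that creates this node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1; += accumulates across paths
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topological sort: every node appears after all of its parents
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # seed: d(self)/d(self) = 1
        # Chain rule applied in reverse topological order
        for node in reversed(order):
            node._backward()


x = Value(2.0)
y = Value(3.0)
z = x * y + x * x   # forward pass builds the graph
z.backward()
print(x.grad, y.grad)  # 7.0 2.0
```

Because `x` feeds into both products, its gradient is the sum of the contributions from each path (3 from `x * y`, 4 from `x * x`), which is why the backward functions use `+=` rather than assignment.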
Example with broadcasting:
x = Tensor([[1, 2, 3]]) # shape (1, 3)
y = Tensor([[1], [2]]) # shape (2, 1)
z = x + y # shape (2, 3) - broadcasting!
z.backward()
# x.grad sums over broadcast dimensions: shape (1, 3)
# y.grad sums over broadcast dimensions: shape (2, 1)
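One common way to implement the "sum over broadcast dimensions" step is a small helper that reduces the upstream gradient back to the operand's shape. The sketch below uses plain NumPy and assumes an upstream gradient of ones; the name `unbroadcast` is hypothetical and smoltorch may structure this differently.

```python
import numpy as np


def unbroadcast(grad, shape):
    """Reduce `grad` to `shape` by summing over the axes broadcasting expanded."""
    # Sum away leading axes that broadcasting prepended
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)
    # Sum (keeping the axis) where a size-1 axis was stretched
    for axis, size in enumerate(shape):
        if size == 1 and grad.shape[axis] != 1:
            grad = grad.sum(axis=axis, keepdims=True)
    return grad


x = np.array([[1.0, 2.0, 3.0]])   # shape (1, 3)
y = np.array([[1.0], [2.0]])      # shape (2, 1)
upstream = np.ones((2, 3))        # gradient flowing into z = x + y

x_grad = unbroadcast(upstream, x.shape)
y_grad = unbroadcast(upstream, y.shape)
print(x_grad.shape, y_grad.shape)  # (1, 3) (2, 1)
```

Each element of `x` was reused in two rows of `z`, so its gradient is 2; each element of `y` was reused in three columns, so its gradient is 3.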
🧠 Supported Operations
Element-wise Operations
z = x + y # Addition with broadcasting
z = x - y # Subtraction
z = x * y # Multiplication
z = x / y # Division
z = x ** 2 # Power
Matrix Operations
z = x @ y # Matrix multiplication (with batch support)
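For two 2-D operands, the backward rule for `z = x @ y` is the standard pair `grad_x = grad_out @ y.T` and `grad_y = x.T @ grad_out`. The following NumPy sketch checks that rule against a finite difference. It covers only the 2-D case (not the batched one), and `matmul_backward` is a hypothetical name rather than a smoltorch function.

```python
import numpy as np


def matmul_backward(x, y, grad_out):
    """Gradients of z = x @ y w.r.t. x and y, for 2-D arrays only."""
    grad_x = grad_out @ y.T   # shape matches x
    grad_y = x.T @ grad_out   # shape matches y
    return grad_x, grad_y


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3))
y = rng.normal(size=(3, 4))
grad_out = np.ones((2, 4))    # upstream gradient when loss = (x @ y).sum()
grad_x, grad_y = matmul_backward(x, y, grad_out)

# Finite-difference check on a single entry of x
h = 1e-6
x_plus = x.copy()
x_plus[0, 1] += h
x_minus = x.copy()
x_minus[0, 1] -= h
numeric = ((x_plus @ y).sum() - (x_minus @ y).sum()) / (2 * h)
print(abs(grad_x[0, 1] - numeric) < 1e-6)
```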
Activation Functions
z = x.relu() # ReLU: max(0, x)
z = x.tanh() # Tanh: (e^(2x) - 1) / (e^(2x) + 1)
z = x.sigmoid() # Sigmoid: 1 / (1 + e^(-x))
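Each activation needs a matching derivative for the backward pass. The sketch below lists the standard derivatives in plain Python and compares them with central differences. It is independent of smoltorch, and choosing 0 as the ReLU gradient at x = 0 is an assumption of this sketch.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relu_grad(x):
    return 1.0 if x > 0 else 0.0       # gradient at exactly 0 taken as 0


def tanh_grad(x):
    return 1.0 - math.tanh(x) ** 2     # d/dx tanh(x) = 1 - tanh(x)^2


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)               # d/dx sigmoid(x) = s * (1 - s)


def numeric(fn, x, h=1e-6):
    """Central-difference derivative estimate."""
    return (fn(x + h) - fn(x - h)) / (2 * h)


checks = [("relu", relu, relu_grad), ("tanh", math.tanh, tanh_grad), ("sigmoid", sigmoid, sigmoid_grad)]
for name, fn, grad in checks:
    print(name, abs(grad(0.7) - numeric(fn, 0.7)) < 1e-6)
```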
Reductions
z = x.sum() # Sum all elements
z = x.sum(axis=0) # Sum along axis
z = x.mean() # Mean of all elements
z = x.mean(axis=1) # Mean along axis
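In the backward pass, a reduction has to expand the upstream gradient back to the input's shape: `sum` copies it to every element that was summed, and `mean` does the same scaled by one over the number of averaged elements. The NumPy sketch below illustrates this for `axis=None` or a single integer axis. The function names are hypothetical and this is not taken from smoltorch's source.

```python
import numpy as np


def sum_backward(grad_out, input_shape, axis=None):
    """Gradient of x.sum(axis) w.r.t. x."""
    if axis is not None:
        grad_out = np.expand_dims(grad_out, axis)  # restore the reduced axis
    # Every summed element receives the same upstream gradient
    return np.broadcast_to(grad_out, input_shape).copy()


def mean_backward(grad_out, input_shape, axis=None):
    """Gradient of x.mean(axis) w.r.t. x: the sum rule divided by the element count."""
    count = np.prod(input_shape) if axis is None else input_shape[axis]
    return sum_backward(grad_out, input_shape, axis) / count


x = np.arange(6.0).reshape(2, 3)
full_sum_grad = sum_backward(np.array(1.0), x.shape)       # ones, shape (2, 3)
row_mean_grad = mean_backward(np.ones(2), x.shape, axis=1)  # 1/3 everywhere
print(full_sum_grad.shape, row_mean_grad.shape)  # (2, 3) (2, 3)
```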
Other
z = x.log() # Natural logarithm
📚 Examples
Check out the examples/ directory:
- train_regression.py - Train on synthetic regression data
- train_classification.py - Binary classification on breast cancer dataset
Run them:
uv run examples/train_regression.py
uv run examples/train_classification.py
🧪 Testing
Run the test suite:
uv run pytest
Tests cover:
- ✅ Addition with broadcasting
- ✅ Multiplication with broadcasting
- ✅ Matrix multiplication
- ✅ Activation functions (ReLU, tanh, sigmoid)
- ✅ Reductions (sum, mean)
- ✅ Linear layers
- ✅ Multi-layer perceptrons
- ✅ End-to-end training
🗺️ Roadmap
Coming Soon
- More optimizers: Adam, RMSprop with momentum
- More activations: Leaky ReLU, ELU, Softmax
- Regularization: Dropout, L2 weight decay
- Mini-batch training: Efficient batch processing
- Multi-class classification: Softmax + Cross-Entropy loss
Future
- Convolutional layers: CNN support for images
- Model serialization: Save/load weights in safetensors format
- GPU acceleration: Explore Metal Performance Shaders for Apple Silicon
- Better initialization: He initialization for ReLU networks
- Learning rate scheduling: Decay strategies
🎓 Learning Resources
If you're learning from smoltorch, these resources complement it well:
- Andrej Karpathy's micrograd - The original inspiration
- Neural Networks: Zero to Hero - Video series by Andrej Karpathy
- The Matrix Calculus You Need For Deep Learning - Paper on backpropagation math
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Andrej Karpathy for micrograd and the brilliant educational content
- PyTorch team for API design inspiration
- The deep learning community for making knowledge accessible
📬 Contact
Created by Kashif - feel free to reach out!
- GitHub: @kashifulhaque
- Twitter: @notifkash
⭐ Star this repo if you found it helpful!
Built with ❤️ for learners and tinkerers