A deep learning framework

PyNorch

Recreating PyTorch from scratch (C/C++, CUDA and Python, with multi-GPU support and automatic differentiation!)

Detailed explanations of the project can also be found on Medium.

1 - About

PyNorch is a deep learning framework built with C/C++, CUDA and Python. It is a personal project for educational purposes only! Norch means NOT PyTorch, and it makes NO claim to rival the already established PyTorch. The main objective of PyNorch is to give a brief understanding of how a deep learning framework works internally. It implements the Tensor object, multi-GPU support and an automatic differentiation system.
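To make the idea of an automatic differentiation system concrete, here is a minimal sketch of reverse-mode autodiff on scalars in pure Python. It is an illustration only and not PyNorch's implementation: the `Value` class and its attributes are hypothetical names introduced for this example.

```python
# Minimal reverse-mode autodiff on scalars (illustrative; not PyNorch's code).
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # pushes this node's grad to its parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Order the graph topologically so that a node's gradient is
        # complete before it is propagated to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


a = Value(2.0)
b = Value(3.0)
c = a * b + a  # dc/da = b + 1 = 4, dc/db = a = 2
c.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

A tensor framework follows the same pattern, with the scalar rules replaced by per-operation tensor gradients.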

2 - Installation

Install this package from PyPI (you can test it on Colab! It was also tested on an AWS g4dn.12xlarge instance with image ami-061debf863768593d)

$ pip install norch

or by cloning this repository

$ git clone https://github.com/lucasdelimanogueira/PyNorch.git
$ cd PyNorch
$ pip install . -v

3 - Get started

3.1 - Tensor operations

import norch

x1 = norch.Tensor([[1, 2], 
                  [3, 4]], requires_grad=True).to("cuda")

x2 = norch.Tensor([[4, 3], 
                  [2, 1]], requires_grad=True).to("cuda")

x3 = x1 @ x2
result = x3.sum()
result.backward()  # call the method; a bare `result.backward` does nothing

print(x1.grad)
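As a sanity check on the example above, the gradients can be worked out by hand. For result = sum(x1 @ x2), the gradient with respect to x1 is ones @ x2ᵀ and the gradient with respect to x2 is x1ᵀ @ ones. The sketch below computes both in pure Python; the helper names `matmul` and `transpose` are hypothetical, and the exact printed format of `x1.grad` in PyNorch may differ.

```python
x1 = [[1, 2], [3, 4]]
x2 = [[4, 3], [2, 1]]


def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


# Gradient of sum() with respect to x3 = x1 @ x2 is a matrix of ones.
ones = [[1, 1], [1, 1]]

x1_grad = matmul(ones, transpose(x2))
x2_grad = matmul(transpose(x1), ones)

print(x1_grad)  # [[7, 3], [7, 3]]
print(x2_grad)  # [[4, 4], [6, 6]]
```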

3.2 - Create a model

import norch
import norch.nn as nn
import norch.optim as optim

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(1, 10)
        self.sigmoid = nn.Sigmoid()
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid(out)
        out = self.fc2(out)
        
        return out
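For intuition, the forward pass of the model above (Linear, then Sigmoid, then Linear) can be sketched in pure Python. This is not how PyNorch computes it internally; the helpers `linear`, `sigmoid` and `forward`, the random weight initialization and the zero biases are all assumptions made for the illustration.

```python
import math
import random

random.seed(0)


def linear(x, weight, bias):
    # weight has shape (out_features, in_features); x is a flat list.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-z)) for z in values]


# Hypothetical parameters matching Linear(1, 10) and Linear(10, 1).
w1 = [[random.uniform(-1, 1)] for _ in range(10)]
b1 = [0.0] * 10
w2 = [[random.uniform(-1, 1) for _ in range(10)]]
b2 = [0.0]


def forward(x):
    hidden = sigmoid(linear(x, w1, b1))
    return linear(hidden, w2, b2)


out = forward([0.5])
print(len(out))  # 1
```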

3.3 - Example single GPU training

# examples/train_singlegpu.py

import norch
import norch.nn as nn
import norch.optim as optim
from norch.utils.data.dataloader import DataLoader
from norch.norchvision import transforms as T
import random
random.seed(1)

BATCH_SIZE = 32
device = "cuda"  # or "cpu"
epochs = 10

transform = T.Compose(
    [
        T.ToTensor(),
        T.Reshape([-1, 784, 1])
    ]
)

target_transform = T.Compose(
    [
        T.ToTensor()
    ]
)

train_data, test_data = norch.norchvision.datasets.MNIST.splits(transform=transform, target_transform=target_transform)
train_loader = DataLoader(train_data, batch_size = BATCH_SIZE)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(784, 30)
        self.sigmoid1 = nn.Sigmoid()
        self.fc2 = nn.Linear(30, 10)
        self.sigmoid2 = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid1(out)
        out = self.fc2(out)
        out = self.sigmoid2(out)
        
        return out

model = MyModel().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_list = []

for epoch in range(epochs):    
    for idx, batch in enumerate(train_loader):

        inputs, target = batch

        inputs = inputs.to(device)
        target = target.to(device)

        outputs = model(inputs)
        
        loss = criterion(outputs, target)
        
        optimizer.zero_grad()
        
        loss.backward()

        optimizer.step()

    # Report the loss of the last batch in this epoch
    print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss[0]:.4f}')
    loss_list.append(loss[0])
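The training loop above repeats the same steps for every batch: forward pass, loss, zero_grad, backward, step. The sketch below mirrors that order on a one-parameter model in pure Python, assuming plain SGD (w = w - lr * grad). The toy data, the squared-error loss and the hand-written gradient are assumptions for the illustration, not what the MNIST example uses.

```python
# Fit y = w * x with plain SGD (toy data generated with w = 2).
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
lr = 0.05
losses = []

for epoch in range(50):
    for x, y in data:
        pred = w * x                # forward pass
        loss = (pred - y) ** 2      # squared-error loss
        grad = 0.0                  # zero_grad: discard the old gradient
        grad += 2 * (pred - y) * x  # backward: d(loss)/dw
        w -= lr * grad              # step: SGD update
    losses.append(loss)             # loss of the last sample, as above

print(round(w, 4))  # 2.0
```

Resetting the gradient before each backward pass matters because gradients are accumulated, not overwritten, in frameworks that follow PyTorch's conventions.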

3.4 - Example multi-GPU training

First, create a .py file like the example below

# examples/train_multigpu.py

import os
import norch
import norch.distributed as dist
import norch.nn as nn
import norch.optim as optim
from norch.nn.parallel import DistributedDataParallel
from norch.utils.data.distributed import DistributedSampler
from norch.norchvision import transforms as T
import random
random.seed(1)

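# Rank information comes from environment variables set by Open MPI;
# -1 means the variable is not set
local_rank = int(os.getenv('OMPI_COMM_WORLD_LOCAL_RANK', -1))
rank = int(os.getenv('OMPI_COMM_WORLD_RANK', -1))
world_size = int(os.getenv('OMPI_COMM_WORLD_SIZE', -1))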

dist.init_process_group(
    rank, 
    world_size
)

BATCH_SIZE = 32
device = local_rank
epochs = 10

transform = T.Compose(
    [
        T.ToTensor(),
        T.Reshape([-1, 784, 1])
    ]
)

target_transform = T.Compose(
    [
        T.ToTensor()
    ]
)

train_data, test_data = norch.norchvision.datasets.MNIST.splits(transform=transform, target_transform=target_transform)
distributed_sampler = DistributedSampler(dataset=train_data, num_replicas=world_size, rank=rank)
train_loader = norch.utils.data.DataLoader(train_data, batch_size=BATCH_SIZE, sampler=distributed_sampler)

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(784, 30)
        self.sigmoid1 = nn.Sigmoid()
        self.fc2 = nn.Linear(30, 10)
        self.sigmoid2 = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)
        out = self.sigmoid1(out)
        out = self.fc2(out)
        out = self.sigmoid2(out)
        
        return out

model = MyModel().to(device)
model = DistributedDataParallel(model)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_list = []

print(f"Starting training on Rank {rank}/{world_size}\n\n")

for epoch in range(epochs):    
    for idx, batch in enumerate(train_loader):

        inputs, target = batch

        inputs = inputs.to(device)
        target = target.to(device)

        outputs = model(inputs)
        
        loss = criterion(outputs, target)
        
        optimizer.zero_grad()
        
        loss.backward()

        optimizer.step()
    
    if rank == 0:
        print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss[0]:.4f}')
        loss_list.append(loss[0])

Then run it with

$ python3 -m norch.distributed.run --nproc_per_node 4 examples/train_multigpu.py
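In the script above, each process passes its own rank to DistributedSampler so that it trains on a different slice of the dataset. The sketch below shows one common sharding scheme (strided indices, padded by wrapping around so every rank gets the same count). It follows the behavior of PyTorch's sampler without shuffling; whether PyNorch uses exactly this scheme is an assumption, and `shard_indices` is a hypothetical helper.

```python
def shard_indices(dataset_len, num_replicas, rank):
    indices = list(range(dataset_len))
    # Round the total up to a multiple of num_replicas, then pad by
    # repeating the first indices so all ranks get equally many samples.
    # (Assumes the padding is no longer than the dataset itself.)
    total = -(-dataset_len // num_replicas) * num_replicas
    indices += indices[: total - dataset_len]
    # Each rank takes every num_replicas-th index, starting at its rank.
    return indices[rank:total:num_replicas]


for r in range(4):
    print(r, shard_indices(10, 4, r))
# 0 [0, 4, 8]
# 1 [1, 5, 9]
# 2 [2, 6, 0]
# 3 [3, 7, 1]
```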

4 - Progress

Each development area is listed with its status and its features ([X] = done, [ ] = pending).

Development: Operations (status: in progress)
  • [X] GPU Support
  • [X] Autograd
  • [X] Broadcasting
  • [ ] Memory Management
Development: Loss (status: in progress)
  • [X] MSE
  • [X] Cross Entropy
Development: Data (status: in progress)
  • [X] Dataset
  • [X] Batch
  • [X] Iterator
Development: Convolutional Neural Network (status: in progress)
  • [ ] Conv2d
  • [ ] MaxPool2d
  • [ ] Dropout
Development: Distributed (status: in progress)
  • [X] All-reduce
  • [X] Broadcast
  • [X] DistributedSampler
  • [X] DistributedDataParallel
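The All-reduce entry in the Distributed list is what lets data-parallel training keep model replicas in sync: every rank contributes its local gradients and every rank receives the same combined result. The sketch below simulates this in a single process with plain lists. Averaging (as opposed to summing) is an assumption about how the gradients are combined, and `all_reduce_mean` is a hypothetical helper, not a PyNorch function.

```python
def all_reduce_mean(per_rank_grads):
    # per_rank_grads[r] is the flat gradient list held by rank r.
    world_size = len(per_rank_grads)
    summed = [sum(values) for values in zip(*per_rank_grads)]
    mean = [s / world_size for s in summed]
    # After an all-reduce, every rank holds an identical copy.
    return [list(mean) for _ in range(world_size)]


grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 0.0]]
synced = all_reduce_mean(grads)
print(synced[0])  # [4.0, 3.0]
```

Because every rank applies the same averaged gradient, the replicas stay identical after each optimizer step.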
