Working with deep learning models

QuNet

Python 3.7+

Easy working with deep learning models.

  • Trainer class for training the model.
  • Various tools for visualizing the training process and the state of the model.
  • Training large models: float16, mini-batch splitting, etc.
  • Large set of custom modules for neural networks (MLP, CNN, Transformer, etc.)

Install

pip install qunet

Usage

To use the library, it is enough to add a training_step(batch, batch_id) method to the model. This method computes the loss and, if needed, some quality metrics. For example, for 1D linear regression $y=f(x)$ with MSE loss and the metric |y_pred-y_true|, the model looks like this:

import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1, 1)

    def forward(self, x):                                 # (B,1)
        return self.fc(x)                                 # (B,1)

    def training_step(self, batch, batch_id):
        x, y_true = batch                                 # the model knows the minibatch format
        y_pred = self(x)                                  # (B,1)  forward function call

        loss  = (y_pred - y_true).pow(2).mean()           # ()     loss for optimization (scalar)!
        error = torch.abs(y_pred.detach()-y_true).mean()  # ()     mean absolute error over the batch

        return {'loss':loss, 'score': error}              # if there is no score, you can return loss

model = Model()

Training and validation data can be provided as standard PyTorch DataLoader objects. For small datasets, you can also use the faster Data loader from the library:

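import torch
from qunet import Data, Trainer

num, val = 1000, 900                 # total samples, size of the training split
X = torch.rand(num, 1)               # (N,1) to match nn.Linear(1, 1)
Y = 2*X + torch.randn(X.shape)       # (N,1) noisy targets

data_trn = Data( (X[:val], Y[:val]) )
data_val = Data( (X[val:], Y[val:]) )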
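
The same tensors can also be wrapped in standard PyTorch loaders. The following is a minimal sketch that uses only torch.utils.data; the batch size of 64 is an arbitrary choice, and the resulting loaders are assumed to be passed to the trainer in place of data_trn and data_val.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
num, val = 1000, 900
X = torch.rand(num, 1)                  # (N,1) inputs
Y = 2 * X + torch.randn(X.shape)        # (N,1) noisy targets

# Each batch is a pair (x, y_true), the format that training_step unpacks.
loader_trn = DataLoader(TensorDataset(X[:val], Y[:val]), batch_size=64, shuffle=True)
loader_val = DataLoader(TensorDataset(X[val:], Y[val:]), batch_size=64, shuffle=False)

x, y_true = next(iter(loader_trn))
print(x.shape, y_true.shape)
```

Shuffling is enabled only for the training loader, so validation results are computed on the same batches every epoch.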

After that, create an instance of the trainer and pass the model and data to it. Then set the optimizer on the trainer and start training:

trainer = Trainer(model, data_trn, data_val)
trainer.set_optimizer( torch.optim.SGD(model.parameters(), lr=1e-2) )
trainer.fit(epochs=10, period_plot=5, monitor=['loss'])

That's all!
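
For intuition, the contract between a training loop and training_step can be sketched in plain PyTorch. This is not QuNet's implementation: the helper fit_sketch and its arguments are hypothetical, and the sketch only shows how a loop can rely on the dictionary returned by training_step.

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(1, 1)

    def forward(self, x):
        return self.fc(x)

    def training_step(self, batch, batch_id):
        x, y_true = batch
        y_pred = self(x)
        loss = (y_pred - y_true).pow(2).mean()
        error = torch.abs(y_pred.detach() - y_true).mean()
        return {'loss': loss, 'score': error}

def fit_sketch(model, batches, optimizer, epochs):
    """Hypothetical helper: returns the mean training loss of each epoch."""
    history = []
    for epoch in range(epochs):
        model.train()
        total = 0.0
        for batch_id, batch in enumerate(batches):
            out = model.training_step(batch, batch_id)   # the model computes its own loss
            optimizer.zero_grad()
            out['loss'].backward()
            optimizer.step()
            total += out['loss'].item()
        history.append(total / len(batches))
    return history

torch.manual_seed(0)
X = torch.rand(900, 1)
Y = 2 * X                                   # noiseless targets, so the loss can approach zero
batches = [(X[i:i + 100], Y[i:i + 100]) for i in range(0, 900, 100)]

model = Model()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1)
history = fit_sketch(model, batches, optimizer, epochs=20)
print(history[0], history[-1])
```

Because the loop only reads the 'loss' key, the model stays free to decide how batches are unpacked and which metrics are reported.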

Below is a brief overview of the library. A more detailed introduction can be found in the Quick start document, in the documents describing the individual modules of the library, and in the notebooks dedicated to various deep learning tasks.


Trainer

The trainer is a key object of the QuNet library. It solves the following tasks:

  • Model training and validation.
  • Visualization of the training process, with extensive customization options.
  • Calculation of optimal breakpoints based on the best local and smoothed metrics.
  • Saving the best models by loss or score, as well as saving checkpoints.
  • Combining different training schedulers.
  • For large models, switching to half precision and using a gradient accumulation buffer.
  • Use of multiple callback objects that can be embedded in different parts of the pipeline.
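
Gradient accumulation, mentioned in the list above, can be illustrated in plain PyTorch without the trainer. The sketch below is a generic illustration and not QuNet code; the name accum_steps is a hypothetical setting. It checks that accumulating gradients over four micro-batches gives the same gradient as one pass over the full batch.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
x, y = torch.randn(32, 4), torch.randn(32, 1)
loss_fn = nn.MSELoss()

# Reference: gradient of the loss on the full batch of 32 samples.
model.zero_grad()
loss_fn(model(x), y).backward()
grad_full = model.weight.grad.clone()

# Accumulation: 4 micro-batches of 8 samples; backward() adds into .grad.
accum_steps = 4                                    # hypothetical setting
model.zero_grad()
for xb, yb in zip(x.chunk(accum_steps), y.chunk(accum_steps)):
    loss = loss_fn(model(xb), yb) / accum_steps    # rescale so the sum equals the full-batch mean
    loss.backward()

grad_accum = model.weight.grad.clone()
print(torch.allclose(grad_full, grad_accum, atol=1e-5))
```

The optimizer step would be taken once after the inner loop, so peak memory is set by the micro-batch size rather than by the effective batch size.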

Below is an example of visualization:

val_loss:  best = 0.190465[293], smooth21 = 0.199713[296], last21 = 0.210965 ± 0.019436
trn_loss:  best = 0.209042[234], smooth21 = 0.244457[299], last21 = 0.293281 ± 0.043728

val_score: best = 0.942300[291], smooth21 = 0.938188[295], last21 = 0.934581 ± 0.000000
trn_score: best = 0.929560[234], smooth21 = 0.916017[299], last21 = 0.898531 ± 0.005823

epochs=300, samples=15000000, steps=30000
times=(trn:214.34, val:11.69)m,  42.87 s/epoch, 428.68 s/10^3 steps,  857.35 s/10^6 samples
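
Each summary line reports the best value of a metric with its epoch in brackets, a smoothed variant, and statistics of the most recent values. The sketch below shows one way such numbers can be computed. It assumes that smooth21 is a trailing moving average over 21 epochs and that last21 is the mean ± standard deviation of the last 21 values; both readings are inferred from the labels, and summarize is a hypothetical helper, not part of QuNet.

```python
import statistics

def summarize(values, window=21, lower_is_better=True):
    """Best value, best smoothed value (with 0-based epochs) and last-window stats."""
    pick = min if lower_is_better else max
    best_epoch = pick(range(len(values)), key=values.__getitem__)

    # Trailing moving average: smooth[i] averages values[i-window+1 .. i].
    smooth = [statistics.mean(values[max(0, i - window + 1): i + 1])
              for i in range(len(values))]
    smooth_epoch = pick(range(len(smooth)), key=smooth.__getitem__)

    last = values[-window:]
    return {
        'best':   (values[best_epoch], best_epoch),
        'smooth': (smooth[smooth_epoch], smooth_epoch),
        'last':   (statistics.mean(last), statistics.pstdev(last)),
    }

val_loss = [5.0, 4.0, 3.0, 2.0, 1.0, 2.0, 3.0]
report = summarize(val_loss, window=3)
print(report['best'])     # (1.0, 4)
print(report['smooth'])
print(report['last'])
```

The smoothed optimum lags behind the raw one because the average needs several good epochs in a row, which makes it less sensitive to a single lucky validation result.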

Example of learning curves of various schedulers:


ModelState

The standalone ModelState class is a powerful replacement for libraries such as torchinfo. It allows you to display information about submodules and their parameters.

Transformer                            params           data
├─ ModuleList                                                           ->                 <  blocks
│  └─ TransformerBlock                                   (1, 10, 64)    -> (1, 10, 64)     <  blocks[0]
│     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft
│        └─ FFT                                          (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.module
│           └─ Dropout(0)                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.module.drop
│        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].fft.norm
│     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att
│        └─ Attention                                    (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module
│           └─ Linear(64->192)         12,480  ~  25% |  (1, 10, 64)    -> (1, 10, 192)    <  blocks[0].att.module.c_attn
│           └─ Linear(64->64)           4,160  ~   8% |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module.c_proj
│           └─ Dropout(0)                                (1, 4, 10, 10) -> (1, 4, 10, 10)  <  blocks[0].att.module.att_dropout
│           └─ Dropout(0)                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.module.res_dropout
│        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].att.norm
│     └─ Residual                                        (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp
│        └─ MLP                                          (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.module
│           └─ Sequential                                (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.module.layers
│              └─ Linear(64->256)      16,640  ~  33% |  (1, 10, 64)    -> (1, 10, 256)    <  blocks[0].mlp.module.layers[0]
│              └─ GELU                                   (1, 10, 256)   -> (1, 10, 256)    <  blocks[0].mlp.module.layers[1]
│              └─ Dropout(0)                             (1, 10, 256)   -> (1, 10, 256)    <  blocks[0].mlp.module.layers[2]
│              └─ Linear(256->64)      16,448  ~  33% |  (1, 10, 256)   -> (1, 10, 64)     <  blocks[0].mlp.module.layers[3]
│        └─ LayerNorm                     128         |  (1, 10, 64)    -> (1, 10, 64)     <  blocks[0].mlp.norm
=============================================
trainable:                             50,115
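
The numbers in the params column can be cross-checked with plain PyTorch. For example, Linear(64->192) holds 64·192 weights plus 192 biases, which gives 12,480. The sketch below does not use ModelState; count_trainable is a hypothetical helper written only for this check.

```python
import torch.nn as nn

def count_trainable(module):
    """Number of trainable scalar parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

layers = {
    'Linear(64->192)': nn.Linear(64, 192),
    'Linear(64->64)':  nn.Linear(64, 64),
    'Linear(64->256)': nn.Linear(64, 256),
    'Linear(256->64)': nn.Linear(256, 64),
    'LayerNorm(64)':   nn.LayerNorm(64),
}
counts = {name: count_trainable(layer) for name, layer in layers.items()}
for name, n in counts.items():
    print(f"{name:<18}{n:>8,}")
```

The four linear layers and three LayerNorm modules add up to 50,112 parameters. The remaining 3 in the reported total of 50,115 would be consistent with one scalar gamma per Residual block, although that reading is inferred from the parameter table rather than stated by the library.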

During training, ModelState keeps track of gradients and smooths values:

 #                                           params          |mean|  [     min,      max ]  |grad|   shape
-------------------------------------------------------------------------------------
  0: blocks.0.fft.gamma                            1           0.200  [   0.200,    0.200]   1.3e+02  ()
  1: blocks.0.fft.norm.weight                     64           1.000  [   1.000,    1.000]   4.7e-01  (64,)
  2: blocks.0.fft.norm.bias                       64           0.000  [   0.000,    0.000]   2.2e-01  (64,)
  ...
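
A single snapshot of a similar table can be collected with plain PyTorch. The sketch below is not the ModelState implementation: param_report is a hypothetical helper, |mean| is assumed to be the mean absolute value of a parameter, |grad| is assumed to be the L2 norm of its gradient, and no smoothing over training steps is performed.

```python
import torch
import torch.nn as nn

def param_report(model):
    """Collect (name, numel, |mean|, min, max, grad norm, shape) per parameter."""
    rows = []
    for name, p in model.named_parameters():
        grad = p.grad.norm().item() if p.grad is not None else float('nan')
        data = p.detach()
        rows.append((name, p.numel(), data.abs().mean().item(),
                     data.min().item(), data.max().item(), grad, tuple(p.shape)))
    return rows

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 4), nn.GELU(), nn.Linear(4, 1))
model(torch.randn(16, 8)).pow(2).mean().backward()   # one backward pass populates .grad

rows = param_report(model)
for i, (name, n, mean, lo, hi, grad, shape) in enumerate(rows):
    print(f"{i:3d}: {name:<12}{n:>6}  {mean:8.3f}  [{lo:8.3f}, {hi:8.3f}]  {grad:8.1e}  {shape}")
```

Comparing gradient norms across layers in such a table is a quick way to spot vanishing or exploding gradients.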

Modules

The library has many ready-made modules for building various architectures of neural networks:

  • MLP
  • Transformer
  • CNN
  • ResCNN
  • ProjViT
  • ResCNN3D
  • GNN

Most modules have debugging and visualization tools. For example, this is what the visualization of the training process of a transformer consisting of 10 blocks looks like.

Such diagrams help you find problem areas of the network and adjust them during training.


Docs


Examples

  • Interpolation_F(x) - interpolation of a function of one variable (example of setting up a training plot; working with the list of schedulers; adding a custom plot)
  • MNIST - recognition of handwritten digits 0-9 (example using pytorch DataLoader, model predict, show errors, confusion matrix)
  • CIFAR10 (truncated EfficientNet, pre-trained parameters, backbone freezing, augmentation)
  • Vanishing gradient
  • Regression_1D - visualization of changes in model parameters


Built Distribution

QuNet-0.0.133-py3-none-any.whl (86.7 kB)
