
QuNet


Easily work with deep learning models.

  • Large set of custom modules for neural networks (MLP, CNN, Transformer, etc.)
  • Trainer class for training the model.
  • Various tools for visualizing the training process and the state of the model.
  • Training large models: float16, splitting the mini-batch when it does not fit in memory, etc.

Install

pip install qunet

Usage

import torch
from qunet import Data, Trainer, Scheduler_Exp

# 1. create dataset
X = torch.rand(10000,1)
Y = 2*X + 1
data_trn = Data((X,Y), batch_size=128, shuffle=True)

# 2. create trainer, optimizer and scheduler (if needed)
model = Model()                       # the Model class is described below
trainer = Trainer(model, data_trn)
trainer.set_optimizer( torch.optim.SGD(model.parameters(), lr=1e-2) )
trainer.set_scheduler( Scheduler_Exp(lr1=1e-5, lr2=1e-4, samples=100e3) )

# 3. run training
trainer.run(epochs=100, period_plot=5)

Model

The model must be a class (a subclass of nn.Module) with the following methods:

  • The forward function takes input x and returns output y. These can be tensors or tuples (lists) of tensors.
  • The metrics function takes (x, y_pred, y_true) and returns the model's scalar loss and a tensor of quality metrics: (B,1) for one metric (accuracy, for example) or (B,n) for n quality metrics.

For example, for 1D linear regression $y=f(x)$ with MSE loss and the metric |y_pred - y_true|, the model looks like:

import torch, torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear( 1, 1 )

    def forward(self, x):                            # (B,1)
        return self.fc(x)                            # (B,1)

    def metrics(self, x, y_pred, y_true):            # (B,1) (B,1) (B,1)
        loss   = (y_pred - y_true).pow(2).mean()     # ()     scalar!
        errors = torch.abs(y_pred.detach()-y_true)   # (B,1)  one metric
        return loss, errors                          # ()  (B,1)

Attention: if the model has a single output, the tensor after the Linear layer has shape (B,1). The target data must therefore also have shape (B,1); otherwise broadcasting silently produces an incorrect loss:

X, Y = torch.arange(5).view(-1,1).to(torch.float32),  torch.arange(5).to(torch.float32)
loss = (X-Y).pow(2).mean()  # 4, because (B,1) - (B,) = (B,1) - (1,B) = (B,B)
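
The fix is a one-line reshape of the target to (B,1) before computing the loss:

Y = Y.view(-1,1)             # (B,)  -> (B,1)
loss = (X-Y).pow(2).mean()   # 0: shapes now match elementwise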

Data

Data is the class for training or validation data. It can be subclassed, or the PyTorch DataLoader can be used instead. The __next__ iterator of Data must return an (X, Y) tuple, where:

  • X - tensor or tuple (list) of tensors for model input,
  • Y - tensor or tuple (list) of tensors for model target values.

For example, let's create training data in which two tensors X1,X2 are the input of the model and one tensor Y is the output (target):

X1, X2 = torch.rand(1000,3), torch.rand(1000,3,20)
Y = X1 * torch.sigmoid(X2).mean(-1)

data_trn = Data( dataset=( (X1,X2), Y ) )

The data minibatch tuple (X,Y) is used in the Trainer as follows:

for b, (X, Y_true) in enumerate(data):   # during training
    X, Y_true = to_device(X), to_device(Y_true)
    Y_pred = model(X)
    loss, score = model.metrics(X, Y_pred, Y_true)

So dataset is a list or tuple of two elements (input and target). Each element can be a tensor or a list (tuple) of tensors. All tensors in the dataset are assumed to have the same length (along the first index).

The Data class constructor has the following parameters:

Data(dataset, shuffle=True, batch_size=64,  whole_batch=False, n_packs=1)
  • dataset - model input and output tuple (X, Y), as described above
  • shuffle - shuffle the data after each pass through all examples
  • batch_size - minibatch size; can be changed later: data_trn.batch_size = 1024
  • whole_batch - return only minibatches of exactly batch_size examples; if the total number of examples is not divisible by batch_size, the last batch will be small and its gradient unreliable. If whole_batch=True, such a batch is not issued.
  • n_packs - the data is split into n_packs packs; the pass through one pack counts as a training epoch. This is useful for large datasets, when validation needs to be done more often (see the sketch after this list).
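
For instance, a large dataset can be split into packs so that validation happens several times per full pass (a minimal sketch; the values are illustrative):

data_trn = Data( dataset=(X,Y), batch_size=512, whole_batch=True, n_packs=10 )
# one "epoch" now consumes one tenth of the data,
# so validation with period_val=1 runs ten times per full pass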

You can also use the standard DataLoader with Trainer:

from torchvision            import datasets
from torchvision.transforms import ToTensor 
from torch.utils.data       import DataLoader

mnist = datasets.MNIST(root='data', train=True,  transform=ToTensor(), download=True)
data_trn  = DataLoader(dataset=mnist, batch_size=1024, shuffle=True)

Trainer

The Trainer is given the model, the training data, and the validation data. The optimizer is set with the set_optimizer function. After that, the run function is called:

trainer = Trainer(model, data_trn, data_val)
trainer.set_optimizer( torch.optim.SGD(model.parameters(), lr=1e-2) )
trainer.run(epochs=100, pre_val=True, period_plot=10)

You can add different training schedulers, customize the output of training graphs, manage the storage of the best models and checkpoints, and much more.

trainer = Trainer(model, data_trn, data_val, device=None, dtype=torch.float32, score_max=False)
  • model - the model to be trained;
  • data_trn - training data (a Data or DataLoader instance);
  • data_val - validation data (a Data or DataLoader instance); may be omitted;
  • score_max - if True, the score (the first column of the second tensor returned by the model's metrics function) is considered better when larger (for example, accuracy); an example follows this list.
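
For a classification model whose metric is accuracy, larger scores are better, so one would write (a minimal sketch; only score_max differs from the defaults):

trainer = Trainer(model, data_trn, data_val, score_max=True)   # larger score = better model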

Other properties of Trainer allow you to customize the appearance of graphs, save models, manage training, and so on. They will be discussed in the relevant sections.

trainer.run(epochs =None,  samples=None,            
            pre_val=False, period_val=1, period_plot=100,         
            period_checks=1, period_val_beg = 4, samples_beg = None,
            period_call:int = 0, callback = None):     
  • epochs - number of epochs to train (passes over one data_trn pack). If None, training runs "infinitely".
  • samples - if defined, then training will stop after this number of samples, even if epochs has not ended
  • pre_val - validate before starting training
  • period_val - period after which validation run (in epochs)
  • period_plot - period after which the training plot is displayed (in epochs)
  • period_call - callback custom function call period
  • callback - custom function called every period_call epochs
  • period_checks - period after which checkpoints are made and the current model is saved (in epochs)
  • period_val_beg - validation period for the first samples_beg samples. Used when validation should happen less frequently at the start of training.
  • samples_beg - the number of samples from the start after which the validation period becomes period_val (see the example after this list)
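
For example, to validate sparsely at the start of a long run and every epoch afterwards (a sketch; the values are illustrative):

# validate every 4th epoch for the first 100k samples, then every epoch;
# redraw the training plot every 10 epochs
trainer.run(epochs=500, period_val=1, period_val_beg=4, samples_beg=100e3,
            period_plot=10)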

Visualization of the training process

trainer.view = {                    
    'w'            : 12,         # plt-plot width
    'h'            :  5,         # plt-plot height

    'count_units'  : 1e6,        # units for number of samples
    'time_units'   : 's',        # time units: ms, s, m, h

    'x_min'        : 0,          # minimum value in samples on the x-axis (if < 0, the last |x_min| samples are shown)
    'x_max'        : None,       # maximum value in samples on the x-axis (if None - last)

    'loss': {                                
        'show'  : True,          # show loss subplot
        'y_min' : None,          # fixing the minimum value on the y-axis
        'y_max' : None,          # fixing the maximum value on the y-axis
        'ticks' : None,          # how many labels on the y-axis
        'lr'    : True,          # show learning rate        
        'labels': True,          # show labels (training events)
        'trn_checks': True,      # show the achievement of the minimum training loss (dots)
        'val_checks': True,      # show the achievement of the minimum validation loss (dots)
    },

    'score': {                    
        'show'  : True,          # show score subplot    
        'y_min' : None,          # fixing the minimum value on the y-axis
        'y_max' : None,          # fixing the maximum value on the y-axis
        'ticks' : None,          # how many labels on the y-axis
        'lr'    : True,          # show learning rate
        'labels': True,          # show labels (training events)
        'trn_checks': True,      # show the achievement of the optimum training score (dots)                
        'val_checks': True,      # show the achievement of the optimum validation score (dots)                
    }
}
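
Individual settings can be changed on the fly before the next plot is drawn, for example (keys as in the dictionary above):

trainer.view['loss']['show'] = False    # hide the loss subplot
trainer.view['x_min']        = -100e3   # show only the last 100k samples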

Using Schedules

Schedulers allow you to control the learning process by changing the learning rate according to a given algorithm. There can be one or more schedulers; in the latter case, they are processed sequentially, one after another. The following schedulers are available:

  • Scheduler_Line(lr1, lr2, samples) - changes the learning rate linearly from lr1 to lr2 over samples training samples. If lr1 is not specified, the optimizer's current lr is used for it.
  • Scheduler_Exp(lr1, lr2, samples) - similar, but the change from lr1 to lr2 is exponential.
  • Scheduler_Cos(lr1, lr_hot, lr2, samples, warmup) - cosine change of lr down to lr2, with a preliminary linear warmup from lr1 to lr_hot over warmup samples.
  • Scheduler_Const(lr1, samples) - keeps lr unchanged for samples samples (as usual, the last value is taken if lr1 is not set). This scheduler is useful in lists of schedulers.

Each scheduler has a plot method that can be used to display its learning-rate schedule:

sch = Scheduler_Cos(lr1=1e-5, lr_hot=1e-2, lr2=1e-4,  samples=100e3, warmup=1e3)
sch.plot(log=True)

You can also call the trainer.plot_schedulers() method of the Trainer class. It draws the combined schedule of all schedulers added to the trainer.

Compiling a list of schedulers is done by the following methods of the Trainer class:

  • set_scheduler( sch ) - set a list of schedulers from one scheduler sch (after clearing the list);
  • add_scheduler( sch ) - add scheduler sch
  • del_scheduler(i) - remove the i-th scheduler from the list (numbering from zero)

This group of methods works with all schedulers:

  • reset_schedulers() - reset all scheduler counters and make them active (starting from the first one)
  • stop_schedulers() - stop all schedulers
  • clear_schedulers() - clear the list of schedulers
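
For example, a warmup-then-decay schedule can be assembled from several schedulers (a sketch using the signatures above; the values are illustrative):

trainer.set_scheduler( Scheduler_Cos  (lr1=1e-5, lr_hot=1e-2, lr2=1e-3, samples=20e3, warmup=1e3) )
trainer.add_scheduler( Scheduler_Const(lr1=1e-3, samples=50e3) )
trainer.add_scheduler( Scheduler_Exp  (lr1=1e-3, lr2=1e-5, samples=100e3) )
trainer.plot_schedulers()   # draw the combined schedule of all three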

Best Model and Checkpoints

If you set these flags to True, the Trainer will keep a copy of the best model so far by validation score and by validation loss:

trainer.copy_best_score_model = True  # to copy the best model by val score
trainer.copy_best_loss_model  = True  # to copy the best model by val loss

These models can be used to roll back if something went wrong (similarly for trainer.best_loss_model):

import copy
trainer.model = copy.deepcopy(trainer.best_score_model)

If the following folders are defined (by default they are None), the best models by validation loss and score will be saved to disk, and intermediate versions of the model will be saved with period period_checks (an argument of the run function).

trainer.folder_loss   = "models/best_loss"   # folder to save the best val loss models
trainer.folder_score  = "models/best_score"  # folder to save the best val score models
trainer.folder_checks = "models/checkpoints" # folder to save checkpoints        

The score is the first column of the second tensor returned by the model's metrics function. If trainer.score_max=True, then the higher the score, the better (for example, accuracy).


Batch Augmentation


Working with Large Models


Model State Visualization


Examples

  • Regression_1D - visualization of changes in model parameters
  • Interpolation_F(x)
  • MNIST
  • Vanishing gradient

Versions

  • 0.0.4 - version fixed for the IceCube competition (Kaggle)

