pytorch-model-summary

A Keras-style model.summary() implementation for PyTorch.

Pytorch Model Summary -- Keras style `model.summary()` for PyTorch

This is an improved version of the PyTorch library modelsummary. As in modelsummary, the number of input arguments does not matter.

Improvements:

- For user-defined PyTorch layers, the summary can now show the layers inside them
  - assumption: a user-defined layer is treated as trainable if any of its weights/params/biases is trainable (but only trainable params are counted in Tr. Param #)
- Added a column that counts only trainable parameters (useful when there are user-defined layers)
- Shows all input/output shapes instead of only the first one
  - example: an LSTM layer returns a Tensor and a tuple (Tensor, Tensor), so output_shape has three sets of values
- Printing: table width is defined dynamically
- Added an option to include a hierarchical summary in the output
- Added the batch_size value (when provided) to the table footer
- Bug fixes
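The LSTM case mentioned above can be reproduced with a plain PyTorch forward hook. This is a minimal sketch, not the library's actual implementation: the helper `flatten_shapes` and the hook function are hypothetical names introduced only for illustration.

```python
import torch
import torch.nn as nn


def flatten_shapes(obj):
    """Hypothetical helper: collect the shape of every tensor nested in tuples/lists."""
    if isinstance(obj, torch.Tensor):
        return [list(obj.shape)]
    if isinstance(obj, (tuple, list)):
        return [shape for item in obj for shape in flatten_shapes(item)]
    return []


captured = {}


def record_shapes(module, inputs, output):
    # `inputs` is the tuple of positional arguments passed to forward()
    captured["in"] = flatten_shapes(inputs)
    captured["out"] = flatten_shapes(output)


lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
handle = lstm.register_forward_hook(record_shapes)
with torch.no_grad():
    lstm(torch.zeros(1, 5, 8))
handle.remove()

print(captured["in"])   # [[1, 5, 8]]
print(captured["out"])  # [[1, 5, 16], [1, 1, 16], [1, 1, 16]]
```

The LSTM returns `(output, (h_n, c_n))`, so flattening the nested tuple yields three shapes, which is why a single layer can report three sets of values.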

Parameters

Default values follow Keras behavior.

summary(model, *inputs, batch_size=-1, show_input=False, show_hierarchical=False,
        print_summary=False, max_depth=1, show_parent_layers=False):

model: PyTorch model object
*inputs: ...
batch_size: if provided, it is printed in the summary table
show_input: show the input shape of each layer; otherwise, show the output shape (Default: False)
show_hierarchical: in addition to the summary table, return a hierarchical view of the model (Default: False)
print_summary: when True, the summary is printed directly, so no print() call is needed around summary() (Default: False)
max_depth: how many levels deep to descend into user-defined layers when listing them; None shows the deepest layers (Default: 1)
show_parent_layers: add a column showing the path of parent layers leading to the current layer (Default: False)

import torch
import torch.nn as nn
import torch.nn.functional as F

from pytorch_model_summary import summary


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


# show input shape
print(summary(Net(), torch.zeros((1, 1, 28, 28)), show_input=True))

# show output shape
print(summary(Net(), torch.zeros((1, 1, 28, 28)), show_input=False))

# show output shape and hierarchical view of net
print(summary(Net(), torch.zeros((1, 1, 28, 28)), show_input=False, show_hierarchical=True))

-----------------------------------------------------------------------
      Layer (type)         Input Shape         Param #     Tr. Param #
=======================================================================
          Conv2d-1      [1, 1, 28, 28]             260             260
          Conv2d-2     [1, 10, 12, 12]           5,020           5,020
       Dropout2d-3       [1, 20, 8, 8]               0               0
          Linear-4            [1, 320]          16,050          16,050
          Linear-5             [1, 50]             510             510
=======================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
-----------------------------------------------------------------------

-----------------------------------------------------------------------
      Layer (type)        Output Shape         Param #     Tr. Param #
=======================================================================
          Conv2d-1     [1, 10, 24, 24]             260             260
          Conv2d-2       [1, 20, 8, 8]           5,020           5,020
       Dropout2d-3       [1, 20, 8, 8]               0               0
          Linear-4             [1, 50]          16,050          16,050
          Linear-5             [1, 10]             510             510
=======================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
-----------------------------------------------------------------------

-----------------------------------------------------------------------
      Layer (type)        Output Shape         Param #     Tr. Param #
=======================================================================
          Conv2d-1     [1, 10, 24, 24]             260             260
          Conv2d-2       [1, 20, 8, 8]           5,020           5,020
       Dropout2d-3       [1, 20, 8, 8]               0               0
          Linear-4             [1, 50]          16,050          16,050
          Linear-5             [1, 10]             510             510
=======================================================================
Total params: 21,840
Trainable params: 21,840
Non-trainable params: 0
-----------------------------------------------------------------------
=========================== Hierarchical Summary ===========================
Net(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1)), 260 params
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1)), 5,020 params
  (conv2_drop): Dropout2d(p=0.5), 0 params
  (fc1): Linear(in_features=320, out_features=50, bias=True), 16,050 params
  (fc2): Linear(in_features=50, out_features=10, bias=True), 510 params
), 21,840 params
============================================================================
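The figures in the tables above can be checked by hand. The following is a minimal sketch in plain Python and is not part of the library; the helper names `conv2d_params`, `linear_params`, and `conv_out` are made up for illustration, and the formulas assume stride 1, no padding, and a bias term, which matches the `Net` defined above.

```python
def conv2d_params(in_ch, out_ch, k):
    # one k x k kernel per (input channel, output channel) pair, plus one bias per output channel
    return in_ch * out_ch * k * k + out_ch


def linear_params(in_f, out_f):
    # weight matrix plus one bias per output feature
    return in_f * out_f + out_f


def conv_out(size, k):
    # spatial size after a convolution with stride 1 and no padding
    return size - k + 1


params = {
    "conv1": conv2d_params(1, 10, 5),
    "conv2": conv2d_params(10, 20, 5),
    "conv2_drop": 0,  # dropout has no parameters
    "fc1": linear_params(320, 50),
    "fc2": linear_params(50, 10),
}
total = sum(params.values())

side = conv_out(28, 5)    # conv1: 28 -> 24
after_conv1 = side
side //= 2                # max_pool2d(2): 24 -> 12
side = conv_out(side, 5)  # conv2: 12 -> 8
after_conv2 = side
side //= 2                # max_pool2d(2): 8 -> 4
flat_features = 20 * side * side  # 20 channels of 4 x 4

print(params)
print(total, after_conv1, after_conv2, flat_features)  # 21840 24 8 320
```

The value 320 is what `x.view(-1, 320)` and `nn.Linear(320, 50)` rely on; changing the input size from 28 x 28 would require changing it as well.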

Quick Start

Install with pip:

pip install pytorch-model-summary

Then import the function directly:

from pytorch_model_summary import summary

or import the package under an alias to avoid name conflicts with other functions in your code:

import pytorch_model_summary as pms
pms.summary([params])

This covers basic usage. For more detail, see the examples below.

Examples using different sets of parameters

Run the example using the Transformer model from the paper "Attention Is All You Need" (2017)

showing input shape

# show input shape
pms.summary(model, enc_inputs, dec_inputs, show_input=True, print_summary=True)

-----------------------------------------------------------------------------------
      Layer (type)                     Input Shape         Param #     Tr. Param #
===================================================================================
         Encoder-1                          [1, 5]      17,332,224      17,329,152
         Decoder-2     [1, 5], [1, 5], [1, 5, 512]      22,060,544      22,057,472
          Linear-3                     [1, 5, 512]           3,584           3,584
===================================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
-----------------------------------------------------------------------------------
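A gap between Param # and Tr. Param #, as in the table above, appears when some parameters are excluded from training. One common way this happens in PyTorch is `requires_grad=False`; whether the Transformer example freezes its parameters exactly this way is an assumption. The sketch below uses a hypothetical `TinyEncoder` module and plain PyTorch calls to show the two counts diverging.

```python
import torch.nn as nn


class TinyEncoder(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        self.pos_emb = nn.Embedding(6, 8)  # 6 * 8 = 48 parameters
        self.proj = nn.Linear(8, 3)        # 8 * 3 + 3 = 27 parameters
        # freeze the embedding table so it is not updated during training
        self.pos_emb.weight.requires_grad = False


model = TinyEncoder()
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(total, trainable, total - trainable)  # 75 27 48
```

Here the module as a whole still has trainable parameters, but only 27 of its 75 parameters would be counted as trainable.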

showing output shape and batch_size in the table, plus the hierarchical summary

# show output shape and batch_size in the table, plus the hierarchical summary
pms.summary(model, enc_inputs, dec_inputs, batch_size=1, show_hierarchical=True, print_summary=True)

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      Layer (type)                                                                                                                                                                            Output Shape         Param #     Tr. Param #
===========================================================================================================================================================================================================================================
         Encoder-1                                                                                         [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5]      17,332,224      17,329,152
         Decoder-2     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5], [1, 8, 5, 5]      22,060,544      22,057,472
          Linear-3                                                                                                                                                                               [1, 5, 7]           3,584           3,584
===========================================================================================================================================================================================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
Batch size: 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


================================ Hierarchical Summary ================================

Transformer(
  (encoder): Encoder(
    (src_emb): Embedding(6, 512), 3,072 params
    (pos_emb): Embedding(6, 512), 3,072 params
    (layers): ModuleList(
      (0): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (1): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (2): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (3): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (4): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (5): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
    ), 17,326,080 params
  ), 17,332,224 params
  (decoder): Decoder(
    (tgt_emb): Embedding(7, 512), 3,584 params
    (pos_emb): Embedding(6, 512), 3,072 params
    (layers): ModuleList(
      (0): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (1): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (2): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (3): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (4): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (5): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
    ), 22,053,888 params
  ), 22,060,544 params
  (projection): Linear(in_features=512, out_features=7, bias=False), 3,584 params
), 39,396,352 params


======================================================================================
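The per-layer counts in the hierarchical summary add up to the reported totals. The following plain-Python check is only a sketch: the helper names are hypothetical, and the layer sizes are read directly from the printed summary above.

```python
def linear_params(in_f, out_f, bias=True):
    return in_f * out_f + (out_f if bias else 0)


def conv1d_params(in_ch, out_ch, k):
    return in_ch * out_ch * k + out_ch


def embedding_params(num_embeddings, dim):
    return num_embeddings * dim


attention = 3 * linear_params(512, 512)  # W_Q, W_K, W_V
ffn = conv1d_params(512, 2048, 1) + conv1d_params(2048, 512, 1)
encoder_layer = attention + ffn          # one self-attention block
decoder_layer = 2 * attention + ffn      # self-attention plus encoder-decoder attention

encoder = 2 * embedding_params(6, 512) + 6 * encoder_layer
decoder = embedding_params(7, 512) + embedding_params(6, 512) + 6 * decoder_layer
projection = linear_params(512, 7, bias=False)
total = encoder + decoder + projection

# the reported non-trainable count equals the size of the two pos_emb tables
non_trainable = 2 * embedding_params(6, 512)

print(attention, ffn, encoder_layer, decoder_layer)
print(encoder, decoder, projection, total)
print(non_trainable, total - non_trainable)
```

The printed values match the summary: 787,968 and 2,099,712 per sub-block, 17,332,224 and 22,060,544 for the encoder and decoder, and 39,396,352 in total, of which 6,144 are non-trainable.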

showing layers until depth 2

# show layers until depth 2
pms.summary(model, enc_inputs, dec_inputs, max_depth=2, print_summary=True)

-----------------------------------------------------------------------------------------------
      Layer (type)                                Output Shape         Param #     Tr. Param #
===============================================================================================
       Embedding-1                                 [1, 5, 512]           3,072           3,072
       Embedding-2                                 [1, 5, 512]           3,072               0
    EncoderLayer-3                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
    EncoderLayer-4                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
    EncoderLayer-5                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
    EncoderLayer-6                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
    EncoderLayer-7                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
    EncoderLayer-8                   [1, 5, 512], [1, 8, 5, 5]       2,887,680       2,887,680
       Embedding-9                                 [1, 5, 512]           3,584           3,584
      Embedding-10                                 [1, 5, 512]           3,072               0
   DecoderLayer-11     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
   DecoderLayer-12     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
   DecoderLayer-13     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
   DecoderLayer-14     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
   DecoderLayer-15     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
   DecoderLayer-16     [1, 5, 512], [1, 8, 5, 5], [1, 8, 5, 5]       3,675,648       3,675,648
         Linear-17                                   [1, 5, 7]           3,584           3,584
===============================================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
-----------------------------------------------------------------------------------------------
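The effect of max_depth can be pictured as a depth-limited walk over the module tree. The sketch below is not the library's code: `layers_at_depth`, `Block`, and `Tiny` are hypothetical names, the walk follows registration order rather than call order, and treating `nn.ModuleList` as transparent is an assumption inferred from the table above, where `EncoderLayer` rows appear at depth 2 even though they sit inside a `ModuleList`.

```python
import torch.nn as nn


def layers_at_depth(module, max_depth, depth=0):
    """Hypothetical helper: class names of the modules a depth-limited walk stops at."""
    children = list(module.children())
    is_leaf = not children
    reached_limit = max_depth is not None and depth >= max_depth
    if depth > 0 and (is_leaf or reached_limit):
        return [type(module).__name__]
    names = []
    for child in children:
        if isinstance(child, nn.ModuleList):
            # assumption: a ModuleList does not count as a level of its own
            for sub in child.children():
                names += layers_at_depth(sub, max_depth, depth + 1)
        else:
            names += layers_at_depth(child, max_depth, depth + 1)
    return names


class Block(nn.Module):  # hypothetical user-defined layer
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 4)


class Tiny(nn.Module):  # hypothetical model
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(6, 4)
        self.layers = nn.ModuleList([Block(), Block()])
        self.out = nn.Linear(4, 2)


model = Tiny()
print(layers_at_depth(model, 1))     # ['Embedding', 'Block', 'Block', 'Linear']
print(layers_at_depth(model, 2))     # ['Embedding', 'Linear', 'Linear', 'Linear', 'Linear', 'Linear']
print(layers_at_depth(model, None))  # same as depth 2: every branch ends in a leaf
```

With a depth of 1 the user-defined `Block` layers are listed as single rows; one level deeper they are replaced by the layers they contain.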

showing deepest layers

# show deepest layers
pms.summary(model, enc_inputs, dec_inputs, max_depth=None, print_summary=True)

-----------------------------------------------------------------------
      Layer (type)        Output Shape         Param #     Tr. Param #
=======================================================================
       Embedding-1         [1, 5, 512]           3,072           3,072
       Embedding-2         [1, 5, 512]           3,072               0
          Linear-3         [1, 5, 512]         262,656         262,656
          Linear-4         [1, 5, 512]         262,656         262,656
          Linear-5         [1, 5, 512]         262,656         262,656
          Conv1d-6        [1, 2048, 5]       1,050,624       1,050,624
          Conv1d-7         [1, 512, 5]       1,049,088       1,049,088
          Linear-8         [1, 5, 512]         262,656         262,656
          Linear-9         [1, 5, 512]         262,656         262,656
         Linear-10         [1, 5, 512]         262,656         262,656
         Conv1d-11        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-12         [1, 512, 5]       1,049,088       1,049,088
         Linear-13         [1, 5, 512]         262,656         262,656
         Linear-14         [1, 5, 512]         262,656         262,656
         Linear-15         [1, 5, 512]         262,656         262,656
         Conv1d-16        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-17         [1, 512, 5]       1,049,088       1,049,088
         Linear-18         [1, 5, 512]         262,656         262,656
         Linear-19         [1, 5, 512]         262,656         262,656
         Linear-20         [1, 5, 512]         262,656         262,656
         Conv1d-21        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-22         [1, 512, 5]       1,049,088       1,049,088
         Linear-23         [1, 5, 512]         262,656         262,656
         Linear-24         [1, 5, 512]         262,656         262,656
         Linear-25         [1, 5, 512]         262,656         262,656
         Conv1d-26        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-27         [1, 512, 5]       1,049,088       1,049,088
         Linear-28         [1, 5, 512]         262,656         262,656
         Linear-29         [1, 5, 512]         262,656         262,656
         Linear-30         [1, 5, 512]         262,656         262,656
         Conv1d-31        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-32         [1, 512, 5]       1,049,088       1,049,088
      Embedding-33         [1, 5, 512]           3,584           3,584
      Embedding-34         [1, 5, 512]           3,072               0
         Linear-35         [1, 5, 512]         262,656         262,656
         Linear-36         [1, 5, 512]         262,656         262,656
         Linear-37         [1, 5, 512]         262,656         262,656
         Linear-38         [1, 5, 512]         262,656         262,656
         Linear-39         [1, 5, 512]         262,656         262,656
         Linear-40         [1, 5, 512]         262,656         262,656
         Conv1d-41        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-42         [1, 512, 5]       1,049,088       1,049,088
         Linear-43         [1, 5, 512]         262,656         262,656
         Linear-44         [1, 5, 512]         262,656         262,656
         Linear-45         [1, 5, 512]         262,656         262,656
         Linear-46         [1, 5, 512]         262,656         262,656
         Linear-47         [1, 5, 512]         262,656         262,656
         Linear-48         [1, 5, 512]         262,656         262,656
         Conv1d-49        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-50         [1, 512, 5]       1,049,088       1,049,088
         Linear-51         [1, 5, 512]         262,656         262,656
         Linear-52         [1, 5, 512]         262,656         262,656
         Linear-53         [1, 5, 512]         262,656         262,656
         Linear-54         [1, 5, 512]         262,656         262,656
         Linear-55         [1, 5, 512]         262,656         262,656
         Linear-56         [1, 5, 512]         262,656         262,656
         Conv1d-57        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-58         [1, 512, 5]       1,049,088       1,049,088
         Linear-59         [1, 5, 512]         262,656         262,656
         Linear-60         [1, 5, 512]         262,656         262,656
         Linear-61         [1, 5, 512]         262,656         262,656
         Linear-62         [1, 5, 512]         262,656         262,656
         Linear-63         [1, 5, 512]         262,656         262,656
         Linear-64         [1, 5, 512]         262,656         262,656
         Conv1d-65        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-66         [1, 512, 5]       1,049,088       1,049,088
         Linear-67         [1, 5, 512]         262,656         262,656
         Linear-68         [1, 5, 512]         262,656         262,656
         Linear-69         [1, 5, 512]         262,656         262,656
         Linear-70         [1, 5, 512]         262,656         262,656
         Linear-71         [1, 5, 512]         262,656         262,656
         Linear-72         [1, 5, 512]         262,656         262,656
         Conv1d-73        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-74         [1, 512, 5]       1,049,088       1,049,088
         Linear-75         [1, 5, 512]         262,656         262,656
         Linear-76         [1, 5, 512]         262,656         262,656
         Linear-77         [1, 5, 512]         262,656         262,656
         Linear-78         [1, 5, 512]         262,656         262,656
         Linear-79         [1, 5, 512]         262,656         262,656
         Linear-80         [1, 5, 512]         262,656         262,656
         Conv1d-81        [1, 2048, 5]       1,050,624       1,050,624
         Conv1d-82         [1, 512, 5]       1,049,088       1,049,088
         Linear-83           [1, 5, 7]           3,584           3,584
=======================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
-----------------------------------------------------------------------

showing layers until depth 3 and adding column with parent layers

# show layers until depth 3 and add column with parent layers
pms.summary(model, enc_inputs, dec_inputs, max_depth=3, show_parent_layers=True, print_summary=True)

-----------------------------------------------------------------------------------------------------------------------------
                      Parent Layers                Layer (type)                  Output Shape         Param #     Tr. Param #
=============================================================================================================================
                Transformer/Encoder                 Embedding-1                   [1, 5, 512]           3,072           3,072
                Transformer/Encoder                 Embedding-2                   [1, 5, 512]           3,072               0
   Transformer/Encoder/EncoderLayer        MultiHeadAttention-3     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer     PoswiseFeedForwardNet-4                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Encoder/EncoderLayer        MultiHeadAttention-5     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer     PoswiseFeedForwardNet-6                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Encoder/EncoderLayer        MultiHeadAttention-7     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer     PoswiseFeedForwardNet-8                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Encoder/EncoderLayer        MultiHeadAttention-9     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer    PoswiseFeedForwardNet-10                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Encoder/EncoderLayer       MultiHeadAttention-11     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer    PoswiseFeedForwardNet-12                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Encoder/EncoderLayer       MultiHeadAttention-13     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Encoder/EncoderLayer    PoswiseFeedForwardNet-14                   [1, 5, 512]       2,099,712       2,099,712
                Transformer/Decoder                Embedding-15                   [1, 5, 512]           3,584           3,584
                Transformer/Decoder                Embedding-16                   [1, 5, 512]           3,072               0
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-17     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-18     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-19                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-20     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-21     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-22                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-23     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-24     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-25                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-26     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-27     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-28                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-29     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-30     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-31                   [1, 5, 512]       2,099,712       2,099,712
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-32     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer       MultiHeadAttention-33     [1, 5, 512], [1, 8, 5, 5]         787,968         787,968
   Transformer/Decoder/DecoderLayer    PoswiseFeedForwardNet-34                   [1, 5, 512]       2,099,712       2,099,712
                        Transformer                   Linear-35                     [1, 5, 7]           3,584           3,584
=============================================================================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
-----------------------------------------------------------------------------------------------------------------------------
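The per-layer counts in the table can be reproduced with simple arithmetic. The sketch below is only a consistency check, not code from the library: the helper names are hypothetical, and the layer shapes (a model width of 512, a feed-forward width of 2048, three biased projections per attention block, a bias-free output projection to 7 classes, and a frozen 6-row embedding) are assumptions inferred from the output shapes and counts shown above.

```python
# Hypothetical helpers; the layer shapes used below are assumptions inferred
# from the summary table, not read from the model's source code.

def linear_params(n_in, n_out, bias=True):
    """Parameter count of a dense layer (or a kernel-size-1 convolution)."""
    return n_in * n_out + (n_out if bias else 0)


def embedding_params(num_embeddings, dim):
    """Parameter count of an embedding table."""
    return num_embeddings * dim


D_MODEL, D_FF = 512, 2048  # assumed widths

# Three biased 512 -> 512 projections (assumed: query, key, value).
attention = 3 * linear_params(D_MODEL, D_MODEL)
# Two position-wise layers: 512 -> 2048 and 2048 -> 512, both with bias.
feed_forward = linear_params(D_MODEL, D_FF) + linear_params(D_FF, D_MODEL)
# Final projection to 7 outputs, assumed to have no bias.
output_proj = linear_params(D_MODEL, 7, bias=False)
# Frozen embedding with 6 rows: counted in Params but not in Tr. Params.
frozen_embedding = embedding_params(6, D_MODEL)

print(attention, feed_forward, output_proj, frozen_embedding)

# The footer totals should add up.
total, trainable, non_trainable = 39_396_352, 39_390_208, 6_144
assert trainable + non_trainable == total
```

Under these assumptions the four values match the MultiHeadAttention, PoswiseFeedForwardNet, Linear-35 and Embedding-16 rows, and the footer satisfies Trainable + Non-trainable = Total.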

Reference

code_reference = {'https://github.com/graykode/modelsummary',
                  'https://github.com/pytorch/pytorch/issues/2001',
                  'https://gist.github.com/HTLife/b6640af9d6e7d765411f8aa9aa94b837',
                  'https://github.com/sksq96/pytorch-summary',
                  'Inspired by https://github.com/sksq96/pytorch-summary'}

Project details

Release history

0.1.2 (this version): Aug 30, 2020
0.1.1: Dec 24, 2019
0.1: Dec 23, 2019

Download files

Download the file for your platform.

Source Distribution

pytorch_model_summary-0.1.2.tar.gz (14.9 kB)

Uploaded Aug 30, 2020 Source

Built Distribution

pytorch_model_summary-0.1.2-py3-none-any.whl (9.3 kB)

Uploaded Aug 30, 2020 Python 3

File details

Details for the file pytorch_model_summary-0.1.2.tar.gz.

File metadata

Download URL: pytorch_model_summary-0.1.2.tar.gz
Upload date: Aug 30, 2020
Size: 14.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/49.2.0.post20200714 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for pytorch_model_summary-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`1008a4b94943f1e534d08889dcf77861beb7796ae7193ab90677e50513c250fe`
MD5	`37fb4ae8b8a9f8307b417a0a28a531e0`
BLAKE2b-256	`65279d203b258a2a0cde53eabecadfd865e0d64de370bb87b4d013c4dcf091dd`

File details

Details for the file pytorch_model_summary-0.1.2-py3-none-any.whl.

File metadata

Download URL: pytorch_model_summary-0.1.2-py3-none-any.whl
Upload date: Aug 30, 2020
Size: 9.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/49.2.0.post20200714 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for pytorch_model_summary-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ba71f46038a2c6d71e1275f2e61ff0bb367f54dc8df942a5643a5bbd280482db`
MD5	`b2023742aaf7f94c75e79f96bdcdd935`
BLAKE2b-256	`fe4501d67be55fe3683a9221ac956ba46d1ca32da7bf96029b8d1c7667b8a55c`
