
## modelsummary (PyTorch Model Summary)

> Keras-style `model.summary()` in PyTorch, in the spirit of [torchsummary](https://github.com/sksq96/pytorch-summary)

This is a PyTorch library for model visualization, an improved take on [torchsummary](https://github.com/sksq96/pytorch-summary) and [torchsummaryX](https://github.com/nmhkahn/torchsummaryX). I was inspired by [torchsummary](https://github.com/sksq96/pytorch-summary), and the code I referred to is listed under Reference below. **It works with any number of input tensors** (see the two-input sketch in the Quick Start below)!



## Quick Start

Install it with pip:

`pip install modelsummary`, then `from modelsummary import summary`

You can use the library as shown below; for more detail, see the example code.

```python
from modelsummary import summary

model = your_model_name()

# show input shapes
summary(model, *your_input_tensors, intputshow=True)

# show output shapes
summary(model, *your_input_tensors, outputshow=True)

# show hierarchical structure
summary(model, *your_input_tensors, hierarchical=True)
```
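
Because `summary` takes `*inputs`, models with several inputs work too. Here is a minimal sketch; the `TwoInputNet` module and its dummy tensors are hypothetical, made up for illustration, while the `summary` call follows the library's documented signature:

```python
import torch
import torch.nn as nn
from modelsummary import summary

# Hypothetical two-input model: summary() accepts any number of
# input tensors after the model argument.
class TwoInputNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_a = nn.Linear(8, 4)
        self.fc_b = nn.Linear(6, 4)

    def forward(self, a, b):
        return self.fc_a(a) + self.fc_b(b)

summary(TwoInputNet(), torch.zeros(1, 8), torch.zeros(1, 6), intputshow=True)
```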



The `summary` function has the following signature (a full usage sketch follows the options list below): `def summary(model, *inputs, batch_size=-1, intputshow=True, outputshow=False, hierarchical=False)`

#### Options

- model : your model instance
- *inputs : your model's input tensors (variadic, hence the asterisk)
- batch_size : `-1` stands for a `None` (unspecified) batch dimension
- intputshow : show each layer's input shape (this spelling is the parameter's actual name), **default : True**
- outputshow : show each layer's output shape, **default : False**
- hierarchical : show the hierarchical module structure, **default : False**
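
As a concrete end-to-end sketch of these options: `SmallNet` and its dummy input below are made up for illustration, the `summary` calls follow the documented signature, and the `batch_size` line assumes torchsummary-like behavior (replacing the `-1` batch placeholder in the printed shapes):

```python
import torch
import torch.nn as nn
from modelsummary import summary

# Hypothetical toy model used only to demonstrate the options above.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16 * 32 * 32, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return self.fc(x.flatten(1))

model = SmallNet()
dummy = torch.zeros(2, 3, 32, 32)  # (batch, channels, height, width)

summary(model, dummy, intputshow=True)    # per-layer input shapes
summary(model, dummy, outputshow=True)    # per-layer output shapes
summary(model, dummy, hierarchical=True)  # nested module tree

# Assumption: batch_size replaces the -1 placeholder in the printed shapes.
summary(model, dummy, batch_size=2, intputshow=True)
```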



## Result

Run the example using the Transformer model from [Attention Is All You Need (2017)](https://arxiv.org/abs/1706.03762):

1) Showing input shapes

```python
# show input shape
summary(model, enc_inputs, dec_inputs, intputshow=True)

-----------------------------------------------------------------------
Layer (type) Input Shape Param #
=======================================================================
Encoder-1 [-1, 5] 0
Embedding-2 [-1, 5] 3,072
Embedding-3 [-1, 5] 3,072
EncoderLayer-4 [-1, 5, 512] 0
MultiHeadAttention-5 [-1, 5, 512] 0
Linear-6 [-1, 5, 512] 262,656
Linear-7 [-1, 5, 512] 262,656
Linear-8 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-9 [-1, 5, 512] 0
Conv1d-10 [-1, 512, 5] 1,050,624
Conv1d-11 [-1, 2048, 5] 1,049,088
EncoderLayer-12 [-1, 5, 512] 0
MultiHeadAttention-13 [-1, 5, 512] 0
Linear-14 [-1, 5, 512] 262,656
Linear-15 [-1, 5, 512] 262,656
Linear-16 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-17 [-1, 5, 512] 0
Conv1d-18 [-1, 512, 5] 1,050,624
Conv1d-19 [-1, 2048, 5] 1,049,088
EncoderLayer-20 [-1, 5, 512] 0
MultiHeadAttention-21 [-1, 5, 512] 0
Linear-22 [-1, 5, 512] 262,656
Linear-23 [-1, 5, 512] 262,656
Linear-24 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-25 [-1, 5, 512] 0
Conv1d-26 [-1, 512, 5] 1,050,624
Conv1d-27 [-1, 2048, 5] 1,049,088
EncoderLayer-28 [-1, 5, 512] 0
MultiHeadAttention-29 [-1, 5, 512] 0
Linear-30 [-1, 5, 512] 262,656
Linear-31 [-1, 5, 512] 262,656
Linear-32 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-33 [-1, 5, 512] 0
Conv1d-34 [-1, 512, 5] 1,050,624
Conv1d-35 [-1, 2048, 5] 1,049,088
EncoderLayer-36 [-1, 5, 512] 0
MultiHeadAttention-37 [-1, 5, 512] 0
Linear-38 [-1, 5, 512] 262,656
Linear-39 [-1, 5, 512] 262,656
Linear-40 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-41 [-1, 5, 512] 0
Conv1d-42 [-1, 512, 5] 1,050,624
Conv1d-43 [-1, 2048, 5] 1,049,088
EncoderLayer-44 [-1, 5, 512] 0
MultiHeadAttention-45 [-1, 5, 512] 0
Linear-46 [-1, 5, 512] 262,656
Linear-47 [-1, 5, 512] 262,656
Linear-48 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-49 [-1, 5, 512] 0
Conv1d-50 [-1, 512, 5] 1,050,624
Conv1d-51 [-1, 2048, 5] 1,049,088
Decoder-52 [-1, 5] 0
Embedding-53 [-1, 5] 3,584
Embedding-54 [-1, 5] 3,072
DecoderLayer-55 [-1, 5, 512] 0
MultiHeadAttention-56 [-1, 5, 512] 0
Linear-57 [-1, 5, 512] 262,656
Linear-58 [-1, 5, 512] 262,656
Linear-59 [-1, 5, 512] 262,656
MultiHeadAttention-60 [-1, 5, 512] 0
Linear-61 [-1, 5, 512] 262,656
Linear-62 [-1, 5, 512] 262,656
Linear-63 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-64 [-1, 5, 512] 0
Conv1d-65 [-1, 512, 5] 1,050,624
Conv1d-66 [-1, 2048, 5] 1,049,088
DecoderLayer-67 [-1, 5, 512] 0
MultiHeadAttention-68 [-1, 5, 512] 0
Linear-69 [-1, 5, 512] 262,656
Linear-70 [-1, 5, 512] 262,656
Linear-71 [-1, 5, 512] 262,656
MultiHeadAttention-72 [-1, 5, 512] 0
Linear-73 [-1, 5, 512] 262,656
Linear-74 [-1, 5, 512] 262,656
Linear-75 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-76 [-1, 5, 512] 0
Conv1d-77 [-1, 512, 5] 1,050,624
Conv1d-78 [-1, 2048, 5] 1,049,088
DecoderLayer-79 [-1, 5, 512] 0
MultiHeadAttention-80 [-1, 5, 512] 0
Linear-81 [-1, 5, 512] 262,656
Linear-82 [-1, 5, 512] 262,656
Linear-83 [-1, 5, 512] 262,656
MultiHeadAttention-84 [-1, 5, 512] 0
Linear-85 [-1, 5, 512] 262,656
Linear-86 [-1, 5, 512] 262,656
Linear-87 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-88 [-1, 5, 512] 0
Conv1d-89 [-1, 512, 5] 1,050,624
Conv1d-90 [-1, 2048, 5] 1,049,088
DecoderLayer-91 [-1, 5, 512] 0
MultiHeadAttention-92 [-1, 5, 512] 0
Linear-93 [-1, 5, 512] 262,656
Linear-94 [-1, 5, 512] 262,656
Linear-95 [-1, 5, 512] 262,656
MultiHeadAttention-96 [-1, 5, 512] 0
Linear-97 [-1, 5, 512] 262,656
Linear-98 [-1, 5, 512] 262,656
Linear-99 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-100 [-1, 5, 512] 0
Conv1d-101 [-1, 512, 5] 1,050,624
Conv1d-102 [-1, 2048, 5] 1,049,088
DecoderLayer-103 [-1, 5, 512] 0
MultiHeadAttention-104 [-1, 5, 512] 0
Linear-105 [-1, 5, 512] 262,656
Linear-106 [-1, 5, 512] 262,656
Linear-107 [-1, 5, 512] 262,656
MultiHeadAttention-108 [-1, 5, 512] 0
Linear-109 [-1, 5, 512] 262,656
Linear-110 [-1, 5, 512] 262,656
Linear-111 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-112 [-1, 5, 512] 0
Conv1d-113 [-1, 512, 5] 1,050,624
Conv1d-114 [-1, 2048, 5] 1,049,088
DecoderLayer-115 [-1, 5, 512] 0
MultiHeadAttention-116 [-1, 5, 512] 0
Linear-117 [-1, 5, 512] 262,656
Linear-118 [-1, 5, 512] 262,656
Linear-119 [-1, 5, 512] 262,656
MultiHeadAttention-120 [-1, 5, 512] 0
Linear-121 [-1, 5, 512] 262,656
Linear-122 [-1, 5, 512] 262,656
Linear-123 [-1, 5, 512] 262,656
PoswiseFeedForwardNet-124 [-1, 5, 512] 0
Conv1d-125 [-1, 512, 5] 1,050,624
Conv1d-126 [-1, 2048, 5] 1,049,088
Linear-127 [-1, 5, 512] 3,584
=======================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
```
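
Several of the per-layer counts above can be verified by hand. A quick arithmetic check (ordinary Python, not library output; the note about which parameters are frozen is an inference from the counts):

```python
# Reproducing a few per-layer parameter counts from the table above.
assert 512 * 512 + 512 == 262_656      # Linear(512, 512) with bias
assert 512 * 2048 + 2048 == 1_050_624  # Conv1d(512 -> 2048, kernel_size=1)
assert 2048 * 512 + 512 == 1_049_088   # Conv1d(2048 -> 512, kernel_size=1)
assert 6 * 512 == 3_072                # Embedding(6, 512)

# The 6,144 non-trainable parameters match the two positional-embedding
# tables (3,072 each) -- presumably frozen sinusoidal encodings.
assert 39_396_352 - 2 * 3_072 == 39_390_208  # total - non-trainable = trainable
```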

2) Showing output shapes (the `[-1, 8, 5, 5]` rows appear to be the attention maps returned by `MultiHeadAttention`: batch × 8 heads × query length × key length)

```python
# show output shape
summary(model, enc_inputs, dec_inputs, intputshow=False)

-----------------------------------------------------------------------
Layer (type) Output Shape Param #
=======================================================================
Embedding-1 [-1, 5, 512] 3,072
Embedding-2 [-1, 5, 512] 3,072
Linear-3 [-1, 5, 512] 262,656
Linear-4 [-1, 5, 512] 262,656
Linear-5 [-1, 5, 512] 262,656
MultiHeadAttention-6 [-1, 8, 5, 5] 0
Conv1d-7 [-1, 2048, 5] 1,050,624
Conv1d-8 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-9 [-1, 5, 512] 0
EncoderLayer-10 [-1, 8, 5, 5] 0
Linear-11 [-1, 5, 512] 262,656
Linear-12 [-1, 5, 512] 262,656
Linear-13 [-1, 5, 512] 262,656
MultiHeadAttention-14 [-1, 8, 5, 5] 0
Conv1d-15 [-1, 2048, 5] 1,050,624
Conv1d-16 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-17 [-1, 5, 512] 0
EncoderLayer-18 [-1, 8, 5, 5] 0
Linear-19 [-1, 5, 512] 262,656
Linear-20 [-1, 5, 512] 262,656
Linear-21 [-1, 5, 512] 262,656
MultiHeadAttention-22 [-1, 8, 5, 5] 0
Conv1d-23 [-1, 2048, 5] 1,050,624
Conv1d-24 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-25 [-1, 5, 512] 0
EncoderLayer-26 [-1, 8, 5, 5] 0
Linear-27 [-1, 5, 512] 262,656
Linear-28 [-1, 5, 512] 262,656
Linear-29 [-1, 5, 512] 262,656
MultiHeadAttention-30 [-1, 8, 5, 5] 0
Conv1d-31 [-1, 2048, 5] 1,050,624
Conv1d-32 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-33 [-1, 5, 512] 0
EncoderLayer-34 [-1, 8, 5, 5] 0
Linear-35 [-1, 5, 512] 262,656
Linear-36 [-1, 5, 512] 262,656
Linear-37 [-1, 5, 512] 262,656
MultiHeadAttention-38 [-1, 8, 5, 5] 0
Conv1d-39 [-1, 2048, 5] 1,050,624
Conv1d-40 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-41 [-1, 5, 512] 0
EncoderLayer-42 [-1, 8, 5, 5] 0
Linear-43 [-1, 5, 512] 262,656
Linear-44 [-1, 5, 512] 262,656
Linear-45 [-1, 5, 512] 262,656
MultiHeadAttention-46 [-1, 8, 5, 5] 0
Conv1d-47 [-1, 2048, 5] 1,050,624
Conv1d-48 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-49 [-1, 5, 512] 0
EncoderLayer-50 [-1, 8, 5, 5] 0
Encoder-51 [[-1, 8, 5, 5]] 0
Embedding-52 [-1, 5, 512] 3,584
Embedding-53 [-1, 5, 512] 3,072
Linear-54 [-1, 5, 512] 262,656
Linear-55 [-1, 5, 512] 262,656
Linear-56 [-1, 5, 512] 262,656
MultiHeadAttention-57 [-1, 8, 5, 5] 0
Linear-58 [-1, 5, 512] 262,656
Linear-59 [-1, 5, 512] 262,656
Linear-60 [-1, 5, 512] 262,656
MultiHeadAttention-61 [-1, 8, 5, 5] 0
Conv1d-62 [-1, 2048, 5] 1,050,624
Conv1d-63 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-64 [-1, 5, 512] 0
DecoderLayer-65 [-1, 8, 5, 5] 0
Linear-66 [-1, 5, 512] 262,656
Linear-67 [-1, 5, 512] 262,656
Linear-68 [-1, 5, 512] 262,656
MultiHeadAttention-69 [-1, 8, 5, 5] 0
Linear-70 [-1, 5, 512] 262,656
Linear-71 [-1, 5, 512] 262,656
Linear-72 [-1, 5, 512] 262,656
MultiHeadAttention-73 [-1, 8, 5, 5] 0
Conv1d-74 [-1, 2048, 5] 1,050,624
Conv1d-75 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-76 [-1, 5, 512] 0
DecoderLayer-77 [-1, 8, 5, 5] 0
Linear-78 [-1, 5, 512] 262,656
Linear-79 [-1, 5, 512] 262,656
Linear-80 [-1, 5, 512] 262,656
MultiHeadAttention-81 [-1, 8, 5, 5] 0
Linear-82 [-1, 5, 512] 262,656
Linear-83 [-1, 5, 512] 262,656
Linear-84 [-1, 5, 512] 262,656
MultiHeadAttention-85 [-1, 8, 5, 5] 0
Conv1d-86 [-1, 2048, 5] 1,050,624
Conv1d-87 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-88 [-1, 5, 512] 0
DecoderLayer-89 [-1, 8, 5, 5] 0
Linear-90 [-1, 5, 512] 262,656
Linear-91 [-1, 5, 512] 262,656
Linear-92 [-1, 5, 512] 262,656
MultiHeadAttention-93 [-1, 8, 5, 5] 0
Linear-94 [-1, 5, 512] 262,656
Linear-95 [-1, 5, 512] 262,656
Linear-96 [-1, 5, 512] 262,656
MultiHeadAttention-97 [-1, 8, 5, 5] 0
Conv1d-98 [-1, 2048, 5] 1,050,624
Conv1d-99 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-100 [-1, 5, 512] 0
DecoderLayer-101 [-1, 8, 5, 5] 0
Linear-102 [-1, 5, 512] 262,656
Linear-103 [-1, 5, 512] 262,656
Linear-104 [-1, 5, 512] 262,656
MultiHeadAttention-105 [-1, 8, 5, 5] 0
Linear-106 [-1, 5, 512] 262,656
Linear-107 [-1, 5, 512] 262,656
Linear-108 [-1, 5, 512] 262,656
MultiHeadAttention-109 [-1, 8, 5, 5] 0
Conv1d-110 [-1, 2048, 5] 1,050,624
Conv1d-111 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-112 [-1, 5, 512] 0
DecoderLayer-113 [-1, 8, 5, 5] 0
Linear-114 [-1, 5, 512] 262,656
Linear-115 [-1, 5, 512] 262,656
Linear-116 [-1, 5, 512] 262,656
MultiHeadAttention-117 [-1, 8, 5, 5] 0
Linear-118 [-1, 5, 512] 262,656
Linear-119 [-1, 5, 512] 262,656
Linear-120 [-1, 5, 512] 262,656
MultiHeadAttention-121 [-1, 8, 5, 5] 0
Conv1d-122 [-1, 2048, 5] 1,050,624
Conv1d-123 [-1, 512, 5] 1,049,088
PoswiseFeedForwardNet-124 [-1, 5, 512] 0
DecoderLayer-125 [-1, 8, 5, 5] 0
Decoder-126 [[-1, 8, 5, 5]] 0
Linear-127 [-1, 5, 7] 3,584
=======================================================================
Total params: 39,396,352
Trainable params: 39,390,208
Non-trainable params: 6,144
-----------------------------------------------------------------------
```

3) Showing the hierarchical summary

```python
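# show hierarchical structure
summary(model, enc_inputs, dec_inputs, hierarchical=True)
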
Transformer(
  (encoder): Encoder(
    (src_emb): Embedding(6, 512), 3,072 params
    (pos_emb): Embedding(6, 512), 3,072 params
    (layers): ModuleList(
      (0): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (1): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (2): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (3): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (4): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
      (5): EncoderLayer(
        (enc_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 2,887,680 params
    ), 17,326,080 params
  ), 17,332,224 params
  (decoder): Decoder(
    (tgt_emb): Embedding(7, 512), 3,584 params
    (pos_emb): Embedding(6, 512), 3,072 params
    (layers): ModuleList(
      (0): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (1): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (2): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (3): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (4): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
      (5): DecoderLayer(
        (dec_self_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (dec_enc_attn): MultiHeadAttention(
          (W_Q): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_K): Linear(in_features=512, out_features=512, bias=True), 262,656 params
          (W_V): Linear(in_features=512, out_features=512, bias=True), 262,656 params
        ), 787,968 params
        (pos_ffn): PoswiseFeedForwardNet(
          (conv1): Conv1d(512, 2048, kernel_size=(1,), stride=(1,)), 1,050,624 params
          (conv2): Conv1d(2048, 512, kernel_size=(1,), stride=(1,)), 1,049,088 params
        ), 2,099,712 params
      ), 3,675,648 params
    ), 22,053,888 params
  ), 22,060,544 params
  (projection): Linear(in_features=512, out_features=7, bias=False), 3,584 params
), 39,396,352 params

```
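
The nested subtotals in this tree compose exactly; as a quick arithmetic check (ordinary Python, not library output):

```python
# How the hierarchical subtotals above roll up.
attn = 3 * 262_656               # W_Q + W_K + W_V            ->    787,968
ffn = 1_050_624 + 1_049_088      # conv1 + conv2              ->  2,099,712
enc_layer = attn + ffn           # one EncoderLayer           ->  2,887,680
dec_layer = 2 * attn + ffn       # self-attn + enc-attn + ffn ->  3,675,648

encoder = 6 * enc_layer + 3_072 + 3_072  # + src_emb, pos_emb -> 17,332,224
decoder = 6 * dec_layer + 3_584 + 3_072  # + tgt_emb, pos_emb -> 22,060,544
total = encoder + decoder + 3_584        # + final projection -> 39,396,352

assert (attn, ffn, enc_layer, dec_layer) == (787_968, 2_099_712, 2_887_680, 3_675_648)
assert (encoder, decoder, total) == (17_332_224, 22_060_544, 39_396_352)
```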



## Reference

- https://github.com/pytorch/pytorch/issues/2001
- https://gist.github.com/HTLife/b6640af9d6e7d765411f8aa9aa94b837
- Inspired by https://github.com/sksq96/pytorch-summary
