
Project description

calculate-flops.pytorch

This tool (calflops) is designed to compute the theoretical amount of FLOPs (floating-point operations), MACs (multiply-accumulate operations), and parameters of a wide range of neural networks, such as Linear, CNN, RNN, GCN, and Transformer models (BERT, LLaMA, and other large language models), including any custom model that uses torch.nn.functional.*, as long as it is implemented in PyTorch.

In addition, the implementation of this package was inspired by the ptflops and deepspeed libraries, and I am very grateful for their excellent work. This package also improves on them in some respects (simpler usage, support for more models).

Install the latest version

From PyPI:

pip install calflops

You can also download the latest calflops-*-py3-none-any.whl file from https://pypi.org/project/calflops/ and install it:

pip install calflops-*-py3-none-any.whl

Example

from calflops import calculate_flops

# Deep Learning Model, such as alexnet.
from torchvision import models

model = models.alexnet()
batch_size = 1
flops, macs, params = calculate_flops(model=model, 
                                      input_shape=(batch_size, 3, 224, 224),
                                      output_as_string=True,
                                      output_precision=4)
print("alexnet FLOPs:%s   MACs:%s   Params:%s \n" %(flops, macs, params))
#alexnet FLOPs:1.4297 GFLOPS   MACs:714.188 MMACs   Params:61.1008 M 
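# Custom model that calls torch.nn.functional ops directly, as mentioned in the
# description above. This is a minimal sketch: the module name and layer sizes
# below are illustrative assumptions, not part of calflops.
import torch.nn as nn
import torch.nn.functional as F

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16 * 32 * 32, 10)

    def forward(self, x):
        x = F.relu(self.conv(x))   # functional op (torch.nn.functional.*)
        x = x.flatten(1)
        return self.fc(x)

flops, macs, params = calculate_flops(model=TinyNet(),
                                      input_shape=(1, 3, 32, 32),
                                      output_as_string=True,
                                      output_precision=4)
print("TinyNet FLOPs:%s   MACs:%s   Params:%s \n" %(flops, macs, params))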


# Transformers Model, such as bert.
from transformers import AutoModel
from transformers import AutoTokenizer
batch_size = 1
max_seq_length = 128
model_name = "hfl/chinese-roberta-wwm-ext/"
model_save = "../pretrain_models/" + model_name
model = AutoModel.from_pretrained(model_save)
tokenizer = AutoTokenizer.from_pretrained(model_save)
flops, macs, params = calculate_flops(model=model, 
                                      input_shape=(batch_size, max_seq_length),
                                      transformer_tokenizer=tokenizer)
print("bert(hfl/chinese-roberta-wwm-ext) FLOPs:%s   MACs:%s   Params:%s \n" %(flops, macs, params))
#bert(hfl/chinese-roberta-wwm-ext) FLOPs:22.36 GFLOPS   MACs:11.17 GMACs   Params:102.27 M 


# Large language model, such as llama2-7b.
from transformers import LlamaTokenizer
from transformers import LlamaForCausalLM
batch_size = 1
max_seq_length = 128
model_name = "llama2_hf_7B"
model_save = "../model/" + model_name
model = LlamaForCausalLM.from_pretrained(model_save)
tokenizer = LlamaTokenizer.from_pretrained(model_save)
flops, macs, params = calculate_flops(model=model,
                                      input_shape=(batch_size, max_seq_length),
                                      transformer_tokenizer=tokenizer)
print("llama2(7B) FLOPs:%s   MACs:%s   Params:%s \n" %(flops, macs, params))
#llama2(7B) FLOPs:1.7 TFLOPS   MACs:850.00 GMACs   Params:6.74 B 
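# Optional sketch: if the model's inference goes through model.generate(),
# calculate_flops can be run in generate mode via forward_mode="generate"
# (see the API below). This reuses the llama2 model/tokenizer loaded above;
# the prompt text is illustrative.
inputs = tokenizer("A short illustrative prompt", return_tensors="pt")
flops, macs, params = calculate_flops(model=model,
                                      kwargs=dict(inputs),
                                      forward_mode="generate",
                                      print_results=False)
print("llama2(7B) generate FLOPs:%s   MACs:%s   Params:%s \n" %(flops, macs, params))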

Calculated FLOPs of common models

Large language models

Input data format: batch_size=1, seq_len=128

fwd: model forward propagation; bwd: model backward propagation

| Model | Input Shape | Params(B) | Params(Total) | fwd FLOPs(G) | fwd MACs(G) | fwd + bwd FLOPs(G) | fwd + bwd MACs(G) |
| ----- | ----------- | --------- | ------------- | ------------ | ----------- | ------------------ | ----------------- |
| baichuan-7B | (1, 128) | 7B | 7000559616 | 1733.62 | 866.78 | 5200.85 | 2600.33 |
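The fwd + bwd columns add an assumed backward-pass cost on top of the forward pass: when include_backPropagation=True, the forward FLOPs are multiplied by compute_bp_factor (default 2.0) for the backward pass, so fwd + bwd is roughly 3x fwd. Below is a hedged sketch of how a row like the one above could be reproduced; the checkpoint name and trust_remote_code flag are assumptions about how the weights are loaded, not part of calflops.

from transformers import AutoModelForCausalLM, AutoTokenizer
from calflops import calculate_flops

# Assumed checkpoint; any causal LM loaded the usual way works the same.
model_path = "baichuan-inc/Baichuan-7B"
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Forward-only numbers (the "fwd" columns).
fwd_flops, fwd_macs, params = calculate_flops(model=model,
                                              input_shape=(1, 128),
                                              transformer_tokenizer=tokenizer,
                                              print_results=False)

# Forward + backward (the "fwd + bwd" columns): backward is assumed to cost
# compute_bp_factor times the forward pass.
total_flops, total_macs, _ = calculate_flops(model=model,
                                             input_shape=(1, 128),
                                             transformer_tokenizer=tokenizer,
                                             include_backPropagation=True,
                                             compute_bp_factor=2.0,
                                             print_results=False)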

calculate_flops API

def calculate_flops(model,
                    input_shape=None,
                    transformer_tokenizer=None,
                    args=[],   
                    kwargs={},
                    forward_mode="forward",
                    include_backPropagation=False,
                    compute_bp_factor=2.0,         
                    print_results=True,
                    print_detailed=True,
                    output_as_string=True,
                    output_precision=2,
                    output_unit=None,
                    ignore_modules=None):
    
    """Returns the total floating-point operations, MACs, and parameters of a model.

    Args:
        model (torch.nn.Module): The input model; it must be a PyTorch model.
        input_shape (tuple, optional): Input shape to the model. If args and kwargs are empty, the model takes a tensor with this shape as its only positional argument. Defaults to None.
        transformer_tokenizer (None, optional): The Transformers tokenizer; it must be specified if the model is a Transformers model and args and kwargs are empty. Defaults to None.
        args (list, optional): List of positional arguments to the model, e.g. BERT's positional arguments are [input_ids, token_type_ids, attention_mask]. Defaults to [].
        kwargs (dict, optional): Dictionary of keyword arguments to the model, e.g. BERT's keyword arguments are {'input_ids': ..., 'token_type_ids': ..., 'attention_mask': ...}. Defaults to {}.
        forward_mode (str, optional): The inference mode of the model. Defaults to 'forward'. Use 'generate' if the model's inference goes through model.generate().
        include_backPropagation (bool, optional): Whether the returned FLOPs include the computation of backpropagation. Defaults to False.
        compute_bp_factor (float, optional): The factor by which backpropagation is assumed to cost a multiple of the forward computation. Defaults to 2.0.
        print_results (bool, optional): Whether to print the model profile. Defaults to True.
        print_detailed (bool, optional): Whether to print the detailed model profile. Defaults to True.
        output_as_string (bool, optional): Whether to return the results as formatted strings. Defaults to True.
        output_precision (int, optional): Number of decimal places in the output if output_as_string is True. Defaults to 2.
        output_unit (str, optional): The unit used for the output values, such as T, G, M, or K. Defaults to None, i.e. the unit is chosen automatically based on the value.
        ignore_modules (list, optional): The list of modules to ignore during profiling. Defaults to None.
    """
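For instance, the model inputs can be passed directly through kwargs instead of input_shape plus a tokenizer. A minimal sketch, assuming model and tokenizer are an already-loaded Transformers model/tokenizer pair (as in the BERT example above); the sentence and padding length are illustrative:

from calflops import calculate_flops

text = "calflops computes FLOPs, MACs and parameters."   # illustrative input
inputs = tokenizer(text, return_tensors="pt", padding="max_length", max_length=128)
flops, macs, params = calculate_flops(model=model,
                                      kwargs=dict(inputs),
                                      print_results=False)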

Contact the Author

Author: MrYXJ

Mail: code.mryxj@gmail.com

Download files

Download the file for your platform.

Source Distribution

calflops-0.0.7.tar.gz (15.8 kB)

Uploaded Source

Built Distribution

calflops-0.0.7-py3-none-any.whl (18.8 kB)

Uploaded Python 3
