torch benchmarking tool

Project description

PyTorch Model Benchmarking Tool

This tool provides a comprehensive set of utilities for benchmarking PyTorch models, including performance metrics, memory usage, and model statistics.

Features

  • Measure inference latency on both CPU and GPU
  • Track GPU memory usage
  • Calculate model size and number of parameters
  • Compute MACs (Multiply-Accumulate operations)
  • Calculate model sparsity
  • Generate visualizations of parameter distributions and weight distributions
  • Provide formatted output of benchmark results

Installation

Ensure you have PyTorch and the following dependencies installed:

pip install torch pynvml matplotlib numpy colorama torchprofile

Example

import torch 
from torchvision.models import resnet50, ResNet50_Weights
from torch_benchmark import benchmark

# Load model and example input
model = resnet50(weights=ResNet50_Weights.DEFAULT)
example_input = torch.randn(1, 3, 224, 224)

# Run benchmark 
results = benchmark(model, example_input)

You can run example.py to see the output in your terminal and experiment with the different functions.

Advanced Usage

Tracking GPU memory for a PyTorch model

from torch_benchmark import track_gpu_memory

with track_gpu_memory():
    # Your GPU operations here
    pass

max_memory = track_gpu_memory.max_memory
current_memory = track_gpu_memory.current_memory
print(f"Max GPU memory used: {max_memory:.2f} MB")
print(f"Current GPU memory used: {current_memory:.2f} MB")

Getting info about GPU memory

from torch_benchmark import detailed_memory_info

detailed_memory_info()

Calculating model sparsity

from torch_benchmark import get_model_sparsity, get_layer_sparsity

sparsity = get_model_sparsity(model)
print(f"Model sparsity: {sparsity:.2f}")

get_layer_sparsity(model)

Visualizations

When plot=True is set in the benchmark function, two plots will be generated:

  1. num_parameters_distribution.png: Bar chart showing the number of parameters in each layer.
  2. weight_distribution.png: Histograms of weight distributions for each layer.

These plots can provide insights into the model's architecture and weight patterns.
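
For example, rerunning the benchmark with plotting enabled should produce both files (reusing model and example_input from the example above; the output location is assumed to be the current working directory):

from torch_benchmark import benchmark

# plot=True generates num_parameters_distribution.png and weight_distribution.png
results = benchmark(model, example_input, plot=True)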

Notes

  • Ensure you have a CUDA-capable GPU for GPU benchmarking.
  • The tool uses CUDA events for precise GPU timing (a minimal sketch of this pattern is shown after this list).
  • Memory usage is tracked using PyNVML.
  • MACs calculation requires the torchprofile package.

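For reference, the snippet below is a minimal sketch of the CUDA-event timing pattern in plain PyTorch, reusing model and example_input from the example above; it is illustrative only and not the tool's actual implementation:

import torch

# CUDA events are recorded around the forward pass; synchronize before reading
# the elapsed time, which is reported in milliseconds.
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

model = model.cuda().eval()
gpu_input = example_input.cuda()

with torch.no_grad():
    start.record()
    model(gpu_input)
    end.record()

torch.cuda.synchronize()
print(f"Single-pass GPU latency: {start.elapsed_time(end):.2f} ms")
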
Contributing

This project started as a personal tool to simplify benchmarking models on edge AI hardware. It's designed to be a lightweight, easy-to-use solution that can be installed and put to use quickly.

While this is primarily a personal project, I'm open to suggestions and improvements. If you have ideas or find any issues, feel free to:

  1. Open an issue on the GitHub repository to report bugs or suggest enhancements.
  2. Submit pull requests for minor fixes or improvements.

If you find this tool helpful, feel free to star the repository or share it with others who might benefit from it. Thanks for your interest!

API Reference

Main Function

benchmark(model, dummy_input, n_warmup=50, n_test=200, plot=False)

Runs a comprehensive benchmark on the given model.

  • Parameters:
    • model: PyTorch model to benchmark
    • dummy_input: A tensor matching the input shape expected by the model
    • n_warmup: Number of warm-up iterations (default: 50)
    • n_test: Number of test iterations (default: 200)
    • plot: If True, generates plots for parameter and weight distributions (default: False)
  • Returns: A dictionary containing benchmark results
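
The exact keys of the results dictionary are not listed here, so the simplest way to see what was measured is to iterate over it (illustrative):

# Assumes `benchmark`, `model`, and `example_input` from the Example section above
results = benchmark(model, example_input, n_warmup=10, n_test=100)
for name, value in results.items():
    print(name, value)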

Utility Functions

measure_latency_cpu(model, dummy_input, n_warmup=50, n_test=200)

Measures the inference time of the model on CPU.

  • Parameters: Same as benchmark
  • Returns: mean_syn (in ms), std_syn (in ms), fps
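
Example usage (the variable names below are just illustrative labels for the documented return values):

from torch_benchmark import measure_latency_cpu

mean_ms, std_ms, fps = measure_latency_cpu(model, example_input)
print(f"CPU latency: {mean_ms:.2f} ± {std_ms:.2f} ms ({fps:.1f} FPS)")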

measure_latency_gpu(model, dummy_input, n_warmup=50, n_test=200)

Measures the inference time of the model on GPU.

  • Parameters: Same as benchmark
  • Returns: mean_syn (in ms), std_syn (in ms), fps
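
The GPU variant is called the same way; guarding on torch.cuda.is_available() avoids errors on CPU-only machines. Moving the model and input to the GPU first is a precaution here, since it is not documented whether the function does this itself:

import torch
from torch_benchmark import measure_latency_gpu

if torch.cuda.is_available():
    # Moving to the GPU beforehand is a precaution; the function may handle it itself
    mean_ms, std_ms, fps = measure_latency_gpu(model.cuda(), example_input.cuda())
    print(f"GPU latency: {mean_ms:.2f} ± {std_ms:.2f} ms ({fps:.1f} FPS)")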

get_model_macs(model, inputs) -> int

Returns the number of multiply-accumulate operations for the given model and inputs.

  • Parameters:
    • model: PyTorch model
    • inputs: Input tensor
  • Returns: Number of MACs
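
For a single 224x224 image through the ResNet-50 from the example above, the result should be on the order of a few billion MACs:

from torch_benchmark import get_model_macs

macs = get_model_macs(model, example_input)
print(f"MACs: {macs:,} ({macs / 1e9:.2f} GMACs)")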

get_sparsity(tensor: torch.Tensor) -> float

Calculates the sparsity of the given tensor.

  • Parameters: tensor: PyTorch tensor
  • Returns: Sparsity value (float)
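
A quick check with a hand-built tensor, assuming sparsity is reported as the fraction of zero-valued elements:

import torch
from torch_benchmark import get_sparsity

t = torch.tensor([0.0, 0.0, 0.0, 1.0])  # three of four elements are zero
print(f"Tensor sparsity: {get_sparsity(t):.2f}")  # expected 0.75 under that assumption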

get_layer_sparsity(model: nn.Module)

Prints the sparsity for each layer in the model.

  • Parameters: model: PyTorch model

get_model_sparsity(model: nn.Module) -> float

Calculates the overall sparsity of the given model.

  • Parameters: model: PyTorch model
  • Returns: Model sparsity (float)

get_num_parameters(model: nn.Module, count_nonzero_only=False) -> int

Calculates the total number of parameters in the model.

  • Parameters:
    • model: PyTorch model
    • count_nonzero_only: If True, only counts non-zero parameters (default: False)
  • Returns: Number of parameters
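
Comparing the total and non-zero counts is a quick way to quantify pruning, for example:

from torch_benchmark import get_num_parameters

total = get_num_parameters(model)
nonzero = get_num_parameters(model, count_nonzero_only=True)
print(f"Parameters: {total:,} total, {nonzero:,} non-zero")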

get_model_size(model: nn.Module, data_width=32, count_nonzero_only=False) -> int

Calculates the model size in bits.

  • Parameters:
    • model: PyTorch model
    • data_width: Number of bits per element (default: 32)
    • count_nonzero_only: If True, only counts non-zero parameters (default: False)
  • Returns: Model size in bits
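
Since the value is in bits, converting to megabytes means dividing by 8 and then by 1024**2; for the float32 ResNet-50 above this should come out near 100 MB:

from torch_benchmark import get_model_size

size_bits = get_model_size(model)     # 32 bits per parameter by default
size_mb = size_bits / 8 / 1024 ** 2   # bits -> bytes -> MiB
print(f"Model size: {size_mb:.1f} MB")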

plot_num_parameters_distribution(model)

Plots the distribution of the number of parameters per layer.

  • Parameters: model: PyTorch model

plot_weight_distribution(model, bins=256, count_nonzero_only=False)

Plots the distribution of the weights for each layer.

  • Parameters:
    • model: PyTorch model
    • bins: Number of histogram bins (default: 256)
    • count_nonzero_only: If True, only plots non-zero weights (default: False)
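
The plots can also be generated directly without running the full benchmark; the filenames listed in the Visualizations section are assumed to apply here as well:

from torch_benchmark import plot_num_parameters_distribution, plot_weight_distribution

plot_num_parameters_distribution(model)
plot_weight_distribution(model, bins=128)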

Context Managers

track_gpu_memory()

Context manager to track GPU memory usage during inference.

  • Usage:
    with track_gpu_memory():
        # Your GPU operations here
    max_memory = track_gpu_memory.max_memory
    current_memory = track_gpu_memory.current_memory
    

Download files

Download the file for your platform.

Source Distribution

pytorch_bench-0.1.tar.gz (6.0 kB)

Uploaded Source

Built Distribution

pytorch_bench-0.1-py3-none-any.whl (6.3 kB)

Uploaded Python 3

File details

Details for the file pytorch_bench-0.1.tar.gz.

File metadata

  • Download URL: pytorch_bench-0.1.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for pytorch_bench-0.1.tar.gz

  • SHA256: 5f2c72202dc8af175b43a6b150c98ce67e09d4cebb28c901f98a15e1af3b8601
  • MD5: 3dd8181edac3e025debc102f8a6bd56c
  • BLAKE2b-256: 6907bc1006ca25b7c98d14b34a44e3eafaa3f45188763accf8394592ea08e175

File details

Details for the file pytorch_bench-0.1-py3-none-any.whl.

File metadata

  • Download URL: pytorch_bench-0.1-py3-none-any.whl
  • Upload date:
  • Size: 6.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for pytorch_bench-0.1-py3-none-any.whl

  • SHA256: 6d3c840c1e7215ab892e88366de23ea207913ad7fa0129e8499f209c618c55d1
  • MD5: 7bbadf49ce60bbeec8954a11f9c36312
  • BLAKE2b-256: 6722b4bf0eba18e17fd1f9ec4b34f7f1b4113ef7cf0832902541acf1bdc3d4f8
