FLOPpy: A hardware-agnostic Python library to monitor the computational cost of Machine and Deep Learning algorithms
FLOPpy is a versatile Python library designed to monitor and estimate the algorithmic workload of both Deep Learning (PyTorch) and Machine Learning (Scikit-learn) models.
By systematically tracking Floating Point Operations (FLOPs) and Bit Operations (BOPs), it provides a hardware-independent assessment of total computational demand, spanning from standard forward and backward passes to optimizer updates and loss evaluations.
🚀 Key Features
- Hardware-Agnostic Monitoring: Provides a standardized measure of computational demand that does not depend on specific hardware characteristics or infrastructure;
- Cross-Framework Support: Seamlessly profile models from torch (including Hugging Face models) and scikit-learn using a unified API;
- Modular Architecture: Designed with a provider pattern and structural decoupling, allowing easy extension to other backends;
- Full Pipeline Tracking: Go beyond simple inference, monitor the cost of training (Backward pass), Loss computation, Optimizer steps, and even pre-processing operations like tokenization;
- Transparent Integration: Zero-boilerplate integration via a non-intrusive, hook-based architecture and safe monkey-patching;
- The "Escape Hatch": Native support for tracking quantized layers (e.g., 4-bit, 8-bit) and fused/custom optimizers (BitsAndBytes, Apex, DeepSpeed) that typically bypass standard profilers;
- Reproducibility: Unlike execution time or energy metrics, FLOPs and BOPs reflect the intrinsic complexity of an algorithm, ensuring consistent results across different systems;
- Real-time Integration: Supports seamless synchronization with Weights & Biases (WandB) for real-time visualization.
📊 Why FLOPpy?
In an era of large-scale models and specialized hardware, execution time is no longer a sufficient metric for efficiency. FLOPpy allows researchers and developers to:
- Compare the efficiency of different architectures regardless of the GPU/CPU used;
- Quantify the real computational savings of quantization (FP16 vs INT8 vs INT4);
- Identify bottlenecks in the training loop, including the often-overlooked optimizer overhead.
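To make the quantization comparison concrete, here is a minimal sketch of how Bit-Operations relate FLOPs to precision. It uses a common convention (BOPs per layer = multiply-accumulates x weight bit-width x activation bit-width); the exact formula FLOPpy applies internally may differ, and the `bops` helper below is purely illustrative:

```python
def bops(macs: int, weight_bits: int, act_bits: int) -> int:
    """Estimate Bit-Operations from a layer's multiply-accumulate count."""
    return macs * weight_bits * act_bits

macs = 1_000_000                # e.g. a 1000x1000 matrix-vector product
fp16 = bops(macs, 16, 16)       # 256e6 BOPs
int8 = bops(macs, 8, 8)         # 64e6 BOPs
int4 = bops(macs, 4, 4)         # 16e6 BOPs

# The FLOP count is identical in all three cases; only BOPs expose the
# 4x (INT8) and 16x (INT4) savings relative to FP16.
print(fp16 // int8, fp16 // int4)
```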
📦 Installation
```shell
pip install floppy-tracker
```
Dependencies
The library requires the following environment and tools:
- Python: Core language.
- NumPy: Used for multidimensional array manipulation and analytical complexity formulas.
- Scikit-Learn: Supported for monitoring classical machine learning algorithms.
- PyTorch: Supported for deep learning tracking via high-level hooks and low-level ATen dispatching.
- psutil: Essential for capturing detailed hardware snapshots, including CPU cores, RAM, and system usage.
- Wandb: Used for real-time visualization and remote experiment tracking.
📖 Usage
Integration is transparent and does not require modifications to the model implementation.
PyTorch / Hugging Face Example
```python
import torch
import torch.nn as nn
from floppy import FLOPpyTracker, WandbConfiguration
from transformers import AutoModel

# Optional: configuration for real-time WandB synchronization
wandb_config = WandbConfiguration(
    project_name="your_experiment",
    group_name="your_group",
    reporter_key="your_wandb_key_here"
)

# 1. Define your model, loss and optimizer
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU())
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())
num_epochs = 10

# 2. Initialize the tracker
tracker = FLOPpyTracker(run_name="pytorch_experiment")

# 3. Run monitoring
tracker.run(model=model, optimizer=optimizer, loss_fn=loss_fn)

# 4. Train as usual; your_data_loader is any iterable of (inputs, targets)
for _ in range(num_epochs):
    for xb, yb in your_data_loader:
        optimizer.zero_grad()
        y_hat = model(xb)
        loss = loss_fn(y_hat, yb)
        loss.backward()
        optimizer.step()
        tracker.batch()
    tracker.epoch()

# 5. Access the report
report = tracker.report()
print(report)
```
Scikit-learn Example
```python
from sklearn.ensemble import RandomForestClassifier
from floppy import FLOPpyTracker

# 1. Define your model
model = RandomForestClassifier(n_estimators=100)

# 2. Initialize the tracker
tracker = FLOPpyTracker(run_name="sklearn_test")

# 3. Run monitoring
tracker.run(model=model)

# 4. Use the model as usual (X_train, y_train, X_test: your dataset splits)
model.fit(X_train, y_train)
preds = model.predict(X_test)

# 5. Access the report
report = tracker.report(print_summary=True)
```
🔬 Methodology
🛠️ Computational Strategy & Backends
FLOPpy employs high-precision, transparent strategies across different frameworks to ensure maximum accuracy without requiring any changes to the user's original code.
PyTorch: Unified Dispatch & Patching
The library avoids the overhead and limitations of traditional per-module hooks by operating directly at the functional and tensor level:
- Root Hooks & Low-Level Dispatching: Instead of attaching hooks to every single sub-module, FLOPpy attaches a single boundary hook to the root model. Inside this forward pass, it deploys a TorchDispatchMode via the UniversalFlopCounter to intercept the underlying C++ ATen dispatch calls in real time. This captures all mathematical operations, including those occurring outside standard nn.Module objects, such as residual skip connections and element-wise tensor manipulations;
- Transparent Backward Tracking: Implements safe monkey-patching of torch.Tensor.backward. This encapsulates the entire Autograd graph execution within a tracking context, overcoming the well-known architectural limitations of standard PyTorch backward hooks on container modules (e.g., nn.Sequential);
- Optimizer & Loss Hooks: Utilizes targeted TorchTrainingHooks to intercept optimizer.step() calls and loss function evaluations. It features a specialized "Escape Hatch" fallback logic to accurately estimate the workload of fused or quantized optimizers (e.g., BitsAndBytes, Apex, DeepSpeed) that execute custom C++/CUDA kernels and bypass the standard PyTorch dispatcher.
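To illustrate the mechanism the UniversalFlopCounter builds on, here is a minimal sketch (not FLOPpy's actual implementation) of counting ATen dispatch calls with PyTorch's TorchDispatchMode. Note how the element-wise add outside the nn.Module is captured as well:

```python
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class OpCounter(TorchDispatchMode):
    """Toy counter: tallies every ATen call made while the mode is active."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        # func is the low-level ATen overload, e.g. aten.addmm.default
        name = str(func)
        self.counts[name] = self.counts.get(name, 0) + 1
        return func(*args, **(kwargs or {}))

model = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)
with OpCounter() as counter:
    y = model(x) + x  # the bare tensor add is intercepted too
print(counter.counts)
```

A real counter would map each intercepted ATen overload to an analytical FLOP cost derived from its argument shapes, rather than just tallying calls.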
Scikit-Learn: Dynamic API Wrapping
The SklearnBackend implements a non-intrusive method-wrapping strategy to seamlessly support classical Machine Learning workflows:
- Method Interception: Automatically wraps standard API methods (fit(), predict(), and transform()) to extract input and output array dimensions at runtime;
- Semantic Mapping: Intelligently maps execution phases to ensure report consistency across both Deep Learning and Machine Learning frameworks: fit() operations are reported as Model (Backward) to represent the training and weight-update phase, while predict() and transform() operations are reported as Model (Forward) to represent the inference phase;
- Algorithmic Complexity: Applies targeted mathematical complexity formulas (e.g., $O(n_{trees} \cdot n_{samples} \cdot \log_2(n_{samples}))$ for Random Forests) based on array shapes and data types to provide accurate, hardware-independent FLOPs and BOPs estimates.
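The analytical-estimation idea can be sketched as follows. The function name, the per-feature scaling, and the constant factor are illustrative assumptions, not FLOPpy's actual formula; only the $O(n_{trees} \cdot n_{samples} \cdot \log_2(n_{samples}))$ form comes from the complexity cited above:

```python
import math

def random_forest_fit_flops(n_samples: int, n_features: int, n_trees: int) -> float:
    """Hardware-independent workload estimate derived from array shapes alone."""
    # O(n_trees * n_samples * log2(n_samples)) per the cited complexity,
    # scaled (as an assumption) by the number of features inspected per split.
    return n_trees * n_features * n_samples * math.log2(n_samples)

est = random_forest_fit_flops(n_samples=1024, n_features=20, n_trees=100)
print(f"{est:.3e} FLOPs")  # -> 2.048e+07 FLOPs
```

Because the estimate depends only on input shapes, the same training run yields the same number on a laptop CPU and a datacenter GPU.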
📊 Detailed Reporting
The FLOPpyReport object provides a detailed, phase-aware breakdown of the computational workload:
- model_forward_flops & model_forward_bops: The algorithmic cost and precision-aware hardware effort (Bit-Operations) of the forward pass. In Scikit-learn workflows, this maps to inference methods like predict() and transform();
- model_backward_flops & model_backward_bops: The computational workload required for the training phase. This captures the Autograd gradient calculation in Deep Learning, or the fit() method in classical Machine Learning;
- loss_forward_flops & loss_forward_bops: The operations and actual hardware effort explicitly tied to evaluating the loss function;
- optimizer_flops & optimizer_bops: The computational overhead of the optimization step (e.g., weight updates, momentum). It accounts for the specific bit-width used, accurately tracking even fused or quantized optimizers (e.g., 8-bit Adam) via the built-in Escape Hatch;
- preproc_ops: Workload from input preparation, such as tokenizer operations for Large Language Models;
- System Environment: A detailed snapshot of the execution context, including CPU/GPU specifications, RAM, OS, and active library versions (e.g., PyTorch, Scikit-learn).
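The phase-level fields combine naturally into a per-step training cost. This sketch uses a plain dict standing in for the report fields (the actual FLOPpyReport access pattern may differ), with made-up magnitudes:

```python
# Illustrative values only; field names match the breakdown above.
report = {
    "model_forward_flops": 2.0e9,
    "model_backward_flops": 4.0e9,   # backward is typically ~2x forward
    "loss_forward_flops": 1.0e6,
    "optimizer_flops": 5.0e7,
}

training_step_flops = sum(report.values())
print(f"total per step: {training_step_flops:.3e} FLOPs")
```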
✍️ Authors & Citation
Francesco Scala, Francesco Mandarino, Liliana Martirano, and Luigi Pontieri. Institute of High Performance Computing and Networking (ICAR-CNR) & University of Calabria, Italy.
If you use FLOPpy in your research, please cite:
Coming soon...
📄 License
This software is licensed under the GNU General Public License v3.0 (GPLv3).