A benchmark for Generalized Windowed Operations in neural networks.
GWO Benchmark: The Architect's Arena
Is your neural network 'smart' or just big? This benchmark tells you the difference.
This Python package provides a framework for benchmarking neural network operations, inspired by the GWO (Generalized Windowed Operation) theory from the paper "Window is Everything: A Grammar for Neural Operations".
Instead of just measuring accuracy, this benchmark scores operations on their architectural efficiency. It quantifies the relationship between an operation's theoretical Operational Complexity (Ω_proxy) and its real-world performance, helping you design smarter, more efficient models.
Key Concepts in 1 Minute
The core idea is to break down any neural network operation (like Convolution or Self-Attention) into its fundamental building blocks and score its complexity.
- GWO (Generalized Windowed Operation): A "grammar" that describes any operation using three components:
  - Path (P): Where to look for information (e.g., a local sliding window).
  - Shape (S): What form of information to look for (e.g., a square patch).
  - Weight (W): What to value in that information (e.g., a learnable kernel).
- Operational Complexity (Ω_proxy): The "intelligence score" of your operation. A lower score for the same performance means a more efficient design. It's calculated as:
  Ω_proxy = C_D (Descriptive Complexity) + α * C_P (Parametric Complexity)
  - C_D (Descriptive Complexity): How many basic "primitives" does it take to describe your operation's structure? (You define this based on our guide.)
  - C_P (Parametric Complexity): How many extra parameters are needed to generate the operation's behavior dynamically (e.g., the offset prediction network in Deformable Convolution)? This is calculated automatically.
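As a rough worked example of the formula above, the sketch below computes Ω_proxy for a static 3x3 convolution and for a deformable-style variant. The value of α, the C_D assigned to the deformable variant, the helper names, and the choice to treat C_P as a raw parameter count are all assumptions made for illustration; the package's own defaults and primitive guide take precedence.

```python
def conv2d_param_count(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights plus biases of a standard 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def omega_proxy(c_d: float, c_p: float, alpha: float) -> float:
    """Omega_proxy = C_D + alpha * C_P, following the definition in this README."""
    return c_d + alpha * c_p


ALPHA = 0.001  # hypothetical weighting; not taken from the package

# Static convolution from the Quick Start: C_D = 3 and no dynamic parameters.
static_score = omega_proxy(c_d=3, c_p=0, alpha=ALPHA)

# Deformable-style variant: an offset-prediction conv emits 2 offsets (x, y)
# per kernel tap, i.e. 2 * 3 * 3 = 18 output channels.
offset_params = conv2d_param_count(in_channels=16, out_channels=18, kernel_size=3)
deformable_score = omega_proxy(c_d=5, c_p=offset_params, alpha=ALPHA)  # C_D = 5 is a placeholder

print(static_score, offset_params, deformable_score)
```

Under these assumed numbers the dynamic variant only counts as the more efficient design if its performance gain outweighs the higher score.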
Installation
pip install gwo-benchmark
Or for development from this repository:
git clone https://github.com/Kim-Ai-gpu/gwo-benchmark.git
cd gwo-benchmark
pip install -e .
Quick Start in 3 Steps
Let's benchmark a simple custom CNN on CIFAR-10.
Step 1: Define your model inheriting from GWOModule
Create your model file my_models.py:
# my_models.py
import torch.nn as nn
from gwo_benchmark import GWOModule


class MySimpleConv(GWOModule):
    # PRIMITIVES: STATIC_SLIDING(1) + DENSE_SQUARE(1) + SHARED_KERNEL(1)
    # Based on the official primitive guide, the complexity is 3.
    C_D = 3

    def __init__(self, in_channels=3, out_channels=16):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

    # This model has no dynamic components, so C_P is zero.
    # We can omit get_parametric_complexity_modules() as it defaults to [].
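For contrast, an operation with a dynamic component would report the modules that generate that behavior. The following is a minimal sketch, assuming `get_parametric_complexity_modules()` returns a plain list of `nn.Module`s as described in this README. The class name `MyGatedConv`, the gating design, and the `C_D = 4` value are hypothetical placeholders, not entries from the official primitive guide. The fallback import exists only so the sketch runs without the package installed.

```python
import torch
import torch.nn as nn

try:
    from gwo_benchmark import GWOModule
except ImportError:  # stand-in base class so the sketch runs on its own
    GWOModule = nn.Module


class MyGatedConv(GWOModule):
    # Placeholder value: derive the real C_D from the primitive guide.
    C_D = 4

    def __init__(self, in_channels=3, out_channels=16):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Dynamic component: predicts a per-position gate from the input itself.
        self.gate_predictor = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x):
        gate = torch.sigmoid(self.gate_predictor(x))
        return torch.relu(self.conv(x) * gate)

    def get_parametric_complexity_modules(self):
        # Only the modules that generate dynamic behavior are listed here.
        return [self.gate_predictor]


model = MyGatedConv()
out = model(torch.randn(2, 3, 8, 8))
extra_params = sum(
    p.numel()
    for module in model.get_parametric_complexity_modules()
    for p in module.parameters()
)
print(out.shape, extra_params)
```

Here `extra_params` is simply the parameter count of the listed modules; how the package turns those modules into C_P may differ.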
Step 2: Create your benchmark script
Create your main script run_benchmark.py:
# run_benchmark.py
from gwo_benchmark import run, Evaluator
from my_models import MySimpleConv

# 1. Instantiate your model
model = MySimpleConv()

# 2. Configure the evaluation environment
# The standard Evaluator handles training and testing for you.
evaluator = Evaluator(
    dataset_name="cifar10",
    train_config={"epochs": 2, "batch_size": 64},
)

# 3. Run the benchmark!
if __name__ == "__main__":
    result = run(model, evaluator, result_dir="benchmark_results")
    print(result)
Step 3: Run from your terminal
python run_benchmark.py
You'll see a detailed analysis of your model's complexity and performance, saved in the benchmark_results directory.
How It Works
The framework is designed for flexibility and extension.
- GWOModule (gwo_benchmark.base.GWOModule): The heart of your submission. You must inherit from this abstract class and provide:
  - C_D (property): Your calculation of the Descriptive Complexity.
  - get_parametric_complexity_modules() (method): Returns a list of nn.Modules that contribute to C_P. It defaults to [], so operations without dynamic components can omit it.
- Evaluator (gwo_benchmark.evaluator.BaseEvaluator): This class encapsulates all evaluation logic (training, testing, performance measurement).
  - Use the built-in Evaluator for standard datasets like CIFAR-10.
  - Create your own custom evaluation loop by inheriting from BaseEvaluator for specialized tasks.
- Datasets (gwo_benchmark.datasets): Easily add support for new datasets by inheriting from BaseDataset and registering your class. See the datasets directory for examples.
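This README does not show the registration call itself, so the following is only a generic sketch of the inherit-and-register pattern described for datasets. `BaseDatasetSketch`, `register_dataset`, `DATASET_REGISTRY`, and the `load()` method are hypothetical stand-ins, not the package's actual API; the datasets directory in the repository is the authoritative reference.

```python
from abc import ABC, abstractmethod

DATASET_REGISTRY = {}  # hypothetical mapping from lookup name to dataset class


def register_dataset(name):
    """Class decorator that records a dataset class under a lookup name."""
    def decorator(cls):
        if name in DATASET_REGISTRY:
            raise ValueError(f"dataset {name!r} is already registered")
        DATASET_REGISTRY[name] = cls
        return cls
    return decorator


class BaseDatasetSketch(ABC):
    """Stand-in for a dataset base class; subclasses must implement load()."""

    @abstractmethod
    def load(self):
        """Return a (train_samples, test_samples) pair."""


@register_dataset("toy")
class ToyDataset(BaseDatasetSketch):
    def load(self):
        # Ten (feature, label) pairs split 8/2 into train and test.
        samples = [(i, i % 2) for i in range(10)]
        return samples[:8], samples[8:]


# Look the class up by name, as an evaluator might do with a dataset name string.
dataset = DATASET_REGISTRY["toy"]()
train, test = dataset.load()
print(len(train), len(test))
```

A registry of this kind is one way a string such as dataset_name="cifar10" could be resolved to a class, though the package may implement the lookup differently.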
Contributing
We welcome contributions! This project is in its early stages, and we believe it can grow into a standard tool for the deep learning community.
- Add New GWO Models: Implement novel or existing operations (like Transformers, Attention variants, MLPs) as GWOModules in the examples directory.
- Support More Datasets: Help us expand the benchmark to new domains like NLP, Graphs, etc.
- Improve the Core Engine: Enhance the Evaluator, ComplexityCalculator, or add new analysis tools.
Please see our CONTRIBUTING.md for more details.
Running Tests
To ensure the integrity of the framework, please run tests before submitting a pull request.
python -m unittest discover tests
Citation
If you use this framework in your research, please consider citing the original paper:
@article{https://doi.org/10.5281/zenodo.17103133,
  doi = {10.5281/ZENODO.17103133},
  url = {https://zenodo.org/doi/10.5281/zenodo.17103133},
  author = {Kim, Youngseong},
  keywords = {Machine learning, Machine Learning, Supervised Machine Learning, Machine Learning/classification, Machine Learning/ethics, Machine Learning/standards, Unsupervised Machine Learning, Machine Learning/history, Machine Learning/trends, Machine Learning/economics, Supervised Machine Learning/standards, Unsupervised Machine Learning/classification},
  language = {en},
  title = {Window is Everything: A Grammar for Neural Operations},
  publisher = {Zenodo},
  year = {2025},
  copyright = {Creative Commons Attribution 4.0 International}
}