Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks
paGating: Parameterized Activation Gating Framework
Production-Ready Framework for Parameterized Activation Gating in Neural Networks
A comprehensive, open-source framework that unifies gated activation functions through a single parameterization scheme. Featured in our IEEE TNNLS submission: "paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI".
Key Results
Our framework demonstrates significant improvements across multiple domains:
| Domain | Metric | Improvement | Hardware |
|---|---|---|---|
| Language Modeling | WikiText-103 eval loss | 1.9% improvement | GPT-2 Small |
| Image Classification | CIFAR-10 accuracy | +1.9 percentage points | ResNet variants |
| Hardware Efficiency | Inference speed and memory | 3.11× speedup, 15% memory reduction | Apple M4 |
Features
- 7 Core Gating Units + Specialized Components: paGLU, paGTU, paSwishU, paReGLU, paGELU, paMishU, paSiLU, plus paUnit (generic template) and PaGRUCell
- Production Ready: ONNX and CoreML export pipelines for deployment
- Comprehensive Testing: 93% test coverage with continuous integration
- Benchmarking Tools: built-in performance analysis and visualization
- PyTorch Lightning: seamless integration with modern training workflows
- Cross-Platform: CPU, CUDA, and MPS (Apple Silicon) support
- Flexible Alpha: fixed, learnable, or scheduled parameter control
Project Structure
The project has been organized into the following structure:
```
paGating/
├── assets/                  # Static assets
│   ├── images/              # Image files
│   ├── figures/             # Paper figures
│   └── plots/               # Plot outputs from experiments
├── benchmark_results/       # Results from various benchmarks
│   ├── coreml/              # CoreML benchmark results
│   ├── regression/          # Regression task results
│   └── transformer/         # Transformer model results
├── coreml_models/           # Exported CoreML models
├── datamodules/             # PyTorch Lightning data modules
├── docs/                    # Documentation
│   ├── paper/               # Research paper and references
│   └── results_summary.md   # Summary of experiment results
├── experiments/             # Experiment configurations
├── lightning_modules/       # PyTorch Lightning modules
├── models/                  # Model implementations
├── onnx_models/             # Exported ONNX models
├── paGating/                # Core package
│   ├── __init__.py          # Package exports
│   ├── base.py              # Base classes
│   ├── paGLU.py             # Gated Linear Unit implementation
│   ├── paGTU.py             # Gated Tanh Unit implementation
│   ├── paSwishU.py          # Swish Unit implementation
│   ├── paReGLU.py           # ReLU Gated Linear Unit implementation
│   ├── paGELU.py            # GELU Gated Unit implementation
│   ├── paMishU.py           # Mish Unit implementation
│   ├── paSiLU.py            # SiLU/Swish gating implementation
│   ├── paUnit.py            # Generic gating unit template
│   └── paGRU.py             # Parameterized GRU cell
├── scripts/                 # Utility scripts
│   ├── benchmark/           # Benchmarking scripts
│   └── utilities/           # Utility scripts
├── src/                     # Source code (application-specific)
├── tests/                   # Test suite
├── requirements.txt         # Project dependencies
└── README.md                # This file
```
Implemented Gating Units
| Unit | Description | Formula |
|---|---|---|
| paGLU | Parameterized Gated Linear Unit | x * (α * sigmoid(x) + (1-α)) |
| paGTU | Parameterized Gated Tanh Unit | x * (α * tanh(x) + (1-α)) |
| paSwishU | Parameterized Swish Unit | x * (α * sigmoid(x) + (1-α) * x) |
| paReGLU | Parameterized ReLU Gated Linear Unit | x * (α * ReLU(x) + (1-α)) |
| paGELU | Parameterized Gated GELU | x * (α * GELU(x) + (1-α)) |
| paMishU | Parameterized Mish Unit | x * (α * mish(x) + (1-α)) |
| paSiLU | Parameterized SiLU/Swish gating | x * (α * SiLU(x) + (1-α) * x) |
| paUnit | Generic template for custom units | x * (α * custom_fn(x) + (1-α)) |
| PaGRUCell | Parameterized GRU cell | Specialized recurrent architecture |
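Most units share the pattern x * (α * act(x) + (1-α)): at α = 1 the unit recovers the standard gated activation, and at α = 0 the gate reduces to the identity. A minimal standalone sketch of this interpolation in plain PyTorch (for illustration only; the packaged units also apply a linear projection before gating):

```python
import torch

def pa_gate(x: torch.Tensor, act, alpha: float) -> torch.Tensor:
    """Shared paGating form: x * (alpha * act(x) + (1 - alpha))."""
    return x * (alpha * act(x) + (1.0 - alpha))

x = torch.randn(4)
print(pa_gate(x, torch.sigmoid, 0.0))  # alpha = 0: identity gate, equals x
print(pa_gate(x, torch.sigmoid, 1.0))  # alpha = 1: standard GLU gate, x * sigmoid(x)
print(pa_gate(x, torch.sigmoid, 0.5))  # alpha = 0.5: halfway interpolation
```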
Installation
Clone the repository:

```bash
git clone https://github.com/guglxni/paGating.git
cd paGating
```

Install the requirements:

```bash
pip install -r requirements.txt
```

Set up data directories and download the datasets:

```bash
python scripts/download_data.py
```

Note: This repository uses symlinks for large data files. See `docs/DATA_SETUP.md` for detailed setup instructions.
Quick Start
Using a paGating unit in your model
```python
import torch
from paGating import paGLU

# Create a layer with fixed alpha
gating_layer = paGLU(input_dim=512, output_dim=512, alpha=0.5)

# Or with learnable alpha
learnable_gating_layer = paGLU(input_dim=512, output_dim=512, learnable_alpha=True)

# Use in a model
x = torch.randn(32, 512)  # (batch_size, input_dim)
output = gating_layer(x)  # shape: (32, 512)
```
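With `learnable_alpha=True`, α is trained jointly with the rest of the network. A quick sanity-check sketch, assuming the package registers α as a module parameter whose name contains "alpha" (the exact attribute name is an assumption about the package internals):

```python
# Hypothetical sanity check: verify that alpha receives gradients.
loss = learnable_gating_layer(x).sum()
loss.backward()
for name, param in learnable_gating_layer.named_parameters():
    if "alpha" in name:
        print(name, param.detach(), param.grad)
```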
Integration with PyTorch models
```python
import torch
import torch.nn as nn
from paGating import paGLU

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 512)
        self.gate = paGLU(512, 512, alpha=0.5)  # paGating unit
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.gate(x)
        x = self.fc2(x)
        return x
```
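The model can then be used like any other `nn.Module`:

```python
model = MyModel()
batch = torch.randn(64, 784)  # e.g. a batch of flattened 28x28 images
logits = model(batch)
print(logits.shape)  # torch.Size([64, 10])
```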
Experimenting with paGating
Running Benchmarks
The framework includes tools for benchmarking different gating units:
```bash
python scripts/benchmark/benchmark_gateflow.py
```
This generates plots comparing the performance of different units.
Running a Hyperparameter Sweep
To compare different units and alpha values:
```bash
python scripts/utilities/run_experiment_pipeline.py --experiment_name my_experiment --units paGLU paGTU paMishU --alpha_values 0.0 0.2 0.5 0.8 1.0
```
This will:
- Run a hyperparameter sweep over the selected units and alpha values
- Generate a leaderboard
- Create visualizations
- Run the analysis
Testing with Transformer Models
To test a gating unit in a transformer for sequence classification:
```bash
python experiments/test_transformer.py --unit paMishU --alpha 0.5 --epochs 20
```
Export to CoreML
You can export trained models to CoreML format for deployment on Apple devices:
```bash
python scripts/coreml_export.py --unit paGLU --alpha 0.5
```
Test the exported model:
```bash
python tests/test_coreml_model.py --unit paGLU --alpha 0.5
```
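On macOS, the exported model can also be loaded and queried directly with coremltools. A hedged sketch, assuming an output path of `coreml_models/paGLU_alpha0.5.mlpackage` and an input tensor named "input" (both are assumptions; check the export script's actual output and use `model.get_spec()` to inspect the real input names):

```python
import coremltools as ct
import numpy as np

# The path and input name below are illustrative assumptions,
# not documented outputs of the export script.
model = ct.models.MLModel("coreml_models/paGLU_alpha0.5.mlpackage")
x = np.random.randn(1, 512).astype(np.float32)
print(model.predict({"input": x}))
```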
Results Summary
For detailed results and comparisons of different gating units, see docs/results_summary.md.
Creating Your Own Gating Unit
To create a custom gating unit:
1. Create a new file in the `paGating` directory (e.g., `paGating/paMyCustomU.py`)
2. Extend the `paGatingBase` class
3. Implement the required methods
4. Update `__init__.py` to expose your new unit
Example:
```python
from .base import paGatingBase
import torch
import torch.nn as nn
import torch.nn.functional as F


class paMyCustomU(paGatingBase):
    """My custom parameterized activation gating unit."""

    def __init__(self, input_dim, output_dim, alpha=0.5,
                 learnable_alpha=False, alpha_init=None, bias=True):
        super().__init__(
            input_dim=input_dim,
            output_dim=output_dim,
            alpha=alpha,
            learnable_alpha=learnable_alpha,
            alpha_init=alpha_init,
            bias=bias,
        )

    def compute_gate_activation(self, x):
        # Implement your custom activation function here
        # (torch.tanh is just a stand-in example).
        return torch.tanh(x)

    def forward(self, x):
        # Standard implementation; can be customized if needed
        x = self.linear(x)
        gates = self.compute_gate_activation(x)
        return x * gates
```
Then update `__init__.py`:

```python
from .paMyCustomU import paMyCustomU

__all__ = [
    # ... existing units
    "paMyCustomU",
]
```
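The new unit can then be imported and used like any built-in unit (assuming the constructor signature shown above):

```python
import torch
from paGating import paMyCustomU

layer = paMyCustomU(input_dim=128, output_dim=128, alpha=0.5)
out = layer(torch.randn(8, 128))  # shape: (8, 128)
```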
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Commercial Use: For commercial applications, please contact the authors for licensing arrangements.
Research Paper
This framework is featured in our IEEE TNNLS submission:
"paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI"
- Authors: Aaryan Guglani, Dr. Rajashree Shettar
- Institution: RV College of Engineering, Bengaluru
- Status: Under review at IEEE Transactions on Neural Networks and Learning Systems
- Paper: Full text and references available in the `docs/paper/` directory
- Reproducibility: Complete reproduction guide available in `docs/REPRODUCIBILITY.md`
Documentation
- Reproducibility Guide: Step-by-step instructions to reproduce all paper results
- Contributing Guide: How to contribute to the project
- API Documentation: Detailed API reference and examples
Citation
If you use paGating in your research, please cite:
```bibtex
@article{guglani2025pagating,
  title={paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI},
  author={Guglani, Aaryan and Shettar, Rajashree},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2025},
  note={Under Review},
  url={https://github.com/guglxni/paGating}
}
```
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Interactive Dashboard
The project includes a Streamlit dashboard for visualizing experiment results:
```bash
# Install required packages if not already installed
pip install streamlit plotly pandas

# Run the dashboard with a specific results directory
streamlit run scripts/streamlit_dashboard.py -- --results_dir results/your_experiment_dir

# Or run the dashboard and select the results directory in the UI
streamlit run scripts/streamlit_dashboard.py
```
Dashboard features:
- Compare performance across different gating units
- Analyze the effect of different alpha values
- Explore the behavior of learnable alpha parameters
- View training curves and leaderboards
- Generate insights and recommendations
Download files
- Source Distribution: `pagating-0.1.0.tar.gz`
- Built Distribution: `pagating-0.1.0-py3-none-any.whl`
File details

Details for the file `pagating-0.1.0.tar.gz`.

File metadata
- Download URL: pagating-0.1.0.tar.gz
- Size: 6.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `0f0efee1040c8bc14641777852e1b6b6078bdac4da256cec6bff415b203842e7` |
| MD5 | `ffd835c9ede32b976ecb6e28487e18ce` |
| BLAKE2b-256 | `d81a6186a14081ffe47c3084b3314f2a4a9b4caef3f193cb562dd319fc2e32eb` |
File details

Details for the file `pagating-0.1.0-py3-none-any.whl`.

File metadata
- Download URL: pagating-0.1.0-py3-none-any.whl
- Size: 6.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5510bb64624d0c8191fe40364b0d219dc0784b09c45505f63797db86de3ba778` |
| MD5 | `930fc48656808d5e6ff2f94d940b0692` |
| BLAKE2b-256 | `fdcf681a69a796a2383765290e17219fe240df6aab65ddbb072dffc3068601af` |