
paGating: Parameterized Activation Gating Framework


🚀 Production-Ready Framework for Parameterized Activation Gating in Neural Networks

A comprehensive, open-source framework that unifies gated activation functions through a single parameterization scheme. Featured in our IEEE TNNLS submission: "paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI".

🎯 Key Results

Our framework demonstrates significant improvements across multiple domains:

Domain                Metric                   Improvement                          Hardware
Language Modeling     WikiText-103 eval loss   1.9% improvement                     GPT-2 Small
Image Classification  CIFAR-10 accuracy        +1.9 percentage points               ResNet variants
Hardware Efficiency   Inference speed/memory   3.11× speedup, 15% memory reduction  Apple M4

🚀 Features

  • 🔬 7 Core Gating Units + Specialized Components: paGLU, paGTU, paSwishU, paReGLU, paGELU, paMishU, paSiLU, paUnit (template), PaGRUCell
  • ⚡ Production Ready: ONNX and CoreML export pipelines for deployment
  • 🧪 Comprehensive Testing: 93% test coverage with continuous integration
  • 📊 Benchmarking Tools: Built-in performance analysis and visualization
  • 🔄 PyTorch Lightning: Seamless integration with modern training workflows
  • 📱 Cross-Platform: CPU, CUDA, MPS (Apple Silicon) support
  • 🎛️ Flexible Alpha: Fixed, learnable, or scheduled parameter control

Project Structure

The project has been organized into the following structure:

paGating/
├── assets/                  # Static assets
│   └── images/              # Image files
│       ├── figures/         # Paper figures
│       └── plots/           # Plot outputs from experiments
├── benchmark_results/       # Results from various benchmarks
│   ├── coreml/              # CoreML benchmark results
│   ├── regression/          # Regression task results
│   └── transformer/         # Transformer model results
├── coreml_models/           # Exported CoreML models
├── datamodules/             # PyTorch Lightning data modules
├── docs/                    # Documentation
│   ├── paper/               # Research paper and references
│   └── results_summary.md   # Summary of experiment results
├── experiments/             # Experiment configurations
├── lightning_modules/       # PyTorch Lightning modules
├── models/                  # Model implementations
├── onnx_models/             # Exported ONNX models
├── paGating/                # Core package
│   ├── __init__.py          # Package exports
│   ├── base.py              # Base classes
│   ├── paGLU.py             # Gated Linear Unit implementation
│   ├── paGTU.py             # Gated Tanh Unit implementation
│   ├── paSwishU.py          # Swish Unit implementation
│   ├── paReGLU.py           # ReLU Gated Linear Unit implementation
│   ├── paGELU.py            # GELU Gated Unit implementation
│   ├── paMishU.py           # Mish Unit implementation
│   ├── paSiLU.py            # SiLU/Swish gating implementation
│   ├── paUnit.py            # Generic gating unit template
│   └── paGRU.py             # Parameterized GRU cell
├── scripts/                 # Utility scripts
│   ├── benchmark/           # Benchmarking scripts
│   └── utilities/           # Utility scripts
├── src/                     # Source code (application-specific)
├── tests/                   # Test suite
├── requirements.txt         # Project dependencies
└── README.md                # This file

Implemented Gating Units

Unit       Description                           Formula
paGLU      Parameterized Gated Linear Unit       x * (α * sigmoid(x) + (1-α))
paGTU      Parameterized Gated Tanh Unit         x * (α * tanh(x) + (1-α))
paSwishU   Parameterized Swish Unit              x * (α * sigmoid(x) + (1-α) * x)
paReGLU    Parameterized ReLU Gated Linear Unit  x * (α * ReLU(x) + (1-α))
paGELU     Parameterized Gated GELU              x * (α * GELU(x) + (1-α))
paMishU    Parameterized Mish Unit               x * (α * mish(x) + (1-α))
paSiLU     Parameterized SiLU/Swish gating       x * (α * SiLU(x) + (1-α) * x)
paUnit     Generic template for custom units     x * (α * custom_fn(x) + (1-α))
PaGRUCell  Parameterized GRU Cell                Specialized recurrent architecture
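Most units above share the pattern x * (α * f(x) + (1-α)), so α interpolates between an ungated (identity-gate) path and the fully gated unit. A minimal plain-Python sketch of that scalar, element-wise form (for illustration only; the library units apply the gate after a linear projection and also support a learnable α):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def pa_gate(x, activation, alpha):
    """Unified paGating form: x * (alpha * activation(x) + (1 - alpha)).

    alpha = 0 turns the gate off (the gate term is 1, so the output is x);
    alpha = 1 recovers the fully gated unit (e.g. a GLU-style gate when
    activation is the sigmoid).
    """
    return x * (alpha * activation(x) + (1.0 - alpha))

# alpha interpolates between identity and full gating:
print(pa_gate(2.0, sigmoid, 0.0))  # 2.0 (gate disabled)
print(pa_gate(2.0, sigmoid, 1.0))  # 2.0 * sigmoid(2.0), the fully gated value
print(pa_gate(2.0, sigmoid, 0.5))  # halfway between the two behaviors
```

Swapping `sigmoid` for `math.tanh` gives the paGTU formula, and any custom callable gives the paUnit template.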

Installation

Clone the repository:

git clone https://github.com/guglxni/paGating.git
cd paGating

Install requirements:

pip install -r requirements.txt

Set up data directories and download datasets:

python scripts/download_data.py

Note: This repository uses symlinks for large data files. See docs/DATA_SETUP.md for detailed setup instructions.

Quick Start

Using a paGating unit in your model

import torch
from paGating import paGLU

# Create a layer with fixed alpha
gating_layer = paGLU(input_dim=512, output_dim=512, alpha=0.5)

# Or with learnable alpha
learnable_gating_layer = paGLU(input_dim=512, output_dim=512, learnable_alpha=True)

# Use in a model
x = torch.randn(32, 512)  # batch_size, input_dim
output = gating_layer(x)  # shape: (32, 512)

Integration with PyTorch models

import torch
import torch.nn as nn
from paGating import paGLU

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 512)
        self.gate = paGLU(512, 512, alpha=0.5)  # paGating unit
        self.fc2 = nn.Linear(512, 10)
        
    def forward(self, x):
        x = self.fc1(x)
        x = self.gate(x)
        x = self.fc2(x)
        return x

Experimenting with paGating

Running Benchmarks

The framework includes tools for benchmarking different gating units:

python scripts/benchmark/benchmark_gateflow.py

This generates plots comparing the performance of different units.

Running a Hyperparameter Sweep

To compare different units and alpha values:

python scripts/utilities/run_experiment_pipeline.py --experiment_name my_experiment --units paGLU paGTU paMishU --alpha_values 0.0 0.2 0.5 0.8 1.0

This will:

  1. Run a hyperparameter sweep
  2. Generate a leaderboard
  3. Create visualizations

Testing with Transformer Models

To test a gating unit in a transformer for sequence classification:

python experiments/test_transformer.py --unit paMishU --alpha 0.5 --epochs 20

Export to CoreML

You can export trained models to CoreML format for deployment on Apple devices:

python scripts/coreml_export.py --unit paGLU --alpha 0.5

Test the exported model:

python tests/test_coreml_model.py --unit paGLU --alpha 0.5

Results Summary

For detailed results and comparisons of different gating units, see docs/results_summary.md.

Creating Your Own Gating Unit

To create a custom gating unit:

  1. Create a new file in the paGating directory (e.g., paGating/paMyCustomU.py)
  2. Extend the paGatingBase class
  3. Implement the required methods
  4. Update __init__.py to expose your new unit

Example:

from .base import paGatingBase
import torch
import torch.nn as nn
import torch.nn.functional as F

class paMyCustomU(paGatingBase):
    """
    My custom parameterized activation gating unit.
    """
    
    def __init__(self, input_dim, output_dim, alpha=0.5, learnable_alpha=False, alpha_init=None, bias=True):
        super().__init__(
            input_dim=input_dim, 
            output_dim=output_dim, 
            alpha=alpha,
            learnable_alpha=learnable_alpha,
            alpha_init=alpha_init,
            bias=bias
        )
        
    def compute_gate_activation(self, x):
        # Implement your custom activation here (Mish shown as a concrete example)
        return x * torch.tanh(F.softplus(x))
        
    def forward(self, x):
        # Standard implementation, can be customized if needed
        x = self.linear(x)
        gates = self.compute_gate_activation(x)
        return x * gates

Then update __init__.py:

from .paMyCustomU import paMyCustomU

__all__ = [
    # ... existing units
    'paMyCustomU',
]

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Commercial Use: For commercial applications, please contact the authors for licensing arrangements.

📄 Research Paper

This framework is featured in our IEEE TNNLS submission:

"paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI"

  • Authors: Aaryan Guglani, Dr. Rajashree Shettar
  • Institution: RV College of Engineering, Bengaluru
  • Status: Under Review at IEEE Transactions on Neural Networks and Learning Systems
  • Reproducibility: Complete reproduction guide available in docs/REPRODUCIBILITY.md

📚 Documentation

๐Ÿ† Citation

If you use paGating in your research, please cite:

@article{guglani2025pagating,
  title={paGating: A Parameterized Activation Gating Framework for Flexible and Efficient Neural Networks for GenAI},
  author={Guglani, Aaryan and Shettar, Rajashree},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2025},
  note={Under Review},
  url={https://github.com/guglxni/paGating}
}

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Interactive Dashboard

The project includes a Streamlit dashboard for visualizing experiment results:

# Install required packages if not already installed
pip install streamlit plotly pandas

# Run the dashboard with a specific results directory
streamlit run scripts/streamlit_dashboard.py -- --results_dir results/your_experiment_dir

# Or run the dashboard and select the results directory in the UI
streamlit run scripts/streamlit_dashboard.py

Dashboard features:

  • Compare performance across different gating units
  • Analyze the effect of different alpha values
  • Explore the behavior of learnable alpha parameters
  • View training curves and leaderboards
  • Generate insights and recommendations

Experiments

Run a hyperparameter sweep:

python scripts/utilities/run_experiment_pipeline.py

This will:

  1. Run a sweep over different units and alpha values
  2. Generate a leaderboard
  3. Create visualizations
  4. Run the analysis

Research Paper

A detailed research paper describing the paGating framework, its implementation, and experimental results is available in the docs/paper/ directory.

Download files

Source Distribution

pagating-0.1.0.tar.gz (6.9 MB)

Built Distribution


pagating-0.1.0-py3-none-any.whl (6.9 MB)

File details

Details for the file pagating-0.1.0.tar.gz.

File metadata

  • Download URL: pagating-0.1.0.tar.gz
  • Upload date:
  • Size: 6.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

Hashes for pagating-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       0f0efee1040c8bc14641777852e1b6b6078bdac4da256cec6bff415b203842e7
MD5          ffd835c9ede32b976ecb6e28487e18ce
BLAKE2b-256  d81a6186a14081ffe47c3084b3314f2a4a9b4caef3f193cb562dd319fc2e32eb


File details

Details for the file pagating-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: pagating-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.9 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.6

File hashes

Hashes for pagating-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       5510bb64624d0c8191fe40364b0d219dc0784b09c45505f63797db86de3ba778
MD5          930fc48656808d5e6ff2f94d940b0692
BLAKE2b-256  fdcf681a69a796a2383765290e17219fe240df6aab65ddbb072dffc3068601af

