
Structured Pruning Adapters for PyTorch

pip install structured-pruning-adapters

A happy marriage 👰‍♀️🤵‍♂️

Pruning is an effective method for reducing the size of neural networks. Besides reducing the parameter count, it can also accelerate inference. CPUs can handle unstructured sparse weights just fine, but GPUs generally need structured sparsity for an acceleration to materialise. Structured pruning, i.e., removing network channels [paper] or blocks of weights [paper], therefore tends to yield practical speedups as well.

+

Adapters [paper] have emerged as an alternative to fine-tuning: the prior network weights are left unaltered, and a new set of adapter weights is added to the network to learn a specific task. Some types of adapters add new layers; others are fusible with the existing weights and incur no runtime overhead. When a single base model is deployed alongside many specialised models, these structures can save a lot of parameters compared with full fine-tuning.

=

Structured Pruning Adapters are the offspring of Structured Pruning and Fusible Adapters, and can be used for Transfer Learning with:

  • ✅ Extremely few learned parameters (binary pruning mask + masked adapter weights) 👌
  • ✅ Accelerated network inference 🏎💨
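
To make the combination concrete, here is a minimal sketch in plain PyTorch (not this library's API) of how a fusible low-rank adapter and a channel mask interact: the adapter is learned in parallel with frozen weights, the mask removes whole output channels, and everything can be fused into a single smaller dense weight at deployment time.

    import torch

    out_f, in_f, rank = 512, 256, 32
    W = torch.randn(out_f, in_f)                      # frozen pre-trained weight (shared)
    A = torch.randn(out_f, rank) * 0.01               # learned adapter factor
    B = torch.randn(rank, in_f) * 0.01                # learned adapter factor
    out_mask = torch.randint(0, 2, (out_f,)).bool()   # learned channel-pruning mask

    W_fused = W + A @ B                               # fuse adapter into the base weight
    W_pruned = W_fused[out_mask]                      # drop pruned output channels entirely

    # Per task, only the mask and the (masked) adapter factors need to be stored,
    # while the dense W is shared across all tasks.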

How to use this library

Use in conjunction with any Structured Pruning technique.

  1. Install the library:

    pip install structured-pruning-adapters
    
  2. Replace Linear and Conv layers with an SP Adapter:

    from torch.nn import Linear
    from sp_adapters import SPLoRA
    
    reg_lin = Linear(256, 512, bias=True)
    spa_lin = SPLoRA(reg_lin, rank=32)
    
    # Or replace all applicable layers in a network (reg_net is your pre-trained model)
    spa_net = SPLoRA(reg_net, rank=32)
    
  3. Employ any Structured Pruning method. We conducted extensive experiments with multiple channel-pruning and block-pruning methods; a simple, illustrative way to produce the masks used in step 4 is sketched just after this list.

  4. Get pruned SP Adapter weights:

    import torch
    import sp_adapters
    
    # Specify masks - learned via your choice of Structured Pruning method
    in_features_mask = torch.tensor([1, 0, ..., 1], dtype=torch.bool)
    out_features_mask = torch.tensor([0, 1, ..., 1], dtype=torch.bool)
    
    # Read the pruned adapter parameters
    params = sp_adapters.splora.parameters(
        adapter_weights_only=True,
        in_features_mask=in_features_mask,
        out_features_mask=out_features_mask,
    )
    named_parameters = sp_adapters.splora.named_parameters(
        adapter_weights_only=True,
        in_features_mask=in_features_mask,
        out_features_mask=out_features_mask,
    )
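
As an illustration of step 3, any structured pruning criterion can produce the boolean masks consumed in step 4. The following magnitude-based scoring is purely illustrative and not part of this library:

    import torch

    weight = torch.randn(512, 256)                         # stand-in for a layer's weight
    scores = weight.abs().sum(dim=1)                       # L1 norm per output channel
    out_features_mask = scores >= scores.median()          # keep the strongest half
    in_features_mask = torch.ones(256, dtype=torch.bool)   # e.g., keep all input channels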
    

Demo

See also notebooks/demo.ipynb for a hands-on demo.

Structured Pruning Low-Rank Adapter (SPLoRA) for Channel Pruning

from sp_adapters import SPLoRA

Adds a low-rank bottleneck projection in parallel with the main weight projection.
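
Conceptually, the forward pass corresponds to the following sketch in plain PyTorch (illustrative shapes, not the module's internals):

    import torch

    x = torch.randn(8, 256)                 # batch of inputs
    W = torch.randn(512, 256)               # frozen base projection
    A = torch.randn(512, 32) * 0.01         # low-rank adapter factors (rank 32)
    B = torch.randn(32, 256) * 0.01

    y = x @ W.t() + (x @ B.t()) @ A.t()     # base projection + parallel bottleneck
    y_fused = x @ (W + A @ B).t()           # equivalent fused form, no inference overhead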

Structured Pruning Parallel Residual Adapter (SPPaRA) for Channel Pruning of CNNs

from sp_adapters import SPPaRA

Adds a pointwise convolution as an adapter to convolutional layers. First proposed in "Efficient parametrization of multi-domain deep neural networks" by Rebuffi et al.
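
The idea can be sketched as follows in plain PyTorch (illustrative shapes, not the module's internals): a 1x1 adapter convolution runs in parallel with the frozen convolution and can be fused into it by adding the pointwise kernel to the centre tap of the original kernel.

    import torch
    import torch.nn.functional as F

    x = torch.randn(1, 64, 32, 32)                  # (N, C_in, H, W)
    W = torch.randn(128, 64, 3, 3)                  # frozen 3x3 convolution weight
    W_adapter = torch.zeros(128, 64, 1, 1)          # learned pointwise (1x1) adapter

    y = F.conv2d(x, W, padding=1) + F.conv2d(x, W_adapter)

    # Fused equivalent: add the 1x1 kernel to the centre tap of the 3x3 kernel
    W_fused = W.clone()
    W_fused[:, :, 1, 1] += W_adapter[:, :, 0, 0]
    y_fused = F.conv2d(x, W_fused, padding=1)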


Structured Pruning Low-rank PHM Adapter (SPLoPA) for Block Pruning (experimental)

from sp_adapters import SPLoPA

Uses a variation on the Parameterized Hypercomplex Multiplication (PHM) layer [paper] with shared low-rank prototypes for block-sparse adaptation.
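
Roughly, the adapter weight is a sum of Kronecker products between small per-layer position matrices and shared low-rank prototypes, so pruning an entry of a position matrix removes an entire block of the adapter weight. A rough sketch with hypothetical names (not the library's API):

    import torch

    n_proto, block_out, block_in, proto_rank = 4, 32, 16, 2
    n_blocks_out, n_blocks_in = 16, 16                      # 512 x 256 adapter overall

    pos = torch.randn(n_proto, n_blocks_out, n_blocks_in)   # per-layer position weights
    u = torch.randn(n_proto, block_out, proto_rank)         # shared low-rank prototype factors
    v = torch.randn(n_proto, proto_rank, block_in)
    prototypes = u @ v                                      # (n_proto, 32, 16)

    block_mask = torch.randint(0, 2, (n_blocks_out, n_blocks_in)).bool()  # learned block mask

    delta_w = sum(
        torch.kron(pos[i] * block_mask, prototypes[i])      # block-sparse Kronecker term
        for i in range(n_proto)
    )                                                       # (512, 256) adapter weight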

Citation

If you enjoy this work, please consider citing it:

@article{hedegaard2022structured,
  title={Structured Pruning Adapters},
  author={Lukas Hedegaard and Aman Alok and Juby Jose and Alexandros Iosifidis},
  journal={preprint, arXiv:2211.10155},
  year={2022}
}

Acknowledgement

This work was done in conjunction with a research exchange at Cactus Communications 🌵.

This work has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 871449 (OpenDR) 🇪🇺.
