Package containing utilities for implementing RSN2/MANN
MANN
MANN, which stands for Multitask Artificial Neural Networks, is a Python package which enables creating sparse multitask models compatible with TensorFlow. This package contains custom layers and utilities to facilitate the training and optimization of models using the Reduction of Sub-Network Neuroplasticity (RSN2) training procedure developed by AI Squared, Inc.
Changing to BeyondML
Please be advised that the MANN package will be deprecated after version 0.3.0 and moved to the beyondml package on PyPI. For future releases, please update your code to use the beyondml package.
Installation
This package is available through PyPI and can be installed via the following command:
pip install mann
To install the current version directly from GitHub without cloning, run the following command:
pip install git+https://github.com/AISquaredInc/mann.git
Alternatively, you can install the package by cloning the repository from GitHub using the following commands:
# clone the repository and cd into it
git clone https://github.com/AISquaredInc/mann
cd mann
# install the package
pip install .
Mac M1 Users
For those with a Mac with the M1 processor, this package can be installed, but the standard version of TensorFlow is not compatible with the M1 SoC. To install a compatible version of TensorFlow, please install the Miniforge conda environment, which uses the conda-forge channel only. Once you are using Miniforge, using conda to install TensorFlow in that environment should install the correct version. After installing TensorFlow, running pip install mann will install the MANN package.
Contributing
For those who are interested in contributing to this project, we first thank you for your interest! Please refer to the CONTRIBUTING.md file in this repository for information about best practices for how to contribute.
Vulnerability reporting
In the event you notice a vulnerability within this project, please open a GitHub Issue detailing the vulnerability to report it. In the event you would like to keep the report private, please email mann@squared.ai.
Capabilities
The MANN package includes two subpackages: the mann.utils package and the mann.layers package. As the name implies, the mann.utils package includes utilities which assist in model training. The mann.layers package includes custom Keras-compatible layers which can be used to train sparse multitask models.
Utils
The mann.utils subpackage contains helper functions for performing training and conversion of models using masking layers.
In addition to the functions just mentioned, there is also an ActiveSparsification callback object which enables active sparsification during training rather than solely one-shot sparsification. Note that this callback currently only supports simultaneous training. We are working to support iterative training with this callback as well.
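The idea behind active sparsification, as opposed to one-shot sparsification, can be sketched independently of the package: the pruning percentile starts low and is raised whenever validation performance plateaus. The class and parameter names below are illustrative, not the actual ActiveSparsification API.

```python
# Minimal sketch of an active-sparsification schedule (illustrative only,
# not the mann.utils.ActiveSparsification callback itself): the pruning
# percentile is increased gradually during training rather than applied once.
class SparsificationSchedule:
    def __init__(self, start=40, step=5, limit=90, patience=2):
        self.percentile = start      # current percentile of weights to mask
        self.step = step             # how much to increase on plateau
        self.limit = limit           # never mask beyond this percentile
        self.patience = patience     # epochs without improvement before stepping
        self.best = float("inf")
        self.stalled = 0

    def on_epoch_end(self, val_loss):
        """Return the percentile to prune to after this epoch."""
        if val_loss < self.best:
            self.best = val_loss
            self.stalled = 0
        else:
            self.stalled += 1
            if self.stalled >= self.patience:
                self.percentile = min(self.percentile + self.step, self.limit)
                self.stalled = 0
        return self.percentile
```

A real callback would hook this logic into Keras epoch-end events and re-mask the model whenever the percentile increases.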
mask_model
- The mask_model function is central to the RSN2 training procedure and enables masking/pruning a model so that a large percentage of the weights are inactive.
- Inputs to the mask_model function are a TensorFlow model, a percentile in integer form, a method (either 'gradients' or 'magnitude'), input data, and target data.
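To make the 'magnitude' method concrete, here is a minimal numpy sketch of percentile-based magnitude masking applied to a single weight array. The real mask_model function operates on a full TensorFlow model, and its 'gradients' method ranks weights by gradient magnitude instead; the function name below is illustrative.

```python
import numpy as np

def magnitude_mask(weights, percentile):
    """Zero out all weights whose magnitude falls below the given percentile.

    Illustrative sketch of the 'magnitude' masking method; mask_model itself
    applies this idea across the layers of a TensorFlow model.
    """
    threshold = np.percentile(np.abs(weights), percentile)
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

weights = np.array([0.01, -0.5, 0.03, 2.0, -0.02, 0.8, 0.04, -1.1])
masked, mask = magnitude_mask(weights, 50)  # roughly half the weights survive
```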
get_custom_objects
- The get_custom_objects function takes no parameters and returns a dictionary of all custom objects required to load a model trained using this package.
remove_layer_masks
- The remove_layer_masks function takes a trained model with masked layers and converts it to a model without masking layers.
add_layer_masks
- The add_layer_masks function takes an existing model built with non-MANN layers and converts it so that all layers which have an analog in the MANN package are replaced with their masked counterparts. This enables pretrained models to be converted and sparsified.
quantize_model
- The quantize_model function takes in a model and a datatype to quantize the model to.
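At its core, this kind of weight quantization amounts to casting each weight array to a lower-precision dtype to shrink the model. The sketch below mimics that on a plain list of numpy arrays; the actual quantize_model function handles the conversion across a full TensorFlow model, and the helper name here is illustrative.

```python
import numpy as np

def quantize_weights(weight_arrays, dtype=np.float16):
    """Cast each weight array to a lower-precision dtype (illustrative sketch)."""
    return [w.astype(dtype) for w in weight_arrays]

weights = [np.random.randn(4, 4).astype(np.float32),
           np.zeros(4, dtype=np.float32)]
quantized = quantize_weights(weights, np.float16)
# float16 storage is half the size of float32 for the same shape
```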
build_transformer_block
- The build_transformer_block function can be used to build a block in a transformer architecture.
build_token_position_embedding
- The build_token_position_embedding function can be used to build a token and position embedding block for use in a transformer architecture model.
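The idea of a token-and-position embedding block can be sketched in numpy: each token id indexes a token-embedding table, and a per-position embedding is added to it. The shapes and names below are illustrative; build_token_position_embedding assembles the Keras equivalent with trainable tables.

```python
import numpy as np

# Illustrative token + position embedding (not the mann implementation):
# output[i] = token_table[token_ids[i]] + position_table[i]
vocab_size, seq_len, embed_dim = 100, 8, 16
rng = np.random.default_rng(0)
token_table = rng.normal(size=(vocab_size, embed_dim))     # one row per token id
position_table = rng.normal(size=(seq_len, embed_dim))     # one row per position

def embed(token_ids):
    return token_table[token_ids] + position_table[np.arange(len(token_ids))]

tokens = np.array([5, 17, 3, 99, 0, 1, 2, 42])
embedded = embed(tokens)  # shape (seq_len, embed_dim)
```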
get_task_masking_gradients
- The get_task_masking_gradients function retrieves the gradients of masking weights within a model.
mask_task_weights
- The mask_task_weights function masks specific task weights within a model.
train_model_iteratively
- The train_model_iteratively function iteratively trains a model utilizing early stopping and active sparsification on a per-task basis. NOTE that this function only works on models without MultiMaskedConv2D layers.
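The per-task pruning idea behind mask_task_weights can be sketched on a multitask layer that keeps one mask per task along its first axis: pruning one task zeroes that task's mask wherever the corresponding gradient magnitude is small, leaving the other tasks untouched. The function and array names below are illustrative; the real function operates on model layers using gradients retrieved via get_task_masking_gradients.

```python
import numpy as np

def mask_task(masks, task_gradients, task_index, percentile):
    """Prune one task's mask by gradient magnitude (illustrative sketch)."""
    grads = np.abs(task_gradients[task_index])
    threshold = np.percentile(grads, percentile)
    new_masks = masks.copy()
    # keep weights whose gradient magnitude is at or above the threshold
    new_masks[task_index] = np.where(grads >= threshold, masks[task_index], 0)
    return new_masks

masks = np.ones((2, 4))                  # two tasks, four weights each
grads = np.array([[0.9, 0.1, 0.8, 0.2],  # task 0 gradients
                  [0.5, 0.5, 0.5, 0.5]]) # task 1 gradients
pruned = mask_task(masks, grads, task_index=0, percentile=50)
```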
Layers
The mann.layers subpackage contains custom Keras-compatible layers which can be used to train sparse multitask models. The layers contained in this package are as follows:
MaskedDense
- This layer is nearly identical to the Keras Dense layer, but it supports masking and pruning to reduce the number of active weights.
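The forward pass of such a masked dense layer differs from a standard dense layer only in that the kernel is multiplied elementwise by a binary mask, so pruned weights contribute nothing. A minimal numpy sketch (names and shapes illustrative, not the Keras layer itself):

```python
import numpy as np

def masked_dense(x, kernel, bias, mask):
    """Dense forward pass with an elementwise kernel mask (illustrative)."""
    return x @ (kernel * mask) + bias

x = np.array([[1.0, 2.0]])
kernel = np.array([[3.0, 4.0],
                   [5.0, 6.0]])
bias = np.zeros(2)
mask = np.array([[1.0, 0.0],
                 [1.0, 1.0]])   # the weight at position (0, 1) is pruned
out = masked_dense(x, kernel, bias, mask)
```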
MaskedConv2D
- This layer is nearly identical to the Keras Conv2D layer, but it supports masking and pruning to reduce the number of active weights.
MultiMaskedDense
- This layer supports isolating pathways within the network and dedicating them for individual tasks and performing fully-connected operations on the input data.
MultiMaskedConv2D
- This layer supports isolating pathways within the network and dedicating them for individual tasks and performing convolutional operations on the input data.
MultiDense
- This layer supports multitask inference using a fully-connected architecture and is not designed for training. Once a model is trained with the MultiMaskedDense layer, that layer can be converted into this layer for inference by using the mann.utils.remove_layer_masks function.
MultiConv2D
- This layer supports multitask inference using a convolutional architecture and is not designed for training. Once a model is trained with the MultiMaskedConv2D layer, that layer can be converted to this layer for inference by using the mann.utils.remove_layer_masks function.
SelectorLayer
- This layer selects which of the multiple inputs fed into it is returned as a result. This layer is designed to be used specifically with multitask layers.
SumLayer
- This layer returns the element-wise sum of all of the inputs.
FilterLayer
- This layer can be turned on or off, and indicates whether the single input passed to it should be output or if all zeros should be returned.
MultiMaxPool2D
- This layer implements Max Pool operations on multitask inputs.
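The three routing layers above (SelectorLayer, SumLayer, FilterLayer) have simple semantics that can be sketched in numpy. These functions are illustrative stand-ins, not the Keras implementations:

```python
import numpy as np

def selector(inputs, index):
    """SelectorLayer sketch: return one of several inputs."""
    return inputs[index]

def sum_layer(inputs):
    """SumLayer sketch: elementwise sum of all inputs."""
    return np.sum(inputs, axis=0)

def filter_layer(x, on=True):
    """FilterLayer sketch: pass the input through, or return all zeros."""
    return x if on else np.zeros_like(x)

a = np.array([1.0, 2.0])
b = np.array([10.0, 20.0])
```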
Additional Documentation and Training Materials
Additional documentation and training materials will be added to the BeyondML Documentation Website as we continue to develop this project and its capabilities.
Feature Roadmap
- PyTorch Support
- We are currently working on building support for PyTorch models and layers into this package
- Fixing issues with iterative training and MultiMaskedConv2D layers
  - As mentioned above, there are issues with finding per-task gradients in models utilizing MultiMaskedConv2D layers. Future iterations of the technology will address bugs with these kinds of models.
Changes
Below is a list of additional features, bug fixes, and other changes made for each version.
Version 0.2.2
- Small documentation changes
- Added the quantize_model function
- Added the build_transformer_block and build_token_position_embedding_block functions for transformer functionality
- Removed unnecessary imports that were breaking imports in minimal environments
Version 0.2.3
- Per-task pruning
- Functionality for this feature is implemented, but usage is expected to be incomplete. Note that task gradients have to be retrieved and passed to the function directly (a helper function is available), and that the model has to initially be compiled using a compatible loss function (recommended 'mse') to identify gradients.
- It has been found that this functionality is currently only supported for models with the following layers:
- MaskedConv2D
- MaskedDense
- MultiMaskedDense
- Note also that this functionality does not support cases where layers of an individual model are other TensorFlow models, but supporting this functionality is on the roadmap.
- Iterative training using per-task pruning
  - Functionality for this feature is implemented, but there are known bugs when trying to apply this methodology to models with the MultiMaskedConv2D layer present
Version 0.3.0
- Support for PyTorch layers
- Support for additional custom objects in the quantize_model function
- Added tests to the package functionality
- Added auto-generated documentation
File details
Details for the file mann-0.3.0.tar.gz.
File metadata
- Download URL: mann-0.3.0.tar.gz
- Upload date:
- Size: 28.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 99c6f167fba12a82caf5a02714571bce43a83f9c0e60b796a4f43faf29d2c1b2
MD5 | b08ab3c9c62c54cc3e697ddfc9945e30
BLAKE2b-256 | cbc86e159c35885f6db216ac68b9a634acfa6afde099effb6352497356b3388e
File details
Details for the file mann-0.3.0-py3-none-any.whl.
File metadata
- Download URL: mann-0.3.0-py3-none-any.whl
- Upload date:
- Size: 43.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | a8a38037f21d43bb614e71d1081c6db21efca5f182633a9e037aa37ac3a3cfa4
MD5 | 3b6545c8361ca1ffcaa316b25cfb43a8
BLAKE2b-256 | cafcef96ea1eeab3e7d6f7dd724f1a8f1820fd1ed4561b48245c8a91c0cd7f68