Femtosense Model Optimization Toolkit
fmot
The Femtosense Model Optimization Toolkit (fmot) quantizes neural network models for deployment on Femtosense hardware.
Installation
git clone https://github.com/femtosense/fmot.git
cd fmot
pip install -e .
Quantizing Models
You can define your PyTorch models however you like. Once your model has been trained, convert it to the fmot.qat format by calling fmot.convert.convert_torch_to_qat. The resulting qat model is not yet quantized. To quantize it, pass the model, along with an iterable of sample inputs, to fmot.qat.control.quantize. These sample inputs help the qat model find an optimal quantization configuration. The quantized model then simulates fixed-point integer arithmetic exactly as it will be performed on Femtosense hardware.
import torch
import fmot

class MyModel(torch.nn.Module):
    def __init__(self, din, dout):
        super().__init__()
        weights = torch.rand(din, dout)
        self.weight = torch.nn.Parameter(weights)
        self.linear = torch.nn.Linear(dout, dout)

    def forward(self, x):
        x = torch.matmul(x, self.weight)
        x = x.relu()
        x = self.linear(x)
        x = torch.sigmoid(x)
        return x

model = MyModel(128, 256)

### TRAINING GOES HERE

# Convert the trained model to qat format
quant_model = fmot.convert.convert_torch_to_qat(model)

# Provide a set of sample inputs to choose an optimal quantization scheme
quant_model = fmot.qat.control.quantize(quant_model, [torch.randn(16, 128) for _ in range(20)])
NOTE: THE ABOVE API NEEDS A TOP-LEVEL SHORTCUT
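To build intuition for why quantization needs sample inputs, here is a minimal, library-free sketch of calibrated fixed-point quantization. It is not fmot's actual algorithm: the power-of-two scale, the 8-bit default, and the names choose_exponent and fake_quantize are all assumptions made for illustration.

```python
import math

def choose_exponent(samples, bits=8):
    """Return the smallest power-of-two exponent q such that every sample
    value v satisfies |v| / 2**q <= the largest signed `bits`-bit integer.
    (Hypothetical helper, not part of fmot.)"""
    max_abs = max(abs(v) for batch in samples for v in batch)
    if max_abs == 0.0:
        return 0
    qmax = 2 ** (bits - 1) - 1
    return math.ceil(math.log2(max_abs / qmax))

def fake_quantize(values, q, bits=8):
    """Round each value to an integer multiple of 2**q, clip it to the signed
    `bits`-bit range, and return (integers, dequantized floats).
    (Hypothetical helper, not part of fmot.)"""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = 2.0 ** q
    ints = [min(max(round(v / scale), qmin), qmax) for v in values]
    return ints, [i * scale for i in ints]

# Calibrate on sample inputs, then quantize new values with the chosen scale.
calibration = [[0.5, -1.0], [0.25, 0.9]]
q = choose_exponent(calibration)
ints, approx = fake_quantize([0.5, -1.0, 0.3, 3.0], q)
print(q, ints, approx)
```

In this sketch, values inside the calibrated range are reproduced to within half a quantization step, while the out-of-range value 3.0 saturates at the largest representable integer. This is why the sample inputs should be representative of the data the model will see after deployment.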
Fine-Tuning Quantized Models
Setting Custom Bitwidths
Emitting FQIR
Once your model has been quantized,
Building and Viewing Sphinx Documentation
First, install Sphinx. On macOS:
brew install sphinx-doc
On other platforms.
Now, let's install some dependencies with pip:
cd docs
pip install -r requirements.txt
You can now build the documentation by running
make html
You can view this documentation in your browser with Open File (⌘O). Navigate to
{fmot_base}/docs/_build/html/index.html
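As an alternative to navigating by hand, the built index page can be opened from Python with the standard library. This is a small sketch: the helper names docs_index_uri and open_docs are hypothetical, and the fmot_base argument stands in for wherever you cloned the repository.

```python
import webbrowser
from pathlib import Path

def docs_index_uri(fmot_base):
    """Build a file:// URI for the Sphinx index page under `fmot_base`.
    (Hypothetical helper; `fmot_base` is the path to your fmot checkout.)"""
    index = Path(fmot_base).expanduser().resolve() / "docs" / "_build" / "html" / "index.html"
    return index.as_uri()

def open_docs(fmot_base="."):
    """Open the built documentation in the default web browser."""
    return webbrowser.open(docs_index_uri(fmot_base))
```

Calling open_docs() from the repository root after running make html should open the same page as the Open File route.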
Running Tests
Pruning Weight Matrices
Sparsifying Activations
Using Custom Layers