Minimalistic & easy deployment of PyTorch models on AWS Lambda with C++
torchlambda is a tool to deploy PyTorch models on Amazon's AWS Lambda using the AWS SDK for C++ and a custom C++ runtime.
Thanks to static compilation of dependencies, the package is only around 30Mb with all necessary dependencies included.
This allows users to pass their models as AWS Lambda layers,
hence no other services like Amazon S3 are required.
torchlambda is always fully up to date thanks to continuous deployment running at 03:00
every day.
This README provides only a basic introduction; for the full picture
please see the documentation
and CLI --help.
Comparison with other deployment tools
Improve this comparison's reliability via a Pull Request, thanks. Also show the projects below some love by visiting them (just click on the name).
Trait / Tool | torchlambda | fastai Lambda | KubeFlow | Tensorflow Serving |
---|---|---|---|---|
Autoscaling | :heavy_check_mark: | :heavy_check_mark: | with Kubernetes | with Kubernetes |
Light/Heavy load | Light | Light | Heavy/Both | Both |
GPU Support | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
Serverless | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: |
Required services | AWS Lambda | AWS Lambda, AWS S3 | Kubernetes Cluster & cloud provider | Deployable in various settings |
Multiple frameworks | :x: | :x: | :heavy_check_mark: | :x: |
Latest framework 1 | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
Version (higher means more mature) | CD | N/A | 1.0 | 2.1.0 |
Customizable dependencies 2 | :heavy_check_mark: | :x: | :x: | :x: |
Deployment size 3 | ~30Mb | +1Gb | N/A | ~67Mb 4 |
Installation
- Docker, at least version 17.05, is required. See the official Docker documentation for installation instructions specific to your operating system.
- Install torchlambda through pip; Python version 3.6 or higher is needed. You could also install this software within conda or another virtual environment of your choice. The following command should be sufficient:

$ pip install --user torchlambda

torchlambda provides pre-built deployment images tagged after PyTorch versions and rebuilt daily. The following images are currently available:

- szymonmaszke/torchlambda:latest (head of current PyTorch master branch)
- szymonmaszke/torchlambda:1.4.0

For more info refer to the torchlambda build documentation.
Example deploy
Here is an example of ResNet18 model deployment using torchlambda.
Run all commands and create all necessary files in the same directory.
1. Create model to deploy
Below is the code (model.py) to load ResNet18 from torchvision and compile it to torchscript:
import torch
import torchvision

model = torchvision.models.resnet18()

# Smaller example input for tracing
example = torch.randn(1, 3, 64, 64)
script_model = torch.jit.trace(model, example)
script_model.save("model.ptc")
Invoke it from CLI:
$ python model.py
You should get model.ptc in your current working directory.
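Before moving on you may want to sanity-check the traced module by loading it back and running a forward pass locally. A minimal sketch (not part of the deployment itself):

```python
import torch

# Load the TorchScript module saved by model.py and run one forward pass
module = torch.jit.load("model.ptc")
module.eval()

with torch.no_grad():
    output = module(torch.randn(1, 3, 64, 64))

print(output.shape)  # torch.Size([1, 1000]) for ResNet18
```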
2. Create settings
torchlambda uses C++ to deploy models, hence it might be hard for end users
to provide the necessary source code.
To alleviate some of those issues, easy-to-understand YAML settings can be used
to define the outputs and various elements of the neural network and deployment.
Please run the following:

$ torchlambda settings

This command will generate a torchlambda.yaml file with all available settings
for you to modify according to your needs. You can see all of them
with short descriptions below.
Click here to check generated YAML settings
---
grad: False # Turn gradient on/off
validate_json: true # Validate correctness of JSON parsing
data: data # Name of data field passed as JSON
validate_data: true # Validate correctness of data from request
model: /opt/model.ptc # Path to model to load
inputs: [1, 3, width, height] # Shape of input tensor (can be name of field)
validate_inputs: true # Validate correctness of input fields (if any)
cast: float # Type to which base64 encoded tensor will be cast
divide: 255 # Value by which it will be divided
normalize: # Whether to normalize the tensor
  means: [0.485, 0.456, 0.406] # Using those means
  stddevs: [0.229, 0.224, 0.225] # And those standard deviations
return: # Finally return something in JSON
  output: # Unmodified output from neural network
    type: double # Cast to double type (AWS SDK compatible)
    name: output # Name of the field where value(s) will be returned
    item: false # If we return a single value use true; a neural network usually returns more (an array)
  result: # Return another field result by modifying output
    operations: argmax # Apply argmax (more operations can be specified as a list)
    arguments: 1 # Over first dimension (more or no arguments can be specified)
    type: int # Type returned will be integer
    name: result # Named result
    item: true # It will be a single item
Many fields already have sensible defaults (see the YAML settings file reference), hence they will be left as-is for now. In our case we will only define the bare minimum:
---
inputs: [1, channels, width, height]
return:
  result:
    operations: argmax
    type: int
    name: label
    item: true
- inputs: [1, channels, width, height] - tensor with batch size always equal to 1 (static), a variable number of channels, and variable width and height. The last three elements will be passed as int fields in the JSON request and named accordingly (channels, width and height).
- return - return the output of the network modified by the argmax operation, which creates result. The returned type will be int, and the JSON field name (torchlambda always returns JSONs) will be label. argmax over the tensor will create a single value (by default the operation is applied over all dimensions), hence item: true is specified.
Save the above content in a torchlambda.yaml file.
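To make the settings concrete: a request matching them is a JSON object with a base64-encoded data field plus the three int fields named in inputs. A minimal sketch of building such a payload (the random image is made up for illustration):

```python
import base64
import json

import numpy as np

# Random uint8 "image" flattened to raw bytes, as the deployed handler expects
image = np.random.randint(0, 256, size=(1, 3, 64, 64), dtype=np.uint8)
payload = {
    "data": base64.b64encode(image.tobytes()).decode("ascii"),
    "channels": 3,
    "width": 64,
    "height": 64,
}

print(json.dumps(payload)[:79] + "...")
```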
3. Create deployment code with torchlambda scheme
Now that we have our settings we can generate C++ code based on them. Run the following:

$ torchlambda scheme --yaml torchlambda.yaml

You should see a new folder called torchlambda in your current directory with a main.cpp file inside.
If you don't care about C++ you can move on to the next section. If you want to know a little more (or have a custom application), carry on reading.
If the YAML settings cannot fulfil your needs, torchlambda offers you a basic C++ scheme you can start your deployment code from.
Run this simple command (no settings needed in this case):

$ torchlambda scheme --destination custom_deployment

This time you can find a new folder custom_deployment with main.cpp inside.
This file is minimal, reasonable and working C++ code one should be able to follow
easily. It does exactly the same thing (except dynamic shapes) as we did above
via settings, but this time the file is readable (the previous main.cpp
might be quite hard to grasp as it's "autogenerated").
Click here to check generated code
#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>
#include <aws/lambda-runtime/runtime.h>
#include <torch/script.h>
#include <torch/torch.h>
/*!
*
* HANDLE REQUEST
*
*/
static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
const Aws::Utils::Base64::Base64 &transformer,
const aws::lambda_runtime::invocation_request &request) {
const Aws::String data_field{"data"};
/*!
*
* PARSE AND VALIDATE REQUEST
*
*/
const auto json = Aws::Utils::Json::JsonValue{request.payload};
if (!json.WasParseSuccessful())
return aws::lambda_runtime::invocation_response::failure(
"Failed to parse input JSON file.", "InvalidJSON");
const auto json_view = json.View();
if (!json_view.KeyExists(data_field))
return aws::lambda_runtime::invocation_response::failure(
"Required data was not provided.", "InvalidJSON");
/*!
*
* LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
*
*/
const auto base64_data = json_view.GetString(data_field);
Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);
torch::Tensor tensor =
torch::from_blob(decoded.GetUnderlyingData(),
{
static_cast<long>(decoded.GetLength()),
},
torch::kUInt8)
.reshape({1, 3, 64, 64})
.toType(torch::kFloat32) /
255.0;
torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
{0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);
/*!
*
* MAKE INFERENCE
*
*/
auto output = module.forward({normalized_tensor}).toTensor();
const int label = torch::argmax(output).item<int>();
/*!
*
* RETURN JSON
*
*/
return aws::lambda_runtime::invocation_response::success(
Aws::Utils::Json::JsonValue{}
.WithInteger("label", label)
.View()
.WriteCompact(),
"application/json");
}
int main() {
/*!
*
* LOAD MODEL ON CPU
* & SET IT TO EVALUATION MODE
*
*/
torch::NoGradGuard no_grad_guard{};
constexpr auto model_path = "/opt/model.ptc";
torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
module.eval();
/*!
*
* INITIALIZE AWS SDK
* & REGISTER REQUEST HANDLER
*
*/
Aws::SDKOptions options;
Aws::InitAPI(options);
{
const Aws::Utils::Base64::Base64 transformer{};
const auto handler_fn =
[&module,
&transformer](const aws::lambda_runtime::invocation_request &request) {
return handler(module, transformer, request);
};
aws::lambda_runtime::run_handler(handler_fn);
}
Aws::ShutdownAPI(options);
return 0;
}
4. Package your source with torchlambda deploy
Now we have our model and source code. It's time to deploy it as an AWS Lambda-ready .zip package.
Run from the command line:

$ torchlambda deploy ./torchlambda --compilation "-Wall -O2"

The above will create a torchlambda.zip file ready for deployment.
Notice --compilation, where you can pass any C++ compilation flags (here -O2 for increased performance).
There are many more things one could set during this step; check torchlambda deploy --help for the full list of available options.
5. Package your model as AWS Lambda Layer
As the above source code is roughly 30Mb in size (AWS Lambda has a 250Mb limit),
we can ship our model as an additional layer. To create it run:

$ torchlambda model ./model.ptc --destination "model.zip"

You will receive a model.zip layer in your current working directory (--destination is optional).
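AWS Lambda extracts layer contents into /opt inside the function's environment, which is why the settings reference /opt/model.ptc. If you want to double-check that model.ptc sits at the root of the produced archive, a quick optional inspection with Python's standard zipfile module:

```python
import zipfile

# model.ptc must be at the archive root so it ends up at /opt/model.ptc
with zipfile.ZipFile("model.zip") as archive:
    print(archive.namelist())  # expected to contain "model.ptc"
```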
6. Deploy to AWS Lambda
From now on you can mostly follow the tutorial from AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; check Configuring the AWS CLI otherwise (or see the Test Lambda deployment locally tutorial).
6.1 Create trust policy JSON file
First create the following trust policy JSON file:
$ cat trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
6.2 Create IAM role using the trust policy
Run from your shell:
$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json
Note down the role Arn returned after running that command; it will be needed during the next step.
6.3 Create AWS Lambda function
Create deployment function with the script below:
$ aws lambda create-function --function-name demo \
--role <specify role arn from step 6.2 here> \
--runtime provided --timeout 30 --memory-size 1024 \
--handler torchlambda --zip-file fileb://torchlambda.zip
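If you prefer scripting this step from Python rather than the AWS CLI, an equivalent call through boto3 (assuming boto3 is installed and credentials are configured) could look like the sketch below; the role ARN placeholder is yours to fill in:

```python
import boto3

lambda_client = boto3.client("lambda")

# Mirror of the `aws lambda create-function` call above
with open("torchlambda.zip", "rb") as package:
    response = lambda_client.create_function(
        FunctionName="demo",
        Role="<specify role arn from step 6.2 here>",
        Runtime="provided",  # custom C++ runtime
        Timeout=30,
        MemorySize=1024,
        Handler="torchlambda",
        Code={"ZipFile": package.read()},
    )

print(response["FunctionArn"])
```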
6.4 Create AWS Layer containing model
We already have our ResNet18 packed appropriately, so run the following to make a layer from it:
$ aws lambda publish-layer-version --layer-name model \
--description "Resnet18 neural network model" \
--license-info "MIT" \
--zip-file fileb://model.zip
Please save the LayerVersionArn just like in 6.2 and insert it below to add this layer to the function from the previous step:
$ aws lambda update-function-configuration \
--function-name demo \
--layers <specify layer arn from above here>
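Both layer-related calls can be scripted with boto3 as well; a sketch under the same assumptions, which also removes the need to copy the ARN by hand:

```python
import boto3

lambda_client = boto3.client("lambda")

# Publish the model archive as a layer version and grab its ARN
with open("model.zip", "rb") as layer:
    published = lambda_client.publish_layer_version(
        LayerName="model",
        Description="Resnet18 neural network model",
        LicenseInfo="MIT",
        Content={"ZipFile": layer.read()},
    )

# Attach the freshly published layer to the function
lambda_client.update_function_configuration(
    FunctionName="demo",
    Layers=[published["LayerVersionArn"]],
)
```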
This completes the whole deployment; our model is now ready to handle incoming requests.
7. Encode image with base64 and make a request
The following script (save it as request.py) will send an image-like tensor encoded using base64 via aws lambda invoke to test our function.
import argparse
import base64
import shlex
import struct
import subprocess

import numpy as np


def parse_arguments():
    parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument("function_name")
    parser.add_argument("channels", type=int)
    parser.add_argument("width", type=int)
    parser.add_argument("height", type=int)
    parser.add_argument("output")
    return parser.parse_args()


def request(args):
    # Flatten to send as byte payload
    random_image = (
        np.random.randint(
            low=0, high=255, size=(1, args.channels, args.width, args.height)
        )
        .flatten()
        .tolist()
    )
    # Pack as raw bytes for AWS Lambda compatibility
    image = struct.pack("<{}B".format(len(random_image)), *random_image)
    # Encode as base64 string
    encoded = base64.b64encode(image).decode("ascii")
    command = (
        """aws lambda invoke --function-name {} --payload """
        """'{{"data": "{}", "channels": {}, "width": {}, "height": {} }}' {}""".format(
            args.function_name,
            encoded,
            args.channels,
            args.width,
            args.height,
            args.output,
        )
    )
    subprocess.call(shlex.split(command))


if __name__ == "__main__":
    args = parse_arguments()
    request(args)
Run the above script:

$ python request.py demo 3 64 64 output.json

You should get the following response in output.json (your label may vary):

$ cat output.json
{"label": 40}
Congratulations, you have deployed a ResNet18 classifier using only AWS Lambda in 7 steps!
Contributing
If you find an issue or would like to see some functionality (or implement it), please open a new Issue or create a Pull Request.
Footnotes
1. Support for the latest version of its main DL framework, or main frameworks if multiple are supported
2. Project dependency shape can be easily customized. In torchlambda's case these are customizable builds of libtorch and the AWS C++ SDK
3. Necessary size of code and dependencies to deploy a model
4. Based on Dockerfile size