Minimalistic & easy deployment of PyTorch models on AWS Lambda with C++
torchlambda is a tool designed to deploy PyTorch models on Amazon's AWS Lambda cloud service using the AWS SDK for C++ and a custom C++ runtime.
Thanks to static compilation, the source code is only ~30 MB
with all necessary dependencies.
This allows users to pass their models as AWS Lambda layers,
hence no other services like Amazon S3 are required.
Comparison with other deployment tools
Improve this comparison's reliability via a Pull Request, thanks. Also show the projects below some love by visiting them (just click on the name).
Trait / Tool | torchlambda | fastai Lambda | KubeFlow | Tensorflow Serving |
---|---|---|---|---|
Autoscaling | :heavy_check_mark: | :heavy_check_mark: | with Kubernetes | with Kubernetes |
Light/Heavy load | Light | Light | Heavy/Both | Both |
GPU Support | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
Serverless | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: |
Required services | AWS Lambda | AWS Lambda, AWS S3 | Kubernetes cluster & cloud provider | Deployable in various settings |
Multiple frameworks | :x: | :x: | :heavy_check_mark: | :x: |
Latest framework [1] | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
Version (higher is more mature) | 0.1.0 | N/A | 1.0 | 2.1.0 |
Customizable dependencies [2] | :heavy_check_mark: | :x: | :x: | :x: |
Deployment size [3] | ~30 MB | +1 GB | N/A | ~67 MB [4] |
Installation
- Docker, at least version 17.05, is required. See the official Docker documentation for installation instructions for your operating system.
- Install torchlambda through pip; Python version 3.6 or higher is needed. You could also install this software within conda or another virtual environment of your choice. The following command should be sufficient:

$ pip install --user torchlambda
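To quickly double-check the installation from Python, here is a minimal sketch (assuming Python 3.8+ for importlib.metadata; on 3.6/3.7 the importlib_metadata backport offers the same API):

```python
# check_install.py - verify the torchlambda package is visible to Python
from importlib.metadata import PackageNotFoundError, version

try:
    print("torchlambda version:", version("torchlambda"))
except PackageNotFoundError:
    print("torchlambda not found, run: pip install --user torchlambda")
```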
Example deploy
Here is an example of a ResNet18 model deployment using torchlambda.
Run all commands and create all necessary files in the same directory.
1. Create model to deploy
Below is the code (model.py) to load ResNet18 from torchvision and compile it to torchscript:
import torch
import torchvision

model = torchvision.models.resnet18()
# Put the model in inference mode so layers like BatchNorm trace correctly
model.eval()

# Smaller example input (batch x channels x height x width)
example = torch.randn(1, 3, 64, 64)
script_model = torch.jit.trace(model, example)
script_model.save("model.ptc")
Invoke it from the CLI:
$ python model.py
You should get model.ptc in your current working directory.
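Before moving on, you may want to confirm the serialized model loads and runs. A minimal sketch (verify.py is a hypothetical helper, not part of torchlambda):

```python
# verify.py - hypothetical sanity check for the saved TorchScript model
import torch

loaded = torch.jit.load("model.ptc")
loaded.eval()

example = torch.randn(1, 3, 64, 64)
with torch.no_grad():
    output = loaded(example)

# ResNet18 returns ImageNet logits, so we expect shape (1, 1000)
print("Output shape:", output.shape)
print("Predicted label:", int(torch.argmax(output)))
```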
2. Create deployment code with torchlambda scheme
Writing C++ code might be hard, hence torchlambda provides you
with a basic scheme where all you have to do is provide the appropriate shapes for inference
(either passed during the request or hard-coded).
Issue the following command:
$ torchlambda scheme
You should see a new folder called torchlambda in your current directory.
The contents of torchlambda/main.cpp are the ones you would usually modify.
Usually only a few things change (e.g. input shape or required fields).
If you wish to see the generated C++ scheme code (barely 70 lines), click below:
Click here to check generated code
The code below should be quite easy to follow. Check the comments if in doubt, request improvements in Issues, or make a Pull Request if you have an idea to make this section even easier.
#include <algorithm>
#include <iterator>
#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>
#include <aws/lambda-runtime/runtime.h>
#include <torch/script.h>
#include <torch/torch.h>
static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
const Aws::Utils::Base64::Base64 &transformer,
const aws::lambda_runtime::invocation_request &request) {
/* Name of field containing base64 encoded data */
const Aws::String data_field{"data"};
/*!
*
* PARSE AND VALIDATE REQUEST
*
*/
const auto json = Aws::Utils::Json::JsonValue{request.payload};
if (!json.WasParseSuccessful())
return aws::lambda_runtime::invocation_response::failure(
"Failed to parse input JSON file.", "InvalidJSON");
const auto json_view = json.View();
if (!json_view.KeyExists(data_field))
return aws::lambda_runtime::invocation_response::failure(
"Required data was not provided.", "InvalidJSON");
/*!
*
* LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
*
*/
const auto base64_data = json_view.GetString(data_field);
Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);
/* Copy data and move it to tensor (is there an easier way?) */
/* Array holds channels * width * height, input your values below */
float data[3 * 64 * 64];
/* Copy the whole decoded buffer (GetLength() bytes) into the array */
std::copy(decoded.GetUnderlyingData(),
decoded.GetUnderlyingData() + decoded.GetLength(), data);
torch::Tensor tensor =
torch::from_blob(data,
{
static_cast<long int>(decoded.GetLength()),
})
/* Input your data shape for reshape including batch */
.reshape({1, 3, 64, 64})
.toType(torch::kFloat32) /
255.0;
/* Normalize tensor with ImageNet mean and stddev */
torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
{0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);
/*!
*
* MAKE INFERENCE AND RETURN JSON RESPONSE
*
*/
/* {} will be casted to std::vector<torch::jit::IValue> under the hood */
auto output = module.forward({normalized_tensor}).toTensor();
const int label = torch::argmax(output).item<int>();
/* Return JSON with field label containing predictions*/
return aws::lambda_runtime::invocation_response::success(
Aws::Utils::Json::JsonValue{}
.WithInteger("label", label)
.View()
.WriteCompact(),
"application/json");
}
int main() {
/* Inference doesn't need gradient, let's turn it off */
torch::NoGradGuard no_grad_guard{};
/* Change name/path to your model if you so desire */
/* Layers are unpacked to /opt, so you are better off keeping it */
constexpr auto model_path = "/opt/model.ptc";
/* You could add some checks whether the module is loaded correctly */
torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
module.eval();
/*!
*
* INITIALIZE AWS SDK
* & REGISTER REQUEST HANDLER
*
*/
Aws::SDKOptions options;
Aws::InitAPI(options);
{
const Aws::Utils::Base64::Base64 transformer{};
const auto handler_fn =
[&module,
&transformer](const aws::lambda_runtime::invocation_request &request) {
return handler(module, transformer, request);
};
aws::lambda_runtime::run_handler(handler_fn);
}
Aws::ShutdownAPI(options);
return 0;
}
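To summarize the handler's contract: it expects a JSON field data holding base64-encoded raw bytes, turns them into a 1 x 3 x 64 x 64 float tensor scaled to [0, 1], applies ImageNet normalization, and returns the argmax label. Here is a sketch replicating the same preprocessing locally in Python, handy for predicting what the deployed function should return (preprocess_check.py is a hypothetical name):

```python
# preprocess_check.py - hypothetical local replica of the handler's preprocessing
import base64

import numpy as np
import torch

# Raw bytes exactly as the Lambda handler receives them after base64 decoding
raw = np.random.randint(low=0, high=256, size=(3, 64, 64), dtype=np.uint8)
decoded = np.frombuffer(base64.b64decode(base64.b64encode(raw.tobytes())), dtype=np.uint8)

# Mirror the C++ handler: bytes -> float tensor -> reshape -> scale -> normalize
tensor = torch.from_numpy(decoded.copy()).reshape(1, 3, 64, 64).to(torch.float32) / 255.0
mean = torch.tensor([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)
normalized = (tensor - mean) / std

# Run the same TorchScript model the layer will ship
model = torch.jit.load("model.ptc")
with torch.no_grad():
    print("Local label:", int(torch.argmax(model(normalized))))
```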
3. Package your source with torchlambda deploy
Now we have our model and source code. It's time to deploy it as an AWS Lambda-ready .zip package.
Run from command line:
$ torchlambda deploy ./torchlambda --compilation "-Wall -O2"
The above will create a torchlambda.zip file ready for deployment.
Notice --compilation, where you can pass any C++ compilation flags (here -O2 for performance optimization).
There are many more things one could set during this step; check torchlambda deploy --help for the full list of available options.
Oh, and don't worry about OS compatibility, as this code is compiled on Amazon's Linux AMI: if it works here, it will work "up there".
4. Package your model as AWS Lambda Layer
As the above source code is roughly 30 MB in size (AWS Lambda has a 250 MB limit),
we can ship our model as an additional layer. To create it, run:
$ torchlambda model ./model.ptc --destination "model.zip"
You will receive a model.zip layer in your current working directory (--destination is optional).
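Optionally, before uploading anything, you can sanity-check both archives locally. A sketch (the expectation that model.ptc sits at the archive root follows from layers being unpacked to /opt and the scheme loading /opt/model.ptc; the exact layout of torchlambda.zip is not asserted here):

```python
# inspect_packages.py - list archive contents before uploading to AWS
import zipfile

# Layers are unpacked to /opt and the scheme loads /opt/model.ptc,
# so model.ptc should sit at the root of the layer archive
with zipfile.ZipFile("model.zip") as layer:
    names = layer.namelist()
    print("model.zip contents:", names)
    assert "model.ptc" in names, "model.ptc should be at the archive root"

# Deployment package contents are printed for manual review only
with zipfile.ZipFile("torchlambda.zip") as package:
    print("torchlambda.zip contents:", package.namelist())
```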
5. Deploy to AWS Lambda
From now on you can mostly follow the tutorial from AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; if not, check Configuring the AWS CLI.
5.1 Create trust policy JSON file
First create the following trust policy JSON file:
$ cat trust-policy.json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": ["lambda.amazonaws.com"]
},
"Action": "sts:AssumeRole"
}
]
}
5.2 Create IAM role from the trust policy
Run from your shell:
$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json
Note down the role Arn returned to you after running that command; it will be needed during the next step.
5.3 Create AWS Lambda function
Create deployment function with the script below:
$ aws lambda create-function --function-name demo \
--role <specify role arn from step 5.2 here> \
--runtime provided --timeout 30 --memory-size 1024 \
--handler torchlambda --zip-file fileb://torchlambda.zip
5.4 Create AWS Layer containing model
We already have our ResNet18
packed appropriately, run the following:
$ aws lambda publish-layer-version --layer-name model \
--description "Resnet18 neural network model" \
--license-info "MIT" \
--zip-file fileb://model.zip
Please save the LayerVersionArn similar to step 5.2 and insert it below to add this layer to the function from step 5.3:
$ aws lambda update-function-configuration \
--function-name demo \
--layers <specify layer arn from above here>
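If you prefer to script steps 5.3 and 5.4 instead of using the AWS CLI, here is a sketch of the equivalent boto3 calls (the role ARN is a placeholder you must fill in; boto3 itself is not required by torchlambda):

```python
# deploy_aws.py - sketch of steps 5.3-5.4 via boto3 instead of the AWS CLI
import boto3

client = boto3.client("lambda")

# Step 5.3: create the function from the deployment package
with open("torchlambda.zip", "rb") as f:
    client.create_function(
        FunctionName="demo",
        Runtime="provided",  # custom C++ runtime
        Role="<specify role arn from step 5.2 here>",
        Handler="torchlambda",
        Code={"ZipFile": f.read()},
        Timeout=30,
        MemorySize=1024,
    )

# Step 5.4: publish the model layer and attach it to the function
with open("model.zip", "rb") as f:
    layer = client.publish_layer_version(
        LayerName="model",
        Description="Resnet18 neural network model",
        LicenseInfo="MIT",
        Content={"ZipFile": f.read()},
    )

client.update_function_configuration(
    FunctionName="demo",
    Layers=[layer["LayerVersionArn"]],
)
```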
6. Encode image with base64 and request your function
The following script (save it as request.py) will send an image-like tensor encoded using base64 via aws lambda invoke to test our function.
import base64
import shlex
import struct
import subprocess
import sys

import numpy as np

# Random image-like data
data = np.random.randint(low=0, high=255, size=(3, 64, 64)).flatten().tolist()
# Pack values as raw bytes for AWS Lambda compatibility
image = struct.pack("<{}B".format(len(data)), *data)
# Decode to str so the base64 data embeds cleanly in the JSON payload
encoded = base64.b64encode(image).decode("ascii")

command = """aws lambda invoke --function-name %s --payload '{"data":"%s"}' %s""" % (
    sys.argv[1],
    encoded,
    sys.argv[2],
)

subprocess.call(shlex.split(command))
Run the above script:
$ python request.py demo output.txt
You should get the following response in output.txt (your label may vary):
$ cat output.txt
{"label": 40}
Congratulations, you have deployed a ResNet18 classifier using only AWS Lambda in 6 steps!
Contributing
If you find an issue or would like to see some functionality (or implement it), please open a new Issue or create a Pull Request.
Footnotes
[1] Support for the latest version of its main DL framework, or main frameworks if multiple are supported.
[2] The shape of project dependencies can be easily customized. In torchlambda's case it is a customizable build of libtorch and the AWS C++ SDK.
[3] Necessary size of code and dependencies needed to deploy a model.
[4] Based on the Dockerfile size.