
Minimalistic & easy deployment of PyTorch models on AWS Lambda with C++





torchlambda is a tool designed to deploy PyTorch models on Amazon's AWS Lambda cloud service using the AWS SDK for C++ and a custom C++ runtime.

Thanks to static compilation, the source package is only ~30 MB with all necessary dependencies. This allows users to ship their models as AWS Lambda layers, so no additional services like Amazon S3 are required.

Comparison with other deployment tools

Improve this comparison's reliability via a Pull Request, thanks. Also show the projects below some love by visiting them (just click on the name).

| Trait / Tool | torchlambda | fastai Lambda | KubeFlow | Tensorflow Serving |
| --- | --- | --- | --- | --- |
| Autoscaling | :heavy_check_mark: | :heavy_check_mark: | with Kubernetes | with Kubernetes |
| Light/Heavy load | Light | Light | Heavy/Both | Both |
| GPU Support | :x: | :x: | :heavy_check_mark: | :heavy_check_mark: |
| Serverless | :heavy_check_mark: | :heavy_check_mark: | :x: | :x: |
| Required services | AWS Lambda | AWS Lambda, AWS S3 | Kubernetes cluster & cloud provider | Deployable in various settings |
| Multiple frameworks | :x: | :x: | :heavy_check_mark: | :x: |
| Latest framework [1] | :heavy_check_mark: | :x: | :x: | :heavy_check_mark: |
| Version (higher is more mature) | 0.1.0 | N/A | 1.0 | 2.1.0 |
| Customizable dependencies [2] | :heavy_check_mark: | :x: | :x: | :x: |
| Deployment size [3] | ~30 MB | +1 GB | N/A | ~67 MB [4] |

Installation

  • Docker version 17.05 or later is required. See the official Docker documentation for installation instructions for your operating system

  • Install torchlambda through pip (Python 3.6 or higher is required). You can also install it within conda or another virtual environment of your choice. The following command should be sufficient:

    $ pip install --user torchlambda
    

Example deploy

Here is an example deployment of a ResNet18 model using torchlambda. Run all commands and create all necessary files in the same directory.

1. Create model to deploy

Below is the code (model.py) to load ResNet18 from torchvision and compile it to TorchScript:

import torch
import torchvision

model = torchvision.models.resnet18()

# Trace with a smaller example input than the standard 3 x 224 x 224
example = torch.randn(1, 3, 64, 64)
script_model = torch.jit.trace(model, example)

script_model.save("model.ptc")

Invoke it from the CLI:

$ python model.py

You should get model.ptc in your current working directory.
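Optionally, you can sanity-check the exported model before deploying it. A minimal sketch (the check itself is our own addition, not part of torchlambda):

import torch

# Load the TorchScript module back and run a dummy forward pass
loaded = torch.jit.load("model.ptc")
with torch.no_grad():
    output = loaded(torch.randn(1, 3, 64, 64))

print(output.shape)  # torch.Size([1, 1000]) for ResNet18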

2. Create deployment code with torchlambda scheme

Writing C++ code can be hard, so torchlambda provides a basic scheme in which all you have to do is provide the appropriate shapes for inference (either passed with the request or hard-coded).

Issue the following command:

$ torchlambda scheme

You should see a new folder called torchlambda in your current directory. The contents of torchlambda/main.cpp are what you would usually modify.

Only a few things usually change (e.g. the input shape or required fields).

The generated C++ scheme code (barely 70 lines) is shown below.

It should be quite easy to follow. Check the comments if in doubt, request improvements in Issues, or make a Pull Request if you have an idea to make this section even easier.

#include <algorithm>
#include <iterator>

#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>

#include <aws/lambda-runtime/runtime.h>

#include <torch/script.h>
#include <torch/torch.h>

static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
        const Aws::Utils::Base64::Base64 &transformer,
        const aws::lambda_runtime::invocation_request &request) {

  /* Name of field containing base64 encoded data */
  const Aws::String data_field{"data"};

  /*!
   *
   *               PARSE AND VALIDATE REQUEST
   *
   */

  const auto json = Aws::Utils::Json::JsonValue{request.payload};
  if (!json.WasParseSuccessful())
    return aws::lambda_runtime::invocation_response::failure(
        "Failed to parse input JSON file.", "InvalidJSON");

  const auto json_view = json.View();
  if (!json_view.KeyExists(data_field))
    return aws::lambda_runtime::invocation_response::failure(
        "Required data was not provided.", "InvalidJSON");

  /*!
   *
   *            LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
   *
   */

  const auto base64_data = json_view.GetString(data_field);
  Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);

  /* Copy data and move it to tensor (is there an easier way?) */
  /* Array holds channels * width * height, input your values below */
  float data[3 * 64 * 64];
  std::copy(decoded.GetUnderlyingData(),
            decoded.GetUnderlyingData() + decoded.GetLength(), data);

  torch::Tensor tensor =
      torch::from_blob(data,
                       {
                           static_cast<long int>(decoded.GetLength()),
                       })
          /* Input your data shape for reshape including batch */
          .reshape({1, 3, 64, 64})
          .toType(torch::kFloat32) /
      255.0;

  /* Normalize tensor with ImageNet mean and stddev */
  torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
      {0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);

  /*!
   *
   *              MAKE INFERENCE AND RETURN JSON RESPONSE
   *
   */

  /* {} will be casted to std::vector<torch::jit::IValue> under the hood */
  auto output = module.forward({normalized_tensor}).toTensor();
  const int label = torch::argmax(output).item<int>();

  /* Return JSON with field label containing predictions*/
  return aws::lambda_runtime::invocation_response::success(
      Aws::Utils::Json::JsonValue{}
          .WithInteger("label", label)
          .View()
          .WriteCompact(),
      "application/json");
}

int main() {
  /* Inference doesn't need gradient, let's turn it off */
  torch::NoGradGuard no_grad_guard{};

  /* Change name/path to your model if you so desire */
  /* Layers are unpacked to /opt, so you are better off keeping it */
  constexpr auto model_path = "/opt/model.ptc";

  /* You could add some checks whether the module is loaded correctly */
  torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);

  module.eval();

  /*!
   *
   *                        INITIALIZE AWS SDK
   *                    & REGISTER REQUEST HANDLER
   *
   */

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    const Aws::Utils::Base64::Base64 transformer{};
    const auto handler_fn =
        [&module,
         &transformer](const aws::lambda_runtime::invocation_request &request) {
          return handler(module, transformer, request);
        };
    aws::lambda_runtime::run_handler(handler_fn);
  }
  Aws::ShutdownAPI(options);
  return 0;
}

3. Package your source with torchlambda deploy

Now we have our model and source code; it's time to package everything as an AWS Lambda-ready .zip file.

Run from the command line:

$ torchlambda deploy ./torchlambda --compilation "-Wall -O2"

The above creates a torchlambda.zip file ready for deployment. Notice --compilation, through which you can pass any C++ compilation flags (here -O2 for performance optimization).

There are many more things you can set during this step; check torchlambda deploy --help for the full list of available options.

Oh, and don't worry about OS compatibility: this code is compiled on Amazon's Linux AMI, so if it works there it will work "up there".

4. Package your model as AWS Lambda Layer

As the source package above is roughly 30 MB in size (AWS Lambda has a 250 MB limit), we can ship our model as an additional layer. To create it, run:

$ torchlambda model ./model.ptc --destination "model.zip"

You will receive the model.zip layer in your current working directory (--destination is optional).
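Since Lambda layers are unpacked under /opt at runtime (which is why the scheme loads /opt/model.ptc), it is worth confirming that model.ptc sits at the root of the archive. A quick check using Python's standard library (our own addition, not a torchlambda command):

import zipfile

# model.ptc should be listed at the archive root so it lands at /opt/model.ptc
with zipfile.ZipFile("model.zip") as archive:
    print(archive.namelist())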

5. Deploy to AWS Lambda

From here on you can mostly follow the tutorial for AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; if not, see Configuring the AWS CLI.

5.1 Create trust policy JSON file

First create the following trust policy JSON file:

$ cat trust-policy.json
{
 "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

5.2 Create IAM role from the trust policy

Run from your shell:

$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json

Note down the role Arn returned by that command; it will be needed in the next step.

5.3 Create AWS Lambda function

Create deployment function with the script below:

$ aws lambda create-function --function-name demo \
  --role <specify role arn from step 5.2 here> \
  --runtime provided --timeout 30 --memory-size 1024 \
  --handler torchlambda --zip-file fileb://torchlambda.zip

5.4 Create AWS Layer containing model

We already have our ResNet18 model packaged appropriately; run the following:

$ aws lambda publish-layer-version --layer-name model \
  --description "Resnet18 neural network model" \
  --license-info "MIT" \
  --zip-file fileb://model.zip

Save the LayerVersionArn (as you saved the role Arn in step 5.2) and insert it below to add this layer to the function from step 5.3:

$ aws lambda update-function-configuration \
  --function-name demo \
  --layers <specify layer arn from above here>

6. Encode image with base64 and request your function

The following script (save it as request.py) sends an image-like tensor, encoded with base64, via aws lambda invoke to test our function.

import base64
import shlex
import struct
import subprocess
import sys

import numpy as np

# Random image-like data
data = np.random.randint(low=0, high=255, size=(3, 64, 64)).flatten().tolist()
# Pack as unsigned bytes and base64-encode for transport in the JSON payload
image = struct.pack("<{}B".format(len(data)), *data)
encoded = base64.b64encode(image).decode("ascii")
command = """aws lambda invoke --function-name %s --payload '{"data":"%s"}' %s""" % (
    sys.argv[1],
    encoded,
    sys.argv[2],
)

subprocess.call(shlex.split(command))
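If you prefer to stay in Python instead of shelling out to the AWS CLI, here is a roughly equivalent sketch using boto3 (an assumption on our side: boto3 must be installed and your credentials configured; torchlambda itself does not use it):

import base64
import json

import boto3
import numpy as np

# Same random image-like payload as in request.py above
data = np.random.randint(low=0, high=255, size=(3, 64, 64), dtype=np.uint8)
encoded = base64.b64encode(data.tobytes()).decode("ascii")

# Invoke the Lambda function directly and print its JSON response
client = boto3.client("lambda")
response = client.invoke(
    FunctionName="demo",
    Payload=json.dumps({"data": encoded}),
)
print(json.loads(response["Payload"].read()))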

Run request.py:

$ python request.py demo output.txt

You should get the following response in output.txt (your label may vary):

$ cat output.txt
  {"label": 40}

Congratulations, you have deployed a ResNet18 classifier using only AWS Lambda in 6 steps!

Contributing

If you find an issue or would like to see some functionality added (or implement it yourself), please open a new Issue or create a Pull Request.

Footnotes

1. Support for the latest version of its main DL framework, or of its main frameworks if multiple are supported

2. Project dependencies can be easily customized. In torchlambda's case this means customizable builds of libtorch and the AWS C++ SDK

3. The size of code and dependencies necessary to deploy a model

4. Based on Dockerfile size
