Minimalistic & easy deployment of PyTorch models on AWS Lambda with C++


torchlambda is a tool to deploy PyTorch models on Amazon's AWS Lambda using the AWS SDK for C++ and a custom C++ runtime.

Thanks to static compilation of dependencies, the package size is only ~30 MB including everything necessary. This allows users to pass their models as AWS Lambda layers, hence no other services like Amazon S3 are required.

torchlambda is always fully up to date thanks to continuous deployment run at 03:00 every day. This README provides only a basic introduction; for the full picture please see the documentation and CLI --help.

Comparison with other deployment tools

Improve this comparison's reliability via a Pull Request, thanks. Also show the folks below some love by visiting their projects (just click on the name).

| Trait / Tool                   | torchlambda        | fastai Lambda      | KubeFlow                            | Tensorflow Serving             |
|--------------------------------|--------------------|--------------------|-------------------------------------|--------------------------------|
| Autoscaling                    | :heavy_check_mark: | :heavy_check_mark: | with Kubernetes                     | with Kubernetes                |
| Light/Heavy load               | Light              | Light              | Heavy/Both                          | Both                           |
| GPU Support                    | :x:                | :x:                | :heavy_check_mark:                  | :heavy_check_mark:             |
| Serverless                     | :heavy_check_mark: | :heavy_check_mark: | :x:                                 | :x:                            |
| Required services              | AWS Lambda         | AWS Lambda, AWS S3 | Kubernetes cluster & cloud provider | Deployable in various settings |
| Multiple frameworks            | :x:                | :x:                | :heavy_check_mark:                  | :x:                            |
| Latest framework¹              | :heavy_check_mark: | :x:                | :x:                                 | :heavy_check_mark:             |
| Version (higher = more mature) | CD                 | N/A                | 1.0                                 | 2.1.0                          |
| Customizable dependencies²     | :heavy_check_mark: | :x:                | :x:                                 | :x:                            |
| Deployment size³               | ~30 MB             | +1 GB              | N/A                                 | ~67 MB⁴                        |

Installation

  • Docker version 17.05 or newer is required. See the official Docker documentation for installation instructions specific to your operating system

  • Install torchlambda through pip; Python 3.6 or higher is required. You can also install it within conda or another virtual environment of your choice. The following command should be sufficient:

    $ pip install --user torchlambda
    

torchlambda provides pre-built deployment images tagged after PyTorch versions and rebuilt daily. The following images are currently available:

  • szymonmaszke/torchlambda:latest (head of current PyTorch master branch)
  • szymonmaszke/torchlambda:1.4.0

For more info refer to torchlambda build documentation.

Example deploy

Here is an example of ResNet18 model deployment using torchlambda. Run all commands and create all necessary files in the same directory.

1. Create model to deploy

Below is the code (model.py) to load ResNet18 from torchvision and compile it to TorchScript:

import torch
import torchvision

model = torchvision.models.resnet18()

# Smaller example input used to trace the model
example = torch.randn(1, 3, 64, 64)
script_model = torch.jit.trace(model, example)

script_model.save("model.ptc")

Invoke it from CLI:

$ python model.py

You should get model.ptc in your current working directory.
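
Before moving on, you can sanity-check the saved artifact. A minimal sketch (assuming model.ptc is in the current directory):

import torch

# Load the compiled TorchScript module and run a dummy forward pass
module = torch.jit.load("model.ptc")
output = module(torch.randn(1, 3, 64, 64))
print(output.shape)  # torch.Size([1, 1000]) for ResNet18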

2. Create settings

torchlambda uses C++ to deploy models, hence it might be harder for end users to provide the necessary source code.

To alleviate this, easy-to-understand YAML settings can be used to define inputs, outputs and other elements of the neural network and deployment.

Please run the following:

$ torchlambda settings

This command will generate a torchlambda.yaml file with all available settings for you to modify according to your needs. You can see all of them with a short description below.

Click here to check generated YAML settings

---
grad: False # Turn gradient on/off
validate_json: true # Validate correctness of JSON parsing
data: data # Name of data field passed as JSON
validate_data: true # Validate correctness of data from request
model: /opt/model.ptc # Path to model to load
inputs: [1, 3, width, height] # Shape of input tensor (can be name of field)
validate_inputs: true # Validate correctness of input fields (if any)
cast: float # Type to which base64 encoded tensor will be casted
divide: 255 # Value by which it will be divided
normalize: # Whether to normalize the tensor
  means: [0.485, 0.456, 0.406] # Using those means
  stddevs: [0.229, 0.224, 0.225] # And those standard deviations
return: # Finally return something in JSON
  output: # Unmodified output from neural network
    type: double # Casted to double type (AWS SDK compatible)
    name: output # Name of the field where value(s) will be returned
    item: false # Use true when returning a single value; neural networks usually return more (an array)
  result: # Return another field result by modifying output
    operations: argmax # Apply argmax (more operations can be specified as list)
    arguments: 1 # Over first dimension (more or no arguments can be specified)
    type: int # Type returned will be integer
    name: result # Named result
    item: true # It will be a single item
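
The settings above describe the whole inference pipeline. As a rough illustration (not the generated source, which is C++ and additionally handles base64 decoding and request validation), the handler behaves like this Python sketch:

import torch

def sketch_pipeline(module, tensor):
    """Rough Python equivalent of the pipeline described by the settings."""
    tensor = tensor.to(torch.float32) / 255  # cast: float, divide: 255
    mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
    stddev = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
    tensor = (tensor - mean) / stddev  # normalize: means & stddevs
    with torch.no_grad():  # grad: False
        output = module(tensor)
    return {  # return: output and result fields
        "output": output.double().flatten().tolist(),
        "result": int(torch.argmax(output, dim=1).item()),  # operations: argmax
    }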
 

Many fields already have sensible defaults (see the YAML settings file reference), hence they will be left out for now. In our case we will only define the bare minimum:

---
inputs: [1, channels, width, height]
return:
  result:
    operations: argmax
    type: int
    name: label
    item: true
  • inputs: [1, channels, width, height] - tensor with batch size always equal to 1 (static), a variable number of channels and variable width and height. The last three elements will be passed as int fields in the JSON request and named accordingly (channels, width and height); see the example request below.
  • return - return the output of the network modified by the argmax operation, which creates result. The returned type will be int, and the JSON field name (torchlambda always returns JSON) will be label. argmax over the tensor will create a single value (by default the operation is applied over all dimensions), hence item: true is specified.
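
With these settings, the deployed function will expect a JSON request shaped as follows (the data value below is a placeholder for the actual base64-encoded image bytes):

{
  "data": "<base64-encoded image bytes>",
  "channels": 3,
  "width": 64,
  "height": 64
}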

Save the YAML settings above in a torchlambda.yaml file.

3. Create deployment code with torchlambda scheme

Now that we have our settings, we can generate C++ code based on them. Run the following:

$ torchlambda scheme --yaml torchlambda.yaml

You should see a new folder called torchlambda in your current directory with a main.cpp file inside.

If you don't care about C++ you can move on to the next section. If you want to know a little more (or have a custom application), carry on reading.

If YAML settings cannot fulfil your needs, torchlambda offers a basic C++ scheme you can start your deployment code from.

Run this simple command (no settings needed in this case):

$ torchlambda scheme --destination custom_deployment

This time you can find a new folder custom_deployment with main.cpp inside. This file contains minimal, reasonable and working C++ code that one should be able to follow easily. It does exactly the same thing (except dynamic shapes) as we did above via settings, but this time the file is readable (the previous main.cpp might be quite hard to grasp as it is autogenerated).

Click here to check generated code

#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>

#include <aws/lambda-runtime/runtime.h>

#include <torch/script.h>
#include <torch/torch.h>

/*!
 *
 *                    HANDLE REQUEST
 *
 */

static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
        const Aws::Utils::Base64::Base64 &transformer,
        const aws::lambda_runtime::invocation_request &request) {

  const Aws::String data_field{"data"};

  /*!
   *
   *              PARSE AND VALIDATE REQUEST
   *
   */

  const auto json = Aws::Utils::Json::JsonValue{request.payload};
  if (!json.WasParseSuccessful())
    return aws::lambda_runtime::invocation_response::failure(
        "Failed to parse input JSON file.", "InvalidJSON");

  const auto json_view = json.View();
  if (!json_view.KeyExists(data_field))
    return aws::lambda_runtime::invocation_response::failure(
        "Required data was not provided.", "InvalidJSON");

  /*!
   *
   *          LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
   *
   */

  const auto base64_data = json_view.GetString(data_field);
  Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);

  torch::Tensor tensor =
      torch::from_blob(decoded.GetUnderlyingData(),
                       {
                           static_cast<long>(decoded.GetLength()),
                       },
                       torch::kUInt8)
          .reshape({1, 3, 64, 64})
          .toType(torch::kFloat32) /
      255.0;

  torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
      {0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);

  /*!
   *
   *                      MAKE INFERENCE
   *
   */

  auto output = module.forward({normalized_tensor}).toTensor();
  const int label = torch::argmax(output).item<int>();

  /*!
   *
   *                       RETURN JSON
   *
   */

  return aws::lambda_runtime::invocation_response::success(
      Aws::Utils::Json::JsonValue{}
          .WithInteger("label", label)
          .View()
          .WriteCompact(),
      "application/json");
}

int main() {
  /*!
   *
   *                        LOAD MODEL ON CPU
   *                    & SET IT TO EVALUATION MODE
   *
   */

  torch::NoGradGuard no_grad_guard{};
  constexpr auto model_path = "/opt/model.ptc";

  torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
  module.eval();

  /*!
   *
   *                        INITIALIZE AWS SDK
   *                    & REGISTER REQUEST HANDLER
   *
   */

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    const Aws::Utils::Base64::Base64 transformer{};
    const auto handler_fn =
        [&module,
         &transformer](const aws::lambda_runtime::invocation_request &request) {
          return handler(module, transformer, request);
        };
    aws::lambda_runtime::run_handler(handler_fn);
  }
  Aws::ShutdownAPI(options);
  return 0;
}

4. Package your source with torchlambda deploy

Now we have our model and source code. It's time to deploy it as an AWS Lambda-ready .zip package.

Run from command line:

$ torchlambda deploy ./torchlambda --compilation "-Wall -O2"

The above will create a torchlambda.zip file ready for deployment. Notice --compilation, where you can pass any C++ compilation flags (here -O2 for increased performance).

There are many more things one could set during this step; check torchlambda deploy --help for the full list of available options.

5. Package your model as AWS Lambda Layer

As the above deployment package is roughly 30 MB in size (AWS Lambda has a 250 MB limit), we can ship our model as an additional layer. To create it run:

$ torchlambda model ./model.ptc --destination "model.zip"

You will receive a model.zip layer in your current working directory (--destination is optional).
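
AWS Lambda extracts layers under /opt at runtime, which is why the settings and code above load the model from /opt/model.ptc. If you want to verify what goes into the layer, a quick sketch:

import zipfile

# List the contents of the generated layer archive
with zipfile.ZipFile("model.zip") as archive:
    print(archive.namelist())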

6. Deploy to AWS Lambda

From now on you can mostly follow the tutorial from AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; otherwise check Configuring the AWS CLI (or see the Test Lambda deployment locally tutorial).

6.1 Create trust policy JSON file

First create the following trust policy JSON file:

$ cat trust-policy.json
{
 "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

6.2 Create IAM role from the trust policy JSON file

Run from your shell:

$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json

Note down the role Arn returned after running that command; it will be needed in the next step.

6.3 Create AWS Lambda function

Create deployment function with the script below:

$ aws lambda create-function --function-name demo \
  --role <specify role arn from step 6.2 here> \
  --runtime provided --timeout 30 --memory-size 1024 \
  --handler torchlambda --zip-file fileb://torchlambda.zip

6.4 Create AWS Layer containing model

We already have our ResNet18 packed appropriately, so run the following to make a layer from it:

$ aws lambda publish-layer-version --layer-name model \
  --description "Resnet18 neural network model" \
  --license-info "MIT" \
  --zip-file fileb://model.zip

Please save the LayerVersionArn just like in 6.2 and insert it below to add this layer to the function from the previous step:

$ aws lambda update-function-configuration \
  --function-name demo \
  --layers <specify layer arn from above here>

This completes the whole deployment; our model is now ready to receive incoming requests.

7. Encode image with base64 and make a request

The following script (save it as request.py) will send an image-like tensor encoded with base64 via aws lambda invoke to test our function.

import argparse
import base64
import shlex
import struct
import subprocess

import numpy as np


def parse_arguments():
    parser = argparse.ArgumentParser(formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument("function_name")
    parser.add_argument("channels", type=int)
    parser.add_argument("width", type=int)
    parser.add_argument("height", type=int)
    parser.add_argument("output")

    return parser.parse_args()


def request(args):
    # Create a random image-like tensor and flatten it to send as byte payload
    random_image = (
        np.random.randint(
            low=0, high=255, size=(1, args.channels, args.width, args.height)
        )
        .flatten()
        .tolist()
    )
    # Pack as raw unsigned bytes for AWS Lambda compatibility
    image = struct.pack("<{}B".format(len(random_image)), *random_image)
    # Encode as base64 string
    encoded = base64.b64encode(image).decode()
    command = (
        """aws lambda invoke --function-name {} --payload """
        """'{{"data": "{}", "channels": {}, "width": {}, "height": {} }}' {}""".format(
            args.function_name,
            encoded,
            args.channels,
            args.width,
            args.height,
            args.output,
        )
    )

    subprocess.call(shlex.split(command))


if __name__ == "__main__":
    args = parse_arguments()
    request(args)

Run the above script:

$ python request.py demo 3 64 64 output.json

You should get the following response in output.json (your label may vary):

$ cat output.json
  {"label": 40}

Congratulations, you have deployed ResNet18 classifier using only AWS Lambda in 7 steps!

Contributing

If you find an issue or would like to see some functionality (or implement it), please open a new Issue or create a Pull Request.

Footnotes

1. Support for the latest version of its main DL framework, or main frameworks if multiple are supported

2. Project dependencies' shape can be easily customized. In torchlambda's case these are customizable builds of libtorch and the AWS C++ SDK

3. Necessary size of code and dependencies to deploy model

4. Based on Dockerfile size
