
Open source library for running inference workloads with Hugging Face Deep Learning Containers on Amazon SageMaker.


SageMaker Hugging Face Inference Toolkit


SageMaker Hugging Face Inference Toolkit is an open-source library for serving 🤗 Transformers models on Amazon SageMaker. This library provides default pre-processing, prediction, and post-processing steps for certain 🤗 Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests.

For Training, see Run training on Amazon SageMaker.

For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers.

For information on running Hugging Face jobs on Amazon SageMaker, please refer to the 🤗 Transformers documentation.

For notebook examples: SageMaker Notebook Examples.


💻 Getting Started with 🤗 Inference Toolkit


Install the Amazon SageMaker Python SDK:

pip install sagemaker --upgrade

Create an Amazon SageMaker endpoint with a trained model.

from sagemaker.huggingface import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.6',
    pytorch_version='1.7',
    py_version='py36',
    model_data='s3://my-trained-model/artifacts/model.tar.gz',
    role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

Create an Amazon SageMaker endpoint with a model from the 🤗 Hub.
Note: this is an experimental feature; the model is loaded after the endpoint is created. Not all SageMaker features are supported, e.g. multi-model endpoints (MME).

from sagemaker.huggingface import HuggingFaceModel
# Hub Model configuration. https://huggingface.co/models
hub = {
  'HF_MODEL_ID':'distilbert-base-uncased-distilled-squad',
  'HF_TASK':'question-answering'
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.6',
    pytorch_version='1.7',
    py_version='py36',
    env=hub,
    role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
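Once the endpoint is up, it can be invoked with a task-specific payload. A minimal sketch, assuming the value returned by `deploy()` above has been stored as `predictor`:

```python
# Payload shape for the question-answering task configured above
data = {
    "inputs": {
        "question": "Which library serves the model?",
        "context": "The SageMaker Hugging Face Inference Toolkit serves the model.",
    }
}

# Against a live endpoint this would be:
#   result = predictor.predict(data)
#   # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```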

🛠️ Environment variables

The SageMaker Hugging Face Inference Toolkit implements various additional environment variables to simplify your deployment experience. A full list of environment variables is given below.

HF_TASK

The HF_TASK environment variable defines the task for the used 🤗 Transformers pipeline. A full list of tasks can be found here.

HF_TASK="question-answering"

HF_MODEL_ID

The HF_MODEL_ID environment variable defines the model ID, which will be automatically loaded from huggingface.co/models when creating your SageMaker Endpoint. The 🤗 Hub provides more than 10,000 models, all available through this environment variable.

HF_MODEL_ID="distilbert-base-uncased-finetuned-sst-2-english"

HF_MODEL_REVISION

HF_MODEL_REVISION is an extension to HF_MODEL_ID that allows you to pin a specific revision of the model, to make sure you always load the same model on your SageMaker Endpoint.

HF_MODEL_REVISION="03b4d196c19d0a73c7e0322684e97db1ec397613"

HF_API_TOKEN

The HF_API_TOKEN environment variable defines your Hugging Face authorization token. The HF_API_TOKEN is used as an HTTP bearer authorization for remote files, such as private models. You can find your token on your settings page.

HF_API_TOKEN="api_XXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
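Taken together, these variables fully describe a Hub deployment. The sketch below shows how they combine; the `hub_settings` helper is hypothetical, for illustration only, and not part of the toolkit's public API:

```python
def hub_settings(env):
    """Collect the Hub-related settings from an environment mapping.
    The variable names come from the docs above; this helper itself
    is illustrative, not part of the toolkit."""
    return {
        "model_id": env.get("HF_MODEL_ID"),
        "revision": env.get("HF_MODEL_REVISION"),  # None -> latest revision
        "task": env.get("HF_TASK"),
        "api_token": env.get("HF_API_TOKEN"),      # only needed for private models
    }

# In a real endpoint these would come from os.environ
settings = hub_settings({
    "HF_MODEL_ID": "distilbert-base-uncased-distilled-squad",
    "HF_TASK": "question-answering",
})
```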

🧑🏻‍💻 User defined code/modules

The Hugging Face Inference Toolkit allows users to override the default methods of the HuggingFaceHandlerService. To do so, you need to create a folder named code/ with an inference.py file in it. For example:

model.tar.gz/
|- pytorch_model.bin
|- ....
|- code/
  |- inference.py
  |- requirements.txt 
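The layout above can be assembled with Python's standard tarfile module. A sketch using placeholder files (swap in your real artifacts); note that members must sit at the top level of the archive:

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
os.makedirs(os.path.join(workdir, "code"))

# Placeholder files standing in for the real artifacts
for name in ("pytorch_model.bin", "code/inference.py", "code/requirements.txt"):
    open(os.path.join(workdir, name), "w").close()

archive = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    # Use arcname so the members land at the top level of the tarball
    tar.add(os.path.join(workdir, "pytorch_model.bin"), arcname="pytorch_model.bin")
    tar.add(os.path.join(workdir, "code"), arcname="code")
```

The resulting model.tar.gz can then be uploaded to S3 and referenced via model_data.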

In this example, pytorch_model.bin is the model file saved from training, inference.py is the custom inference module, and requirements.txt is a requirements file to add additional dependencies. The custom module can override the following methods:

  • model_fn(model_dir): overrides the default method for loading the model. The return value model is used in predict() for predictions. It receives one argument, model_dir, the path to your unzipped model.tar.gz.
  • transform_fn(model, data, content_type, accept_type): overrides the default transform function with a custom implementation. You then have to implement the preprocess, predict and postprocess steps in transform_fn yourself. NOTE: This method can't be combined with input_fn, predict_fn or output_fn mentioned below.
  • input_fn(input_data, content_type): overrides the default method for preprocessing. The return value data is used in predict() for predictions. The inputs are input_data, the raw body of your request, and content_type, the content type from the request header.
  • predict_fn(processed_data, model): overrides the default method for predictions. The return value predictions is used in postprocess(). The input is processed_data, the result of preprocess().
  • output_fn(prediction, accept): overrides the default method for postprocessing. The return value result is the response of your request (e.g. JSON). The inputs are predictions, the result of predict(), and accept, the accept type from the HTTP request, e.g. application/json.
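A minimal code/inference.py implementing these hooks might look like the following. This is a sketch: the hook names and signatures come from the list above, while the bodies (JSON handling, the text-classification pipeline) are illustrative assumptions:

```python
import json


def model_fn(model_dir):
    # Load the model from the unpacked model.tar.gz. Importing here keeps
    # the module importable without transformers installed; a real handler
    # would typically return a transformers pipeline.
    from transformers import pipeline
    return pipeline("text-classification", model=model_dir)


def input_fn(input_data, content_type):
    # Deserialize the raw request body based on its content type
    if content_type == "application/json":
        return json.loads(input_data)
    raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(data, model):
    # Run the loaded model on the deserialized payload
    return model(data["inputs"])


def output_fn(prediction, accept):
    # Serialize the prediction into the requested response format
    if accept == "application/json":
        return json.dumps(prediction)
    raise ValueError(f"Unsupported accept type: {accept}")
```

Because input_fn, predict_fn and output_fn are all defined, transform_fn must not be; the toolkit chains the three defaults-overrides for you.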

🤝 Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.


📜 License

SageMaker Hugging Face Inference Toolkit is licensed under the Apache 2.0 License.
