
Model serving at scale

Cortex makes it simple to deploy machine learning models in production.

Deploy

  • Deploy TensorFlow, PyTorch, ONNX, scikit-learn, and other models.
  • Define preprocessing and postprocessing steps in Python (see the sketch after this list).
  • Configure APIs as realtime or batch.
  • Deploy multiple models per API.
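
For example, each API's request handling is defined in a Python Predictor class. Here is a minimal sketch, assuming Cortex's PythonPredictor interface; the model-loading details are illustrative placeholders, not part of this page:

# predictor.py -- a minimal sketch of a Python predictor
class PythonPredictor:
    def __init__(self, config):
        # runs once at API startup; load the model here
        # (`config` holds the predictor config from cortex.yaml)
        from transformers import pipeline  # illustrative dependency
        self.generator = pipeline(task="text-generation")

    def predict(self, payload):
        # runs on every request; `payload` is the parsed request body
        return self.generator(payload["text"])[0]["generated_text"]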

Manage

  • Monitor API performance and track predictions.
  • Update APIs with no downtime.
  • Stream logs from APIs.
  • Perform A/B tests.

Scale

  • Test locally, scale on your AWS account.
  • Autoscale to handle production traffic.
  • Reduce cost with spot instances.

documentation · tutorial · examples · chat with us

Install the CLI

pip install cortex

You must have Docker installed to run Cortex locally or to create a cluster on AWS.

Deploy an example

# clone the Cortex repository
git clone -b master https://github.com/cortexlabs/cortex.git

Using the CLI

# deploy the model as a realtime api
cortex deploy cortex/examples/pytorch/text-generator/cortex.yaml

# view the status of the api
cortex get --watch

# stream logs from the api
cortex logs text-generator

# get the api's endpoint
cortex get text-generator

# generate text
curl <API endpoint> \
  -X POST -H "Content-Type: application/json" \
  -d '{"text": "machine learning is"}' \

# delete the api
cortex delete text-generator

In Python

import cortex
import requests

local_client = cortex.client("local")

# deploy the model as a realtime api and wait for it to become active
deployments = local_client.deploy("cortex/examples/pytorch/text-generator/cortex.yaml", wait=True)

# get the api's endpoint
url = deployments[0]["api"]["endpoint"]

# generate text
print(requests.post(url, json={"text": "machine learning is"}).text)

# delete the api
local_client.delete_api("text-generator")

Running at scale on AWS

Run the command below to create a cluster with basic configuration, or see the cluster configuration docs to learn how to customize your cluster with cluster.yaml.

See EC2 instances for an overview of several EC2 instance types. To use GPU nodes, you may need to subscribe to the EKS-optimized AMI with GPU Support and file an AWS support ticket to increase the limit for your desired instance type.

# create a Cortex cluster on your AWS account
cortex cluster up

# set the default CLI environment (optional)
cortex env default aws

You can now run the same commands shown above to deploy the text generator to AWS (if you didn't set the default CLI environment, add --env aws to the cortex commands).
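
The Python client works the same way against a cluster. Here is a minimal sketch, assuming an aws environment was configured by cortex cluster up; the usage mirrors the local example above:

import cortex
import requests

# point the client at the aws environment instead of local
aws_client = cortex.client("aws")

# deploy the same example to the cluster and wait for it to become active
deployments = aws_client.deploy("cortex/examples/pytorch/text-generator/cortex.yaml", wait=True)

# query the deployed api
url = deployments[0]["api"]["endpoint"]
print(requests.post(url, json={"text": "machine learning is"}).text)

# delete the api
aws_client.delete_api("text-generator")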
