TorchServe Python Client

Install

pip install torchserve_client

Usage

Using torchserve_client is a breeze! To get started, simply initialize a TorchServeClient object as shown below:

from torchserve_client import TorchServeClient

# Initialize the TorchServeClient object
ts_client = TorchServeClient()
ts_client
TorchServeClient(base_url=http://localhost, management_port=8081, inference_port=8080)

If you wish to customize the base URL, management port, or inference port of your TorchServe server, you can pass them as arguments during initialization:

from torchserve_client import TorchServeClient

# Customize the base URL, management port, and inference port
ts_client = TorchServeClient(base_url='http://your-torchserve-server.com', 
                             management_port=8081, inference_port=8080)
ts_client
TorchServeClient(base_url=http://your-torchserve-server.com, management_port=8081, inference_port=8080)

Alternatively, if you don’t provide a base URL during initialization, the client will check for the presence of TORCHSERVE_URL in the environment variables. If the variable is not found, it will gracefully fall back to using localhost as the default. This way, you have the flexibility to tailor your TorchServeClient to your needs effortlessly! Happy serving! 🍿🔥
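For instance, here is a minimal sketch of the environment-variable fallback (using the standard os.environ; the server URL is illustrative):

import os
from torchserve_client import TorchServeClient

# Set TORCHSERVE_URL instead of passing base_url explicitly
os.environ['TORCHSERVE_URL'] = 'http://your-torchserve-server.com'

ts_client = TorchServeClient()  # reads TORCHSERVE_URL, else falls back to localhost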

Management APIs

With TorchServe Management APIs, you can effortlessly manage your models at runtime. Here’s a quick rundown of the actions you can perform using our TorchServeClient SDK:

  1. Register a Model: Easily register a model with TorchServe using the ts_client.management.register_model() method.
ts_client.management.register_model('https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar')
{'status': 'Model "squeezenet1_1" Version: 1.0 registered with 0 initial workers. Use scale workers API to add workers for the model.'}
  2. Increase/Decrease Workers: Scale the number of workers for a specific model using ts_client.management.scale_workers(). Note that the call is asynchronous (a readiness-polling sketch appears after this list).
ts_client.management.scale_workers('squeezenet1_1', min_worker=1, max_worker=2)
{'status': 'Processing worker updates...'}
  3. Model Status: Curious about a model’s status? Fetch all the details you need using ts_client.management.describe_model().
ts_client.management.describe_model('squeezenet1_1')
[{'modelName': 'squeezenet1_1',
  'modelVersion': '1.0',
  'modelUrl': 'https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar',
  'runtime': 'python',
  'minWorkers': 1,
  'maxWorkers': 1,
  'batchSize': 1,
  'maxBatchDelay': 100,
  'loadedAtStartup': False,
  'workers': [{'id': '9001',
    'startTime': '2023-07-17T22:55:40.155Z',
    'status': 'UNLOADING',
    'memoryUsage': 0,
    'pid': -1,
    'gpu': False,
    'gpuUsage': 'N/A'}]}]
  4. List Registered Models: Quickly fetch a list of all registered models using ts_client.management.list_models().
# List all models
ts_client.management.list_models()
{'models': [{'modelName': 'squeezenet1_1',
   'modelUrl': 'https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar'}]}
  5. Set Default Model Version: Ensure the desired version of a model is the default choice with the ts_client.management.set_default_version() method.
ts_client.management.set_default_version('squeezenet1_1', '1.0')
{'status': 'Default vesion succsesfully updated for model "squeezenet1_1" to "1.0"'}
  6. Unregister a Model: If you need to bid farewell to a model, use the ts_client.management.unregister_model() function to gracefully remove it from TorchServe.
ts_client.management.unregister_model('squeezenet1_1')
{'status': 'Model "squeezenet1_1" unregistered'}
  7. API Description: View a full list of Management APIs using ts_client.management.api_description().
ts_client.management.api_description()

Remember, all these management APIs can be accessed conveniently under the namespace ts_client.management.
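As a rough end-to-end sketch, the calls above compose into a register → scale → poll → unregister workflow. This assumes the return shapes shown in the examples and that workers eventually report a READY status (one of TorchServe's worker states; the output above happened to catch UNLOADING):

import time
from torchserve_client import TorchServeClient

ts_client = TorchServeClient()

# Register the model; per the response above, it starts with 0 workers
ts_client.management.register_model('https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar')

# scale_workers is asynchronous ('Processing worker updates...'),
# so poll describe_model until at least one worker is READY
ts_client.management.scale_workers('squeezenet1_1', min_worker=1, max_worker=2)
while True:
    workers = ts_client.management.describe_model('squeezenet1_1')[0]['workers']
    if any(w['status'] == 'READY' for w in workers):
        break
    time.sleep(1)

# Gracefully remove the model when finished
ts_client.management.unregister_model('squeezenet1_1')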

Inference APIs

TorchServeClient allows you to interact with the Inference API, which listens on port 8080, enabling you to run inference on your samples effortlessly. Here are the available APIs under the ts_client.inference namespace:

  1. API Description: Want to explore what APIs and options are available? Use ts_client.inference.api_description() to get a comprehensive list.
ts_client.inference.api_description()
{'openapi': '3.0.1',
 'info': {'title': 'TorchServe APIs',
  'description': 'TorchServe is a flexible and easy to use tool for serving deep learning models',
  'version': '0.8.1'},
 'paths': {'/': {'options': {'description': 'Get openapi description.',
    'operationId': 'apiDescription',
    'parameters': [],
    'responses': {'200': {'description': 'A openapi 3.0.1 descriptor',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['openapi', 'info', 'paths'],
         'properties': {'openapi': {'type': 'string'},
          'info': {'type': 'object'},
          'paths': {'type': 'object'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/ping': {'get': {'description': 'Get TorchServe status.',
    'operationId': 'ping',
    'parameters': [],
    'responses': {'200': {'description': 'TorchServe status',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['status'],
         'properties': {'status': {'type': 'string',
           'description': 'Overall status of the TorchServe.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/v1/models/{model_name}:predict': {'post': {'description': 'Predictions entry point to get inference using default model version.',
    'operationId': 'predictions',
    'parameters': [{'in': 'path',
      'name': 'model_name',
      'description': 'Name of model.',
      'required': True,
      'schema': {'type': 'string'}}],
    'requestBody': {'description': 'Input data format is defined by each model.',
     'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
     'required': True},
    'responses': {'200': {'description': 'Output data format is defined by each model.',
      'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
     '404': {'description': 'Model not found or Model Version not found',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '503': {'description': 'No worker is available to serve request',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/v2/models/{model_name}/infer': {'post': {'description': 'Predictions entry point to get inference using default model version.',
    'operationId': 'predictions',
    'parameters': [{'in': 'path',
      'name': 'model_name',
      'description': 'Name of model.',
      'required': True,
      'schema': {'type': 'string'}}],
    'requestBody': {'description': 'Input data format is defined by each model.',
     'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
     'required': True},
    'responses': {'200': {'description': 'Output data format is defined by each model.',
      'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
     '404': {'description': 'Model not found or Model Version not found',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '503': {'description': 'No worker is available to serve request',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/predictions/{model_name}': {'post': {'description': 'Predictions entry point to get inference using default model version.',
    'operationId': 'predictions',
    'parameters': [{'in': 'path',
      'name': 'model_name',
      'description': 'Name of model.',
      'required': True,
      'schema': {'type': 'string'}}],
    'requestBody': {'description': 'Input data format is defined by each model.',
     'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
     'required': True},
    'responses': {'200': {'description': 'Output data format is defined by each model.',
      'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
     '404': {'description': 'Model not found or Model Version not found',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '503': {'description': 'No worker is available to serve request',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/predictions/{model_name}/{model_version}': {'post': {'description': 'Predictions entry point to get inference using specific model version.',
    'operationId': 'version_predictions',
    'parameters': [{'in': 'path',
      'name': 'model_name',
      'description': 'Name of model.',
      'required': True,
      'schema': {'type': 'string'}},
     {'in': 'path',
      'name': 'model_version',
      'description': 'Name of model version.',
      'required': True,
      'schema': {'type': 'string'}}],
    'requestBody': {'description': 'Input data format is defined by each model.',
     'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
     'required': True},
    'responses': {'200': {'description': 'Output data format is defined by each model.',
      'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
     '404': {'description': 'Model not found or Model Version not found',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}},
     '503': {'description': 'No worker is available to serve request',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}},
  '/api-description': {'get': {'description': 'Get openapi description.',
    'operationId': 'api-description',
    'parameters': [],
    'responses': {'200': {'description': 'A openapi 3.0.1 descriptor',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['openapi', 'info', 'paths'],
         'properties': {'openapi': {'type': 'string'},
          'info': {'type': 'object'},
          'paths': {'type': 'object'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string', 'description': 'Error message.'}}}}}}},
    'deprecated': True}},
  '/metrics': {'get': {'description': 'Get TorchServe application metrics in prometheus format.',
    'operationId': 'metrics',
    'parameters': [{'in': 'query',
      'name': 'name[]',
      'description': 'Names of metrics to filter',
      'required': False,
      'schema': {'type': 'string'}}],
    'responses': {'200': {'description': 'TorchServe application metrics',
      'content': {'text/plain; version=0.0.4; charset=utf-8': {'schema': {'type': 'object',
         'required': ['# HELP', '# TYPE', 'metric'],
         'properties': {'# HELP': {'type': 'string',
           'description': 'Help text for TorchServe metric.'},
          '# TYPE': {'type': 'string',
           'description': 'Type of TorchServe metric.'},
          'metric': {'type': 'string',
           'description': 'TorchServe application metric.'}}}}}},
     '500': {'description': 'Internal Server Error',
      'content': {'application/json': {'schema': {'type': 'object',
         'required': ['code', 'type', 'message'],
         'properties': {'code': {'type': 'integer',
           'description': 'Error code.'},
          'type': {'type': 'string', 'description': 'Error type.'},
          'message': {'type': 'string',
           'description': 'Error message.'}}}}}}}}}}}
  2. Health Check API: Check the health of the running server with the ts_client.inference.health_check() method.
ts_client.inference.health_check()
{'status': 'Healthy'}
  3. Predictions API: Get predictions from the served model using ts_client.inference.prediction() (a combined sketch follows this list).
ts_client.inference.prediction('squeezenet1_1', data={'data': open('/Users/ankursingh/Downloads/kitten_small.jpg', 'rb')})
{'lynx': 0.5455798506736755,
 'tabby': 0.2794159948825836,
 'Egyptian_cat': 0.10391879826784134,
 'tiger_cat': 0.06263326108455658,
 'leopard': 0.0050191376358270645}
  4. Explanations API: Dive into the served model’s explanations with ease using ts_client.inference.explaination().
ts_client.inference.explaination('squeezenet1_1', data={'data': open('/Users/ankursingh/Downloads/kitten_small.jpg', 'rb')})
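Tying these together, here is a minimal sketch that gates on server health before predicting and then extracts the top-1 label. The image path is illustrative, and it assumes the flat label-to-probability output shown in the predictions example:

from torchserve_client import TorchServeClient

ts_client = TorchServeClient()

# Only send samples if the server reports itself healthy
if ts_client.inference.health_check().get('status') == 'Healthy':
    with open('kitten_small.jpg', 'rb') as f:  # illustrative image path
        preds = ts_client.inference.prediction('squeezenet1_1', data={'data': f})
    # The example output is {label: probability}; pick the most likely label
    top_label = max(preds, key=preds.get)
    print(top_label, preds[top_label])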

With these intuitive APIs at your disposal, you can harness the full power of the Management and Inference APIs and take your application to the next level. Happy inferencing! 🚀🔥
