# TorchServe Python Client

A Python client for the TorchServe APIs.
## Install

```sh
pip install torchserve_client
```
## Usage

Using `torchserve_client` is a breeze! To get started, simply initialize a `TorchServeClient` object as shown below:

```python
from torchserve_client import TorchServeClient

# Initialize the TorchServeClient object
ts_client = TorchServeClient()
ts_client
```

```
TorchServeClient(base_url=http://localhost, management_port=8081, inference_port=8080)
```
If you wish to customize the base URL, management port, or inference port of your TorchServe server, you can pass them as arguments during initialization:

```python
from torchserve_client import TorchServeClient

# Customize the base URL, management port, and inference port
ts_client = TorchServeClient(base_url='http://your-torchserve-server.com',
                             management_port=8081, inference_port=8080)
ts_client
```

```
TorchServeClient(base_url=http://your-torchserve-server.com, management_port=8081, inference_port=8080)
```
Alternatively, if you don’t provide a base URL during initialization, the client will check for the `TORCHSERVE_URL` environment variable. If the variable is not found, it will gracefully fall back to using `http://localhost` as the default. This way, you have the flexibility to tailor your `TorchServeClient` to your needs effortlessly! Happy serving! 🍿🔥
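The fallback order described above can be sketched in a few lines. This is an illustrative helper, not SDK code; the name `resolve_base_url` is made up, and it assumes the client checks the explicit argument first, then `TORCHSERVE_URL`, then defaults to localhost:

```python
import os

def resolve_base_url(base_url=None):
    # Mirror the documented fallback: explicit argument first,
    # then the TORCHSERVE_URL environment variable, then localhost.
    return base_url or os.environ.get("TORCHSERVE_URL") or "http://localhost"

print(resolve_base_url("http://your-torchserve-server.com"))  # explicit argument wins
os.environ["TORCHSERVE_URL"] = "http://ts.internal"
print(resolve_base_url())  # falls back to the environment variable
del os.environ["TORCHSERVE_URL"]
print(resolve_base_url())  # defaults to http://localhost
```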
## Management APIs

With the TorchServe Management APIs, you can effortlessly manage your models at runtime. Here’s a quick rundown of the actions you can perform using the `TorchServeClient` SDK:
- **Register a Model**: Easily register a model with TorchServe using the `ts_client.management.register_model()` method.

```python
ts_client.management.register_model('https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar')
```

```
{'status': 'Model "squeezenet1_1" Version: 1.0 registered with 0 initial workers. Use scale workers API to add workers for the model.'}
```
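For reference, registration presumably wraps TorchServe’s management endpoint `POST /models`, with the `.mar` location passed as a query parameter. A stdlib sketch of how such a request URL is assembled (the helper `register_model_url` is illustrative, not part of the SDK):

```python
from urllib.parse import urlencode

def register_model_url(base_url, port, mar_url, **params):
    # Registration is a POST to /models on the management port; the .mar
    # file location (and any optional settings) travel as query parameters.
    query = urlencode({"url": mar_url, **params})
    return f"{base_url}:{port}/models?{query}"

print(register_model_url("http://localhost", 8081,
                         "https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar",
                         initial_workers=1))
```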
- **Increase/Decrease Workers**: Scale the number of workers for a specific model with ease using `ts_client.management.scale_workers()`.

```python
ts_client.management.scale_workers('squeezenet1_1', min_worker=1, max_worker=2)
```

```
{'status': 'Processing worker updates...'}
```
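Under the hood, scaling presumably maps to a `PUT` on `/models/{model_name}` on the management port, with the worker bounds as query parameters. A sketch of the URL construction (the helper `scale_workers_url` is illustrative, not SDK code):

```python
from urllib.parse import urlencode

def scale_workers_url(base_url, port, model_name, **params):
    # Scaling maps to PUT /models/{model_name} on the management port,
    # with min_worker/max_worker supplied as query parameters.
    return f"{base_url}:{port}/models/{model_name}?{urlencode(params)}"

print(scale_workers_url("http://localhost", 8081, "squeezenet1_1",
                        min_worker=1, max_worker=2))
# http://localhost:8081/models/squeezenet1_1?min_worker=1&max_worker=2
```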
- **Model Status**: Curious about a model’s status? Fetch all the details you need using `ts_client.management.describe_model()`.

```python
ts_client.management.describe_model('squeezenet1_1')
```

```
[{'modelName': 'squeezenet1_1',
  'modelVersion': '1.0',
  'modelUrl': 'https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar',
  'runtime': 'python',
  'minWorkers': 1,
  'maxWorkers': 1,
  'batchSize': 1,
  'maxBatchDelay': 100,
  'loadedAtStartup': False,
  'workers': [{'id': '9001',
    'startTime': '2023-07-17T22:55:40.155Z',
    'status': 'UNLOADING',
    'memoryUsage': 0,
    'pid': -1,
    'gpu': False,
    'gpuUsage': 'N/A'}]}]
```
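The `workers` list in that response is handy for checking serving capacity. For example, collecting the IDs of workers in the `READY` state (the `description` dict below is a trimmed stand-in mimicking the response shape above):

```python
# Trimmed describe_model()-style response, mimicking the shape above
description = [{
    "modelName": "squeezenet1_1",
    "minWorkers": 1,
    "maxWorkers": 1,
    "workers": [
        {"id": "9001", "status": "UNLOADING", "gpu": False},
        {"id": "9002", "status": "READY", "gpu": False},
    ],
}]

ready = [w["id"] for model in description for w in model["workers"]
         if w["status"] == "READY"]
print(ready)  # ['9002']
```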
- **List Registered Models**: Quickly fetch a list of all registered models using `ts_client.management.list_models()`.

```python
# List all models
ts_client.management.list_models()
```

```
{'models': [{'modelName': 'squeezenet1_1',
   'modelUrl': 'https://torchserve.pytorch.org/mar_files/squeezenet1_1.mar'}]}
```
- **Set Default Model Version**: Ensure the desired version of a model is the default choice with the `ts_client.management.set_default_version()` method.

```python
ts_client.management.set_default_version('squeezenet1_1', '1.0')
```

```
{'status': 'Default vesion succsesfully updated for model "squeezenet1_1" to "1.0"'}
```
- **Unregister a Model**: If you need to bid farewell to a model, use the `ts_client.management.unregister_model()` function to gracefully remove it from TorchServe.

```python
ts_client.management.unregister_model('squeezenet1_1')
```

```
{'status': 'Model "squeezenet1_1" unregistered'}
```
- **API Description**: View the full list of Management APIs.

```python
ts_client.management.api_description()
```

Remember, all these management APIs are conveniently accessible under the `ts_client.management` namespace.
## Inference APIs

`TorchServeClient` allows you to interact with the Inference API, which listens on port 8080 by default, enabling you to run inference on your samples effortlessly. Here are the available APIs under the `ts_client.inference` namespace:
- **API Description**: Want to explore what APIs and options are available? Use `ts_client.inference.api_description()` to get a comprehensive list. Calling it returns the server’s full OpenAPI specification:

```python
ts_client.inference.api_description()
```
{'openapi': '3.0.1',
'info': {'title': 'TorchServe APIs',
'description': 'TorchServe is a flexible and easy to use tool for serving deep learning models',
'version': '0.8.1'},
'paths': {'/': {'options': {'description': 'Get openapi description.',
'operationId': 'apiDescription',
'parameters': [],
'responses': {'200': {'description': 'A openapi 3.0.1 descriptor',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['openapi', 'info', 'paths'],
'properties': {'openapi': {'type': 'string'},
'info': {'type': 'object'},
'paths': {'type': 'object'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/ping': {'get': {'description': 'Get TorchServe status.',
'operationId': 'ping',
'parameters': [],
'responses': {'200': {'description': 'TorchServe status',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['status'],
'properties': {'status': {'type': 'string',
'description': 'Overall status of the TorchServe.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/v1/models/{model_name}:predict': {'post': {'description': 'Predictions entry point to get inference using default model version.',
'operationId': 'predictions',
'parameters': [{'in': 'path',
'name': 'model_name',
'description': 'Name of model.',
'required': True,
'schema': {'type': 'string'}}],
'requestBody': {'description': 'Input data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
'required': True},
'responses': {'200': {'description': 'Output data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
'404': {'description': 'Model not found or Model Version not found',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'503': {'description': 'No worker is available to serve request',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/v2/models/{model_name}/infer': {'post': {'description': 'Predictions entry point to get inference using default model version.',
'operationId': 'predictions',
'parameters': [{'in': 'path',
'name': 'model_name',
'description': 'Name of model.',
'required': True,
'schema': {'type': 'string'}}],
'requestBody': {'description': 'Input data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
'required': True},
'responses': {'200': {'description': 'Output data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
'404': {'description': 'Model not found or Model Version not found',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'503': {'description': 'No worker is available to serve request',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/predictions/{model_name}': {'post': {'description': 'Predictions entry point to get inference using default model version.',
'operationId': 'predictions',
'parameters': [{'in': 'path',
'name': 'model_name',
'description': 'Name of model.',
'required': True,
'schema': {'type': 'string'}}],
'requestBody': {'description': 'Input data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
'required': True},
'responses': {'200': {'description': 'Output data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
'404': {'description': 'Model not found or Model Version not found',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'503': {'description': 'No worker is available to serve request',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/predictions/{model_name}/{model_version}': {'post': {'description': 'Predictions entry point to get inference using specific model version.',
'operationId': 'version_predictions',
'parameters': [{'in': 'path',
'name': 'model_name',
'description': 'Name of model.',
'required': True,
'schema': {'type': 'string'}},
{'in': 'path',
'name': 'model_version',
'description': 'Name of model version.',
'required': True,
'schema': {'type': 'string'}}],
'requestBody': {'description': 'Input data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}},
'required': True},
'responses': {'200': {'description': 'Output data format is defined by each model.',
'content': {'*/*': {'schema': {'type': 'string', 'format': 'binary'}}}},
'404': {'description': 'Model not found or Model Version not found',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}},
'503': {'description': 'No worker is available to serve request',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}},
'/api-description': {'get': {'description': 'Get openapi description.',
'operationId': 'api-description',
'parameters': [],
'responses': {'200': {'description': 'A openapi 3.0.1 descriptor',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['openapi', 'info', 'paths'],
'properties': {'openapi': {'type': 'string'},
'info': {'type': 'object'},
'paths': {'type': 'object'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string', 'description': 'Error message.'}}}}}}},
'deprecated': True}},
'/metrics': {'get': {'description': 'Get TorchServe application metrics in prometheus format.',
'operationId': 'metrics',
'parameters': [{'in': 'query',
'name': 'name[]',
'description': 'Names of metrics to filter',
'required': False,
'schema': {'type': 'string'}}],
'responses': {'200': {'description': 'TorchServe application metrics',
'content': {'text/plain; version=0.0.4; charset=utf-8': {'schema': {'type': 'object',
'required': ['# HELP', '# TYPE', 'metric'],
'properties': {'# HELP': {'type': 'string',
'description': 'Help text for TorchServe metric.'},
'# TYPE': {'type': 'string',
'description': 'Type of TorchServe metric.'},
'metric': {'type': 'string',
'description': 'TorchServe application metric.'}}}}}},
'500': {'description': 'Internal Server Error',
'content': {'application/json': {'schema': {'type': 'object',
'required': ['code', 'type', 'message'],
'properties': {'code': {'type': 'integer',
'description': 'Error code.'},
'type': {'type': 'string', 'description': 'Error type.'},
'message': {'type': 'string',
'description': 'Error message.'}}}}}}}}}}}
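The full spec is verbose; often all you want is the list of available endpoints. Since the return value is a plain dict in OpenAPI 3 shape, the keys of `paths` give you that directly (the `spec` dict below is a small stub standing in for the real response):

```python
# Stub standing in for the dict returned by api_description()
spec = {
    "openapi": "3.0.1",
    "info": {"title": "TorchServe APIs", "version": "0.8.1"},
    "paths": {
        "/ping": {},
        "/predictions/{model_name}": {},
        "/predictions/{model_name}/{model_version}": {},
    },
}

print(sorted(spec["paths"]))
# ['/ping', '/predictions/{model_name}', '/predictions/{model_name}/{model_version}']
```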
- **Health Check API**: Ensure the health of the running server with the `ts_client.inference.health_check()` method.

```python
ts_client.inference.health_check()
```

```
{'status': 'Healthy'}
```
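The health check presumably corresponds to TorchServe’s `GET /ping` endpoint on the inference port. A stdlib sketch of the equivalent raw request, built but not sent here since sending it needs a running server (the helper `ping_request` is illustrative, not SDK code):

```python
import urllib.request

def ping_request(base_url="http://localhost", inference_port=8080):
    # health_check() corresponds to GET /ping on the inference port;
    # dispatch against a live server with urllib.request.urlopen(req).
    return urllib.request.Request(f"{base_url}:{inference_port}/ping", method="GET")

req = ping_request()
print(req.full_url, req.get_method())  # http://localhost:8080/ping GET
```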
- **Predictions API**: Get predictions from the served model using `ts_client.inference.prediction()`.

```python
ts_client.inference.prediction('squeezenet1_1', data={'data': open('/Users/ankursingh/Downloads/kitten_small.jpg', 'rb')})
```

```
{'lynx': 0.5455798506736755,
 'tabby': 0.2794159948825836,
 'Egyptian_cat': 0.10391879826784134,
 'tiger_cat': 0.06263326108455658,
 'leopard': 0.0050191376358270645}
```
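Since the response is a plain label-to-score dict, picking the top prediction is a one-liner (the `scores` dict below reuses the sample output above):

```python
# Sample label-to-score dict, as returned above
scores = {
    "lynx": 0.5455798506736755,
    "tabby": 0.2794159948825836,
    "Egyptian_cat": 0.10391879826784134,
    "tiger_cat": 0.06263326108455658,
    "leopard": 0.0050191376358270645,
}

top_label = max(scores, key=scores.get)
print(top_label, round(scores[top_label], 3))  # lynx 0.546
```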
- **Explanations API**: Dive into the served model’s explanations with ease using `ts_client.inference.explaination()`.

```python
ts_client.inference.explaination('squeezenet1_1', data={'data': open('/Users/ankursingh/Downloads/kitten_small.jpg', 'rb')})
```
With these intuitive APIs at your disposal, you can harness the full power of the Management and Inference APIs and take your application to the next level. Happy inferencing! 🚀🔥