
pyraisdk

Overview

AML models are typically deployed to GPU instances to provide an inference service. If the code that operates the model runs GPU inference for each request separately, the overall performance of the model will be quite inefficient. This SDK provides APIs that batch inference requests and run them on the GPU in a separate thread, considerably improving GPU utilization and making the model more performant.

The SDK also collects telemetry data for each request so the performance of the model can be evaluated and tracked, and it provides logging primitives that can be used to produce additional troubleshooting information.

How to find Logs and Metrics

For AML models deployed with the CD pipeline, logs generated by pyraisdk can be found in AuxUnstructuredEventTable and metrics in AuxStructuredEventTable. For more details, refer to raidri.

From the remote model dashboard, you can view the data reported by pyraisdk and debug model-side metrics. From there you can explore the queries: either experiment with the matviews directly, or look at their definitions to see how those queries are built.

Using pyraisdk

Integrating with Model

Follow the steps below to integrate pyraisdk.

  1. In your environment configuration, list the latest pyraisdk version as a dependency.
  2. Import resources from pyraisdk:
from pyraisdk import rlog
from pyraisdk.dynbatch import BaseModel, DynamicBatchModel
  3. Implement the BaseModel class, defining its preprocess() and predict() methods to specify how data must be preprocessed and how inferences are made.
  4. At process start (e.g. in the init() function if deploying with a scoring script), initialize logging, instantiate an object of the class that implements BaseModel, and define a global variable for the batched model as shown below.
rlog.initialize()

global batched_model
malware_model = MalwareModel()
batched_model = DynamicBatchModel(malware_model)
  5. In the request handler (e.g. in the run() function if deploying with a scoring script), pass the list of requests to the batched model's predict() method and return the results. Under the hood, pyraisdk creates batches and uses your model's preprocess() and predict() methods to generate results.
return batched_model.predict(json.loads(request_data)['data'])
  6. Optionally, log additional structured and unstructured events using the EventLogger methods. It is not necessary to log latency information, as pyraisdk does so automatically. A consolidated sketch of these steps follows.
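
Putting these steps together, a minimal scoring script looks roughly like the sketch below. MalwareModel stands in for your own BaseModel subclass (defined elsewhere), and the 'data' field mirrors the request format used in step 5:

import json
from pyraisdk import rlog
from pyraisdk.dynbatch import DynamicBatchModel

def init():
    # initialize pyraisdk logging once per process
    rlog.initialize()
    # wrap your BaseModel implementation for dynamic batching
    global batched_model
    batched_model = DynamicBatchModel(MalwareModel())

def run(request_data):
    # pyraisdk merges these items with items from concurrent requests into batches
    return batched_model.predict(json.loads(request_data)['data'])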

Configuring CD Pipeline

In addition to following the CD pipeline's developer guidelines to onboard your model to RAI's deployment infrastructure, you must configure BatchingConfig for each deployment target to which your model will be deployed. To do so:

  1. Open the appropriate configuration file under deployment-target-configs/../*.json
  2. Configure the model's Version, InstanceType / SKU, and BatchingConfig as shown below:
{
    "Name": "MalwareNeural",
    "Version": 19,
    "InstanceType": "Standard_NC6s_v3",
    "BatchingConfig": {
        "MaxBatchSize": 12,
        "IdleBatchSize": 5,
        "MaxBatchInterval": 0.002
    }
}

When the CD pipeline deploys the model to an AML online deployment, it exports environment variables for each BatchingConfig field, which pyraisdk utilizes at process start to configure batching parameters. The CD pipeline also exports environment variables to enable pyraisdk to log to RAI's Azure Data Explorer clusters.
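
For the example BatchingConfig above, the exported variables would be equivalent to the following (variable names are those documented in the Batching Parameter section below; the exact export mechanism belongs to the pipeline):

# export PYRAISDK_MAX_BATCH_SIZE='12'
# export PYRAISDK_IDLE_BATCH_SIZE='5'
# export PYRAISDK_MAX_BATCH_INTERVAL='0.002'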

Test or Deploy without RAI CD Pipeline

To test or deploy independently, outside of the RAI CD pipeline, you must set the BatchingConfig environment variables manually before the DynamicBatchModel object is created. Refer to Batching Parameter below.

The recommended way to set these variables is to call os.environ.setdefault in init(), like below:

import os

def init():
    os.environ.setdefault('PYRAISDK_MAX_BATCH_SIZE', '12')
    os.environ.setdefault('PYRAISDK_IDLE_BATCH_SIZE', '5')
    os.environ.setdefault('PYRAISDK_MAX_BATCH_INTERVAL', '0.002')
    ...
    batch_model = DynamicBatchModel(malware_model)

This works both inside and outside the RAI CD pipeline: because setdefault does not overwrite existing values, you don't need to remove these three lines when switching to deployment through the CD pipeline.

Note, however, that direct assignment such as os.environ['PYRAISDK_MAX_BATCH_SIZE'] = '12' should be avoided: it overrides the configuration from the CD pipeline, which is not desirable in most cases.
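
To see why setdefault is safe, here is a quick standalone illustration of its semantics (plain Python, independent of pyraisdk):

import os

os.environ['PYRAISDK_MAX_BATCH_SIZE'] = '24'            # pretend the CD pipeline exported this
os.environ.setdefault('PYRAISDK_MAX_BATCH_SIZE', '12')  # no effect: the variable is already set
assert os.environ['PYRAISDK_MAX_BATCH_SIZE'] == '24'    # the pipeline's value wins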

To enable log publishing (to Event Hub), several additional environment variables must be set; this is optional. Refer to the logging section below.

Dynamic Batching Support

To get the best model performance, your model must implement the APIs below so the SDK can batch inference requests and distribute load efficiently to the GPU instances:

  • preprocess: Modifies the input to the model, if necessary. For example, if your model needs the input in a special JSON format instead of as a list of strings, you can do that modification in the preprocess method.
  • predict: Executes the model inference for a list of input strings. See Usage Examples below for a full implementation.

Batching Parameter (Attention)

Batching parameters are mandatory and must be set before a DynamicBatchModel is created. They are set through environment variables:

  • PYRAISDK_MAX_BATCH_SIZE (int): Max size of each processing batch.
  • PYRAISDK_IDLE_BATCH_SIZE (int): If there is no more data in the queue, a new batch is launched once its size reaches this value.
  • PYRAISDK_MAX_BATCH_INTERVAL (float): Max interval in seconds to wait for items. When the waiting time is exceeded, a batch is launched immediately.
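
For example, with the values from the earlier BatchingConfig (MaxBatchSize 12, IdleBatchSize 5, MaxBatchInterval 0.002), a batch launches as soon as 12 items are queued; if the queue drains with at least 5 items already collected, that smaller batch launches; and after 2 ms of waiting, whatever has accumulated is dispatched.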

Usage Examples

Build a YourModel class that inherits from pyraisdk.dynbatch.BaseModel.

from typing import List
from pyraisdk.dynbatch import BaseModel

class YourModel(BaseModel):
    def predict(self, items: List[str]) -> List[int]:
        rs = []
        for item in items:
            rs.append(len(item))
        return rs
            
    def preprocess(self, items: List[str]) -> List[str]:
        rs = []
        for item in items:
            rs.append(f'[{item}]')
        return rs

Initialize a pyraisdk.dynbatch.DynamicBatchModel with a YourModel instance, and call predict / predict_one for inference.

from pyraisdk.dynbatch import DynamicBatchModel

# prepare model
simple_model = YourModel()
batch_model = DynamicBatchModel(simple_model)

# predict
items = ['abc', '123456', 'xyzcccffaffaaa']
predictions = batch_model.predict(items)
assert predictions == [5, 8, 16]

# predict_one
item = 'abc'
prediction = batch_model.predict_one(item)
assert prediction == 5

predict / predict_one can be called concurrently from different threads:

from threading import Thread
from pyraisdk.dynbatch import DynamicBatchModel

# prepare model
simple_model = YourModel()
batch_model = DynamicBatchModel(simple_model)

# thread run function
def run(name, num):
    for step in range(num):
        item = f'{name}-{step}'
        prediction = batch_model.predict_one(item)
        assert prediction == len(item) + 2

# start concurrent inference
threads = [Thread(target=run, args=(f'{tid}', 100)) for tid in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Logging & Events

Description

This module is for logging and event tracing.

interface

def initialize(
    eh_hostname: Optional[str] = None,
    client_id: Optional[str] = None,
    eh_conn_str: Optional[str] = None,
    eh_structured: Optional[str] = None,
    eh_unstructured: Optional[str] = None,
    role: Optional[str] = None,
    instance: Optional[str] = None,
    sys_metrics_enable: bool = True,
)

Parameter description for initialize:

  • eh_hostname: Fully Qualified Namespace, aka the EH endpoint URL (*.servicebus.windows.net). Default: read ${EVENTHUB_NAMESPACE}.servicebus.windows.net
  • client_id: client_id of the service principal. Default: read $UAI_CLIENT_ID
  • eh_conn_str: connection string of the eventhub namespace. Default: read $EVENTHUB_CONN_STRING
  • eh_structured: structured eventhub name. Default: read $EVENTHUB_AUX_STRUCTURED
  • eh_unstructured: unstructured eventhub name. Default: read $EVENTHUB_AUX_UNSTRUCTURED
  • role: role. Default: RemoteModel
  • instance: instance. Default: ${MODEL_NAME}|${ENDPOINT_VERSION}|{hostname} or ${MODEL_NAME}|${ENDPOINT_VERSION}|{_probably_unique_id()}
  • sys_metrics_enable: Whether to enable periodic automatic metrics reporting for system info such as CPU, memory, and GPU. Default: True
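
The EventLogger methods below are also exposed as module-level functions on rlog, as the examples that follow demonstrate: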
def event(self, key: str, code: str, numeric: float, detail: str='', corr_id: str='', elem: int=-1)
def infof(self, format: str, *args: Any)
def infocf(self, corr_id: str, elem: int, format: str, *args: Any)
def warnf(self, format: str, *args: Any)
def warncf(self, corr_id: str, elem: int, format: str, *args: Any)
def errorf(self, format: str, *args: Any)
def errorcf(self, corr_id: str, elem: int, ex: Optional[Exception], format: str, *args: Any)
def fatalf(self, format: str, *args: Any)
def fatalcf(self, corr_id: str, elem: int, ex: Optional[Exception], format: str, *args: Any)

examples

# export EVENTHUB_AUX_UNSTRUCTURED='ehunstruct'
# export EVENTHUB_AUX_STRUCTURED='ehstruct'
# export UAI_CLIENT_ID='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
# export EVENTHUB_NAMESPACE='raieusdev-eh-namespace'

from pyraisdk import rlog
rlog.initialize()

rlog.infof('this is an info message %s', 123)
rlog.event('LifetimeEvent', 'STOP_GRACEFUL_SIGNAL', 0, 'detail info')

# export EVENTHUB_AUX_UNSTRUCTURED='ehunstruct'
# export EVENTHUB_AUX_STRUCTURED='ehstruct'
# export EVENTHUB_CONN_STRING='<connection string>'

from pyraisdk import rlog
rlog.initialize()

rlog.infocf('corrid', -1, 'this is an info message: %s', 123)
rlog.event('RequestDuration', '200', 0.01, 'this is duration in seconds')

from pyraisdk import rlog
rlog.initialize(eh_structured='ehstruct', eh_unstructured='ehunstruct', eh_conn_str='<eventhub-conn-str>')

rlog.errorcf('corrid', -1, Exception('error msg'), 'error message: %s %s', 1, 2)
rlog.event('CpuUsage', '', 0.314, detail='cpu usage', corr_id='corrid', elem=-1)
