MLPerf Inference LoadGen python bindings

Overview

Note: A compiled HTML version of this document is hosted online.

Introduction

  • The LoadGen is a reusable module that efficiently and fairly measures the performance of inference systems.
  • It generates traffic for scenarios as formulated by a diverse set of experts in the MLPerf working group.
  • The scenarios emulate the workloads seen in mobile devices, autonomous vehicles, robotics, and cloud-based setups.
  • Although the LoadGen is not itself model or dataset aware, its strength is that it can be reused unchanged with any logic that is.

Integration Example and Flow

The following outlines how the LoadGen can be integrated into an inference system, resembling how some of the MLPerf reference models are implemented.

  1. Benchmark knows the model, dataset, and preprocessing.
  2. Benchmark hands dataset sample IDs to LoadGen.
  3. LoadGen starts generating queries of sample IDs.
  4. Benchmark creates requests to backend.
  5. Result is post-processed and forwarded to LoadGen.
  6. LoadGen outputs logs for analysis.
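
The six steps above can be sketched in plain Python. This is a toy stand-in, not the LoadGen API: `ToyLoadGen`, `ToyBenchmark`, and `fake_backend` are hypothetical names invented for illustration, and the real library issues queries according to the selected scenario rather than in a simple loop.

```python
import time


def fake_backend(sample):
    """Hypothetical backend: 'inference' here is just doubling the input."""
    return sample * 2


class ToyBenchmark:
    """Step 1: the benchmark owns the model, dataset, and preprocessing."""

    def __init__(self, dataset):
        self.dataset = dataset

    def sample_ids(self):
        # Step 2: only sample IDs are handed to the load generator.
        return list(range(len(self.dataset)))

    def issue_query(self, sample_id, on_complete):
        # Step 4: turn a sample ID into a backend request.
        raw = fake_backend(self.dataset[sample_id])
        # Step 5: post-process the result and forward it.
        on_complete(sample_id, str(raw))


class ToyLoadGen:
    """Hypothetical stand-in for the load generator; it never sees the data."""

    def __init__(self):
        self.log = []

    def run(self, benchmark):
        # Step 3: generate queries of sample IDs, one at a time.
        for sample_id in benchmark.sample_ids():
            start = time.perf_counter()

            def on_complete(sid, result, start=start):
                latency = time.perf_counter() - start
                self.log.append({"id": sid, "result": result, "latency_s": latency})

            benchmark.issue_query(sample_id, on_complete)
        # Step 6: hand back the log for analysis.
        return self.log


log = ToyLoadGen().run(ToyBenchmark([10, 20, 30]))
print([entry["result"] for entry in log])
```

The load generator only ever touches sample IDs and opaque results; everything model-specific is injected through the benchmark object.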

Useful Links

  • FAQ
  • LoadGen Build Instructions
  • LoadGen API
  • Test Settings - A good description of available scenarios, modes, and knobs.
  • MLPerf Inference Code - Includes source for the LoadGen and reference models that use the LoadGen.
  • MLPerf Inference Rules - Any mismatch with this is a bug in the LoadGen.
  • MLPerf Website

Scope of the LoadGen's Responsibilities

In Scope

  • Provide a reusable C++ library with Python bindings.
  • Implement the traffic patterns of the MLPerf Inference scenarios and modes.
  • Record all traffic generated and received for later analysis and verification.
  • Summarize the results and whether performance constraints were met.
  • Target high-performance systems with efficient, multithread-friendly logging utilities.
  • Generate trust via a shared, well-tested, and community-hardened code base.

Out of Scope

The LoadGen is:

  • NOT aware of the ML model it is running against.
  • NOT aware of the data formats of the model's inputs and outputs.
  • NOT aware of how to score the accuracy of a model's outputs.
  • NOT aware of MLPerf rules regarding scenario-specific constraints.

Limiting the scope of the LoadGen in this way keeps it reusable across different models and datasets without modification. Using composition and dependency injection, the user can define their own model, datasets, and metrics.

Additionally, not hardcoding MLPerf-specific test constraints, like test duration and performance targets, allows users to use the LoadGen unmodified for custom testing and continuous integration purposes.

Submission Considerations

Upstream all local modifications

  • As a rule, no local modifications to the LoadGen's C++ library are allowed for submission.
  • Please upstream early and often to keep the playing field level.

Choose your TestSettings carefully!

  • Since the LoadGen is oblivious to the model, it can't enforce the MLPerf requirements for submission, such as target percentiles and latencies.
  • For verification, the values in TestSettings are logged.
  • To help make sure your settings are spec compliant, use TestSettings::FromConfig in conjunction with the relevant config file provided with the reference models.
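
To make "target percentiles and latencies" concrete, the sketch below checks whether a given percentile of recorded latencies stays under a bound. The function names, the nearest-rank percentile method, and the numbers are assumptions for illustration only; the LoadGen's own computation and the official targets are defined by the rules and the provided config files.

```python
import math


def percentile_nearest_rank(latencies, percentile):
    """Return the nearest-rank percentile (0 < percentile <= 100) of a list."""
    if not latencies:
        raise ValueError("no latencies recorded")
    ordered = sorted(latencies)
    # 1-based rank of the smallest value covering `percentile` percent of samples.
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_target(latencies_ms, percentile, target_ms):
    """True if the chosen percentile latency does not exceed the target."""
    return percentile_nearest_rank(latencies_ms, percentile) <= target_ms


# Made-up latencies in milliseconds, with one slow outlier.
latencies_ms = [8, 9, 9, 10, 10, 11, 11, 12, 14, 30]
print(percentile_nearest_rank(latencies_ms, 90))
print(meets_latency_target(latencies_ms, 90, 15))
```

With these made-up numbers the outlier does not affect the 90th percentile, which is why the choice of percentile matters as much as the latency bound itself.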

Responsibilities of a LoadGen User

Implement the Interfaces

  • Implement the SystemUnderTest and QuerySampleLibrary interfaces and pass them to the StartTest function.
  • Call QuerySampleComplete for every sample received by SystemUnderTest::IssueQuery.
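
One way to guard the second requirement is to track which samples have been issued but not yet completed. The sketch below is plain Python and does not call the LoadGen; `CompletionTracker` and its methods are hypothetical helpers that a SystemUnderTest implementation could wrap around its IssueQuery handling and its completion calls.

```python
import threading


class CompletionTracker:
    """Hypothetical helper: tracks samples issued but not yet completed."""

    def __init__(self):
        self._lock = threading.Lock()  # completions may arrive on several threads
        self._outstanding = set()

    def issued(self, sample_ids):
        with self._lock:
            self._outstanding.update(sample_ids)

    def completed(self, sample_id):
        with self._lock:
            if sample_id not in self._outstanding:
                raise KeyError(f"sample {sample_id} completed twice or never issued")
            self._outstanding.remove(sample_id)

    def outstanding(self):
        with self._lock:
            return sorted(self._outstanding)


tracker = CompletionTracker()
tracker.issued([101, 102, 103])  # where a query is received
tracker.completed(101)           # where a completion is reported
tracker.completed(103)
print(tracker.outstanding())     # samples still owed a completion
```

Checking that `outstanding()` is empty at the end of a run is a cheap way to catch dropped or double-reported samples before inspecting the logs.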

Assess Accuracy

  • Process the mlperf_log_accuracy.json output by the LoadGen to determine the accuracy of your system.
  • For the official models, Python scripts will be provided by the MLPerf model owners for you to do this automatically.
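
As a rough sketch of what such a script does, the code below decodes accuracy-log entries and scores them against labels. It assumes each entry has a `qsl_idx` field and a hex-encoded `data` field holding one little-endian float32 class index; this layout is an assumption modeled on the reference image-classification scripts and should be verified against the script for your model. The labels, the helper name `score_accuracy`, and the in-memory log are invented for illustration.

```python
import json
import struct


def score_accuracy(log_entries, labels):
    """Return the fraction of logged predictions that match `labels`."""
    seen = set()
    correct = 0
    for entry in log_entries:
        idx = entry["qsl_idx"]
        if idx in seen:  # count each sample once even if it was logged twice
            continue
        seen.add(idx)
        # Assumed payload: one little-endian float32 class index, hex-encoded.
        (predicted,) = struct.unpack("<f", bytes.fromhex(entry["data"]))
        if int(predicted) == labels[idx]:
            correct += 1
    return correct / len(seen) if seen else 0.0


# In practice the entries would be loaded from mlperf_log_accuracy.json.
log_entries = json.loads("""[
  {"qsl_idx": 0, "data": "0000803f"},
  {"qsl_idx": 1, "data": "00000040"},
  {"qsl_idx": 2, "data": "00004040"}
]""")
labels = {0: 1, 1: 2, 2: 7}  # hypothetical ground truth
print(score_accuracy(log_entries, labels))
```

The scoring lives entirely outside the LoadGen, which only records the raw response bytes it was given.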

For detailed templates of how to do the above, refer to the code for the demos, tests, and reference models.
