CLI tool for managing MLPerf endpoint submissions
MLCommons Endpoints Submission Tools
A Python package with two tools for managing MLPerf Endpoints benchmark submissions:
- endpoints-submission-cli: registers benchmark runs, assembles submission packages, runs compliance checks, and opens GitHub pull requests via the PRISM API.
- submission-checker: validates a submission folder against the §9.1 automated compliance rules before or after upload.
Installation
With pip:
pip install endpoints-submission-cli
From source (editable):
pip install -e ".[dev]"
With uv:
uv sync --extra dev
endpoints-submission-cli
Requirements
- Python 3.10 or later
- gh CLI: required for creating, updating, and withdrawing submissions
Authentication
Every command requires a PRISM API token in mlc_… format. Supply it as an env var or pass --token per command:
# Persistent (add to shell profile)
export PRISM_USER_API_TOKEN=mlc_your_token_here
# Per-command override
endpoints-submission-cli runs list --token mlc_your_token_here
Submission commands that create or update GitHub pull requests also require the gh CLI:
gh auth login
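The precedence described above (a --token value overrides the environment variable) can be sketched in Python. This is a minimal illustration, not the package's actual code: `resolve_token` is a hypothetical helper, and the only facts taken from this README are the variable name `PRISM_USER_API_TOKEN` and the `mlc_` prefix.

```python
import os

TOKEN_ENV_VAR = "PRISM_USER_API_TOKEN"  # variable name documented above
TOKEN_PREFIX = "mlc_"                   # token format documented above


def resolve_token(cli_token=None, env=None):
    """Hypothetical helper: pick the token a command would use.

    A value passed via --token wins over the environment variable,
    mirroring the per-command override described above.
    """
    env = os.environ if env is None else env
    token = cli_token or env.get(TOKEN_ENV_VAR)
    if not token:
        raise ValueError(f"No token: set {TOKEN_ENV_VAR} or pass --token")
    if not token.startswith(TOKEN_PREFIX):
        raise ValueError(f"Token must start with {TOKEN_PREFIX!r}")
    return token
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.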
Configuration
| Environment variable | Default | Description |
|---|---|---|
| PRISM_USER_API_TOKEN | (none) | API key. Required unless --token is passed. |
| MLPERF_SUBMISSION_REPO | MLCommons-Systems/test-endpoints-submission-repo | Target GitHub repository for submission PRs (owner/repo). |
Add to your shell profile for a persistent setup:
export PRISM_USER_API_TOKEN=mlc_your_token_here
export MLPERF_SUBMISSION_REPO=MLCommons-Systems/endpoints-submission-repo
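A script that wraps the CLI may want to know which repository a submission PR will target. The sketch below falls back to the documented default when `MLPERF_SUBMISSION_REPO` is unset and checks the owner/repo shape. `resolve_submission_repo` and the regular expression are illustrative assumptions, not part of the package.

```python
import os
import re

# Default taken from the configuration table above.
DEFAULT_REPO = "MLCommons-Systems/test-endpoints-submission-repo"

# Assumed shape for "owner/repo"; GitHub's real naming rules are stricter.
_OWNER_REPO = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def resolve_submission_repo(env=None):
    """Hypothetical helper: return (owner, repo) for submission PRs."""
    env = os.environ if env is None else env
    value = env.get("MLPERF_SUBMISSION_REPO") or DEFAULT_REPO
    if not _OWNER_REPO.match(value):
        raise ValueError(f"Expected owner/repo, got {value!r}")
    owner, repo = value.split("/")
    return owner, repo
```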
Quick start
# 1. Verify connectivity
endpoints-submission-cli runs list
# 2. Register a benchmark run from a local result folder
endpoints-submission-cli runs create --path /results/llama3_h100_c4
# → Run created: d5d9873e-5eca-4f8d-a487-4be1cb8b440c
RUN_ID=d5d9873e-5eca-4f8d-a487-4be1cb8b440c
# 3. Create a submission (assembles, checks, uploads, opens PR)
endpoints-submission-cli submissions create \
--division standardized \
--availability available \
--run-ids $RUN_ID
# → Submission created: a1b2c3d4-…
# → PR: https://github.com/MLCommons-Systems/…/pull/42
SUB_ID=a1b2c3d4-e5f6-7890-abcd-ef1234567890
# 4. Add another run later
endpoints-submission-cli submissions add-run \
--submission-id $SUB_ID \
--run-id <new-run-id>
# 5. Withdraw if needed
endpoints-submission-cli submissions withdraw --submission-id $SUB_ID
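When scripting the steps above, the run ID has to be pulled out of the CLI output rather than copied by hand. The sketch below assumes the output contains a line of the form `Run created: <uuid>`, as the comment in step 2 suggests; the real output format may differ, and `extract_run_id` is a hypothetical helper.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID_PATTERN = (
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_run_id(cli_output):
    """Return the UUID following 'Run created:', or None if absent.

    Assumes the output format shown in the quick start comment.
    """
    match = re.search(rf"Run created:\s*({UUID_PATTERN})", cli_output)
    return match.group(1) if match else None
```

Returning None instead of raising lets the caller decide whether a missing ID is fatal.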
Command reference
endpoints-submission-cli
├── runs
│   ├── list         List all runs
│   ├── create       Register a run from a local folder
│   ├── get          Fetch run details
│   ├── delete       Delete a run and its archive
│   ├── pin          Pin a run (prevent expiry)
│   └── unpin        Restore normal expiry
└── submissions
    ├── list         List all submissions
    ├── create       Create a submission from runs (full pipeline)
    ├── get          Fetch submission details
    ├── update       Update run list or metadata
    ├── withdraw     Withdraw a submission
    ├── add-run      Add a run to an existing submission
    └── remove-run   Remove a run from a submission
Use --help on any command for full flag details:
endpoints-submission-cli submissions create --help
submission-checker
CLI tool for validating MLPerf Endpoints submissions against the §9.1 automated compliance checks.
Usage
Check a submission
submission-checker check /path/to/submission
The tool expects the submission root to contain systems/ and pareto/ subdirectories as specified in §8.1.
Options:
| Flag | Description |
|---|---|
| --strict | Treat warnings as errors (exit 1 on any warning) |
| --quiet / -q | Suppress INFO-level passing checks |
| --output FILE / -o FILE | Write full results as JSON to FILE |
Exit codes: 0 = all checks passed, 1 = one or more errors (or warnings with --strict).
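In CI, the checker is typically driven from a script that branches on the exit code. The sketch below builds the command line from the flags in the table above and treats exit code 0 as a pass. The helper names are hypothetical, and placing the options after the path is an assumption about how the CLI parses its arguments.

```python
import subprocess


def build_check_command(submission_dir, strict=False, quiet=False, output=None):
    """Assemble the argv list for 'submission-checker check'."""
    cmd = ["submission-checker", "check", str(submission_dir)]
    if strict:
        cmd.append("--strict")   # warnings become errors
    if quiet:
        cmd.append("--quiet")    # hide INFO-level passing checks
    if output is not None:
        cmd += ["--output", str(output)]  # write full JSON results
    return cmd


def run_check(submission_dir, **options):
    """Run the checker and return True when it exits with code 0."""
    completed = subprocess.run(build_check_command(submission_dir, **options))
    return completed.returncode == 0
```

For example, `run_check("/path/to/submission", strict=True)` returns False on any error or warning.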
Show region boundaries
submission-checker regions --max-concurrency 1024
Prints the concurrency ranges for each region given a declared Maximum Supported Concurrency M (§5.5).
Required Files in submission structure
<org>/
├── systems/
│   └── <system_desc_id>.json          # §8.2: hardware + software description
└── pareto/
    └── <system_desc_id>/
        └── <benchmark_model>/
            ├── points/
            │   └── point_<N>.yaml     # §8.3: one config per measurement point
            ├── results/
            │   └── point_<N>/
            │       ├── mlperf_endpoints_log_summary.json
            │       └── mlperf_endpoints_log_detail.json
            └── accuracy/
                ├── accuracy.txt
                └── accuracy_result.json
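A quick pre-flight check of this layout can catch missing files before running the full checker. The sketch below walks the tree shown above and lists what is absent. It is an illustration only, not the checker's implementation, and it assumes that each pareto/<system_desc_id>/ directory is named after the stem of the matching systems/*.json file.

```python
from pathlib import Path

RESULT_FILES = (
    "mlperf_endpoints_log_summary.json",
    "mlperf_endpoints_log_detail.json",
)
ACCURACY_FILES = ("accuracy.txt", "accuracy_result.json")


def missing_paths(root):
    """Return required paths from the tree above that are absent under root."""
    root = Path(root)
    missing = []
    systems = sorted((root / "systems").glob("*.json"))
    if not systems:
        missing.append(root / "systems" / "<system_desc_id>.json")
    for system_file in systems:
        # Assumption: the pareto directory is named after the JSON file stem.
        system_dir = root / "pareto" / system_file.stem
        models = []
        if system_dir.is_dir():
            models = sorted(p for p in system_dir.iterdir() if p.is_dir())
        if not models:
            missing.append(system_dir / "<benchmark_model>")
        for model_dir in models:
            points = sorted((model_dir / "points").glob("point_*.yaml"))
            if not points:
                missing.append(model_dir / "points" / "point_<N>.yaml")
            for point in points:
                # Each point config needs a summary and a detail log.
                for name in RESULT_FILES:
                    path = model_dir / "results" / point.stem / name
                    if not path.is_file():
                        missing.append(path)
            for name in ACCURACY_FILES:
                path = model_dir / "accuracy" / name
                if not path.is_file():
                    missing.append(path)
    return missing
```

An empty list means the layout is complete; it says nothing about whether the file contents are valid.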
What gets checked
| Rule | Spec | Description |
|---|---|---|
| path-exists | §1 | Submission root directory exists |
| required-dir | §1 | systems/ and pareto/ present |
| system-description-present | §1 | At least one *.json file found in systems/ |
| system-description-valid | §1 | systems/*.json parses against schema |
| src-dir | §1 | src/ present for Standardized submissions |
| pareto-dir-exists | §1 | pareto/<system_id>/ directory exists |
| benchmark-model-dir | §1 | At least one benchmark-model directory in pareto/<system_id>/ |
| pareto-subdir | §1 | points/, results/, accuracy/ present |
| measurement-points-present | §1 | At least one point_*.yaml found |
| point-config-valid | §1 | YAML parses against PointConfig schema |
| point-filename-concurrency | §1 | Filename concurrency matches declared value |
| result-file-present | §1 | Result summary log exists for each point config |
| result-detail-present | §1 | Result detail log exists for each point config |
| result-file-valid | §1 | Result summary log parses against PointSummary schema |
| point-count | §2, §8 | 7–32 measurement points |
| point-cap | §2, §8 | Point count does not exceed 32 |
| low-latency-coverage | §3 | At least one point in Low Latency region |
| low-throughput-coverage | §4 | At least one point in Low Throughput region |
| med-throughput-coverage | §5 | At least one point in Medium Throughput region |
| high-throughput-coverage | §6 | At least one point in High Throughput region |
| max-concurrency-declared | §7 | max_supported_concurrency field present |
| region-computation | §7 | M > 32 (required for region formula) |
| concurrency-in-range | §9 | Concurrency within region bounds (incl. 10% margin) |
| load-pattern | §10 | load_pattern is concurrency with a positive concurrency level |
| point-duration | §11 | Point meets per-region minimum duration |
| min-query-count | §12 | n_samples_completed meets dataset-specific minimum (§6.4) |
| streaming-config | §13 | stream_all_chunks is True |
| metric-consistency-duration | §14 | duration_ns > 0 |
| metric-consistency-accounting | §14 | completed + failed == issued |
| metric-consistency-output-tokens | §14 | total_output_tokens ≥ 0 |
| metric-consistency-system-tps | §9.1 | Stored system_tps consistent with derived value |
| metric-consistency-tps-per-user | §9.1 | Stored tps_per_user consistent with system_tps / concurrency |
| accuracy-file | §15 | accuracy.txt and accuracy_result.json present |
| accuracy-valid | §15 | accuracy_result.json parses correctly |
| accuracy-consistency | §15 | passed flag consistent with score >= quality_target |
| accuracy-gate | §15 | Score ≥ quality target |
| config-consistency-dataset | §16 | All points use the same dataset |
| config-consistency-model | §16 | Directory name matches benchmark_model |
| region-declared | §8.3 | Declared region field (if present) is valid and matches computed region |
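Several of the metric-consistency rules reduce to simple arithmetic on a point's summary. The sketch below shows how a few of them could be evaluated. It is not the checker's code: the dictionary keys mirror the names in the table but are assumed, and the 1% relative tolerance is an illustrative choice, since this README does not state one.

```python
import math


def check_point(summary, concurrency, rel_tol=0.01):
    """Return the names of the illustrative rules that fail for one point.

    summary: dict with assumed keys named after the table above.
    concurrency: positive concurrency level of the point (see load-pattern).
    rel_tol: assumed tolerance for the tps_per_user comparison.
    """
    failures = []
    if summary["duration_ns"] <= 0:
        failures.append("metric-consistency-duration")
    if summary["completed"] + summary["failed"] != summary["issued"]:
        failures.append("metric-consistency-accounting")
    if summary["total_output_tokens"] < 0:
        failures.append("metric-consistency-output-tokens")
    # tps_per_user should match system_tps divided by the concurrency.
    expected = summary["system_tps"] / concurrency
    if not math.isclose(summary["tps_per_user"], expected, rel_tol=rel_tol):
        failures.append("metric-consistency-tps-per-user")
    return failures


def check_point_count(n_points):
    """point-count / point-cap: between 7 and 32 measurement points."""
    return 7 <= n_points <= 32
```

For example, a point with system_tps of 100.0 at concurrency 4 is expected to report a tps_per_user close to 25.0.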
Programmatic API
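from pathlib import Path

from submission_checker import SubmissionChecker, Report

# Point the checker at the submission root (the <org>/ directory).
checker = SubmissionChecker(Path("/submissions/acme_corp"))
report = checker.run()

if report.passed:
    print("All checks passed")
else:
    # Each error carries the rule name and a human-readable message.
    for result in report.errors:
        print(f"[{result.rule}] {result.message}")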
The Report object also exposes report.warnings and serialises cleanly via report.model_dump_json().
Development
uv run pytest # run all tests
uv run pytest --no-cov -x # fast fail on first error
uv run ruff check src/ tests/ # lint
uv run ruff format src/ tests/ # auto-format