Common Unified Benchmark Environments
CUBE Standard
[!NOTE] CUBE is in active development (alpha). Interfaces may change. We welcome early adopters and contributors who want to shape the standard, not just use it. See our Roadmap and Contributing Guide. Serious contributors can apply here to become part of the team.
This repo contains the code and documentation for the AI Alliance: CUBE Standard project, which standardizes benchmark wrapping so the community can wrap otherwise-incompatible benchmarks uniformly and use them everywhere.
CUBE Standard defines the protocol — the Tool, Task, Benchmark, Observation, and Action interfaces that any benchmark must implement. cube-harness is the evaluation runtime that runs agents against CUBE-compatible benchmarks.
Paper: arXiv:2603.15798
Principal developer: ServiceNow AI Research.
Installation
Requires Python 3.12+. Install with uv:
uv add cube-standard
Or with pip:
pip install cube-standard
To include optional container backends:
# Docker support
uv add "cube-standard[docker]"
# Modal support
uv add "cube-standard[modal]"
# Daytona support
uv add "cube-standard[daytona]"
For development (includes test and lint tools):
git clone https://github.com/The-AI-Alliance/cube-standard
cd cube-standard
uv sync --extra dev
CLI commands
| Command | What it does |
|---|---|
| cube init [NAME] | Scaffolds a new benchmark package from the built-in template |
| cube list | Lists all installed benchmarks registered under cube.benchmarks entry points |
| cube test NAME | Runs the debug suite and asserts reward == 1.0 on every debug task |
For benchmark contributors
Fast path — copy the reference implementation, rename, and iterate:
cp -r examples/counter-cube my-bench
cd my-bench && uv sync
# Edit @tool_action decorated methods in src/*/tool.py
# Edit reset() and evaluate() in src/*/task.py
# Edit benchmark_metadata, task_metadata, task_config_class, _setup() and close() in src/*/benchmark.py
# Expose get_debug_benchmark and get_debug_agent in src/*/debug.py
cube test my-bench
Or scaffold from the template:
cube init my-bench # scaffold a new benchmark package from the template
cd my-bench
uv sync
cube test my-bench # run the debug compliance suite
See CONTRIBUTING.md for the five-layer architecture and implementation order.
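The workflow above (tool actions, a task with reset() and evaluate(), and a debug run that must reach reward == 1.0) can be pictured with a standalone toy. This sketch does not import cube-standard and is not its API: the `tool_action` stand-in, the `CounterTool` and `CounterTask` classes, `run_debug_task`, and all signatures are assumptions made for illustration. Only the method names reset() and evaluate(), the `tool_action` decorator name, and the reward check come from the text above.

```python
from dataclasses import dataclass, field


def tool_action(fn):
    """Stand-in decorator (assumption): only marks a method as an agent-callable action."""
    fn.is_tool_action = True
    return fn


@dataclass
class CounterTool:
    """Toy tool holding a single integer counter."""
    value: int = 0

    @tool_action
    def increment(self) -> int:
        self.value += 1
        return self.value


@dataclass
class CounterTask:
    """Toy task: bring the counter to `target`."""
    target: int = 3
    tool: CounterTool = field(default_factory=CounterTool)

    def reset(self) -> str:
        # Start from a fresh tool and return the task instruction.
        self.tool = CounterTool()
        return f"Increment the counter to {self.target}."

    def evaluate(self) -> float:
        # Reward is 1.0 only when the goal state is reached exactly.
        return 1.0 if self.tool.value == self.target else 0.0


def run_debug_task(task: CounterTask) -> float:
    """Scripted 'debug agent' (assumption): solve the task with known-good actions."""
    task.reset()
    for _ in range(task.target):
        task.tool.increment()
    return task.evaluate()


if __name__ == "__main__":
    reward = run_debug_task(CounterTask(target=3))
    assert reward == 1.0  # the kind of check the debug suite is described as making
    print(reward)
```

The point of the toy is the division of labor: the tool exposes actions, the task owns setup and scoring, and a deterministic debug agent gives a known-good trajectory to test against. The real base classes and signatures are described in CONTRIBUTING.md.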
Getting Involved
All contributions are welcome — open an issue, submit a PR, or wrap a new benchmark. See CONTRIBUTING.md for the development guide and RFC process.
Want to contribute a benchmark? Whether you're an original author or just a frequent user, fill out this short form to let us know. No commitment required — we'll follow up based on your interest and the benchmark's fit.
Want deeper involvement? Join the core team, shape the roadmap, and get credit for what you build. Apply here.
For general AI Alliance contribution guidelines, see the community repo and Code of Conduct.
All code contributions are licensed under the Apache 2.0 license (a copy is in this repo: LICENSE.Apache-2.0).
All documentation contributions are licensed under the Creative Commons Attribution 4.0 International license (a copy is in this repo: LICENSE.CC-BY-4.0).
All data contributions are licensed under the Community Data License Agreement - Permissive - Version 2.0 (a copy is in this repo: LICENSE.CDLA-2.0).
We use the "Developer Certificate of Origin" (DCO).
[!WARNING] Before you make any git commits with changes, understand what's required for DCO.
See the Alliance contributing guide section on DCO for details. In practical terms, supporting this requirement means you must use the -s flag with your git commit commands.
Pre-commit hooks (recommended)
This repo uses the pre-commit framework to run fast checks locally before you commit, including enforcing the DCO Signed-off-by line.
Install the hooks (you only need to do this once per clone):
pre-commit install --hook-type pre-commit --hook-type commit-msg
Run the checks on all files (optional, useful the first time):
pre-commit run --all-files
When committing, include your sign-off:
git commit -s -m "your message"
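For intuition about what a DCO check looks for, the sketch below tests whether a commit message carries a `Signed-off-by: Name <email>` trailer, which is what `git commit -s` appends. It is an illustrative assumption, not the hook this repo actually runs; the function name `has_signoff` and the regular expression are hypothetical.

```python
import re

# A trailer line of the form "Signed-off-by: Name <user@host>" (illustrative pattern).
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>@\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_signoff(message: str) -> bool:
    """Return True if any line of the commit message is a Signed-off-by trailer."""
    return SIGNOFF_RE.search(message) is not None


if __name__ == "__main__":
    signed = "Fix typo\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
    unsigned = "Fix typo\n"
    print(has_signoff(signed), has_signoff(unsigned))
```

A real commit-msg hook would read the message from the file path git passes to it and exit non-zero when the trailer is missing.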
File details
Details for the file cube_standard-0.1.0rc4.tar.gz.
File metadata
- Download URL: cube_standard-0.1.0rc4.tar.gz
- Upload date:
- Size: 56.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ab38a1d881b545b645b92503b947f945bcff4930e522934411fd87b10d08bd03 |
| MD5 | 2a8ddc557e1a3e70bc8e54b7d53306fc |
| BLAKE2b-256 | c6455ba997b63cddba438ec92aaa6f585eb51521871dc3147ed19a695b24ce69 |
Provenance
The following attestation bundles were made for cube_standard-0.1.0rc4.tar.gz:
Publisher: release.yml on The-AI-Alliance/cube-standard
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cube_standard-0.1.0rc4.tar.gz
- Subject digest: ab38a1d881b545b645b92503b947f945bcff4930e522934411fd87b10d08bd03
- Sigstore transparency entry: 1185818377
- Sigstore integration time:
- Permalink: The-AI-Alliance/cube-standard@ee172c40c7b402bdfda77422b6811b0a24c93ee1
- Branch / Tag: refs/tags/cube-standard/v0.1.0rc4
- Owner: https://github.com/The-AI-Alliance
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ee172c40c7b402bdfda77422b6811b0a24c93ee1
- Trigger Event: push
File details
Details for the file cube_standard-0.1.0rc4-py3-none-any.whl.
File metadata
- Download URL: cube_standard-0.1.0rc4-py3-none-any.whl
- Upload date:
- Size: 73.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c198b6bade21842b2da281af077202974b2cb809e30e35adfc5bca98a45661c |
| MD5 | 535119904fb4ed9a1f3d6a213d886036 |
| BLAKE2b-256 | 66c21f5385ae569088d8ce596e418a9b2ee1a254d427a53887d2a12e02f61f07 |
Provenance
The following attestation bundles were made for cube_standard-0.1.0rc4-py3-none-any.whl:
Publisher: release.yml on The-AI-Alliance/cube-standard
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cube_standard-0.1.0rc4-py3-none-any.whl
- Subject digest: 9c198b6bade21842b2da281af077202974b2cb809e30e35adfc5bca98a45661c
- Sigstore transparency entry: 1185818379
- Sigstore integration time:
- Permalink: The-AI-Alliance/cube-standard@ee172c40c7b402bdfda77422b6811b0a24c93ee1
- Branch / Tag: refs/tags/cube-standard/v0.1.0rc4
- Owner: https://github.com/The-AI-Alliance
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ee172c40c7b402bdfda77422b6811b0a24c93ee1
- Trigger Event: push