concerto-multirobot

Heterogeneous Multi-Robot Ad-Hoc Teamwork — benchmark + safety stack.

Contact-rich coordination with opaque, heterogeneous teammates — with explicit safety assumptions and conformal-CBF reporting.
CONCERTO is the method. CHAMBER is the benchmark. We evaluate CONCERTO on CHAMBER.

Status

Pre-release, Phase 0. Architecture is locked in 15 ADRs (13 Accepted, 2 RFC) under the working policy recorded in adr/ADR-INDEX.md; the staged Phase-0 spike protocol (ADR-007) is the validation gate that promotes Accepted ADRs to Validated with per-axis ≥20 pp evidence. The Stage-1 (AS + OM) preregistrations are the next launch; the leaderboard fills with M5. The public API is on 0.x, so MINOR bumps may break it per SemVer §4. See Roadmap.

TL;DR. CONCERTO is a three-layer safety stack (exponential CBF‑QP, conformal-slack overlay, OSCBF inner filter) backed by a hard braking fallback, for robots that must work with opaque, heterogeneous teammates they were never trained with. CHAMBER is the matching benchmark: six heterogeneity sub-axes above ManiSkill v3, fixed-format communication with URLLC-anchored degradation profiles, a partner zoo, and an ISO 10218-2:2025-aware safety-reporting format. Open from day one, with an ADR-tracked design contract, preregistered spikes, and byte-identical CPU determinism via uv.lock + a root_seed.

Table of contents

Quickstart
Architecture at a glance
Why this exists
The six heterogeneity axes CHAMBER measures
Repository layout
Leaderboard
Who this is for
Documentation
Roadmap
FAQ
Non-goals
Contributing
Stability & versioning
Citing CONCERTO & CHAMBER
Acknowledgments
License

Quickstart

30-second smoke test.

git clone https://github.com/fsafaei/concerto.git
cd concerto
pip install uv && uv sync --group dev --group train

# Smoke test the rig (ADR-001 acceptance criterion).
uv run pytest -m smoke -x -v

Install groups. --group dev pulls the developer toolchain (ruff, pyright, pytest). HARL ships as the harl-aht distribution (the CONCERTO fork at fsafaei/harl-fork; see ADR-002 §Revision-history 2026-05-19 and #132) and is pulled automatically as a runtime dependency, so the ego-AHT trainer + frozen-HARL partner work out of the box from a source checkout — no separate train-group install needed. The concerto-multirobot distribution ships to TestPyPI for the 0.x line; the production-PyPI debut is staged in the Release workflow.

Compose a factory-floor channel (URLLC-anchored degradation profile from ADR-006) and round-trip a packet through encode → decode:

from chamber.comm import (
    CommDegradationWrapper,
    FixedFormatCommChannel,
    URLLC_3GPP_R17,
)

channel = CommDegradationWrapper(
    FixedFormatCommChannel(),
    URLLC_3GPP_R17["factory"],
    tick_period_ms=1.0,
    root_seed=0,
)

state = {
    "pose": {
        "ego": {"xyz": (0.0, 0.0, 0.0), "quat_wxyz": (1.0, 0.0, 0.0, 0.0)},
    },
    "task_state": {"ego": {"grasp_side": "left"}},
}

# The factory profile delays each packet by ~5 ticks; drain the queue so
# the visible packet carries the freshly-encoded state.
for _ in range(10):
    packet = channel.encode(state)
decoded = channel.decode(packet)
print("decoded payload:", decoded)

Save the snippet to quickstart.py and run uv run python quickstart.py.

The six pre-registered URLLC profiles — ideal, urllc, factory, wifi, lossy, saturation — are the Stage‑2 CM sweep table. See docs/how-to/run-spike.md for the full flow. For the bigger picture, jump to Architecture at a glance.
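The Quickstart loop calls encode ten times because a delayed channel only surfaces a packet once it has aged past the profile's latency. The toy below illustrates that behaviour with a fixed five-tick delay. `ToyDelayChannel` and its `delay_ticks` parameter are hypothetical stand-ins written for this illustration, not the `chamber.comm` implementation, and the sketch ignores the jitter and drop rates that the CM axis also varies.

```python
from collections import deque


class ToyDelayChannel:
    """Hypothetical fixed-latency channel (illustration only, not chamber.comm)."""

    def __init__(self, delay_ticks: int) -> None:
        self.delay_ticks = delay_ticks
        self._in_flight: deque = deque()  # (send_tick, state) pairs, oldest first
        self._tick = 0

    def encode(self, state):
        """Send `state`, then return the newest packet that is old enough to be visible."""
        self._in_flight.append((self._tick, state))
        visible = None
        # Release every packet whose age has reached the configured delay.
        while self._in_flight and self._tick - self._in_flight[0][0] >= self.delay_ticks:
            visible = self._in_flight.popleft()[1]
        self._tick += 1
        return visible


channel = ToyDelayChannel(delay_ticks=5)
seen = [channel.encode({"tick": t}) for t in range(10)]
print(seen[:5])  # nothing is visible during the first five ticks
print(seen[-1])  # the packet that was sent five ticks before the last call
```

In the Quickstart every call encodes the same state, so any packet that becomes visible after the delay already carries the payload of interest.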

Architecture at a glance

Two top-level packages, one wheel. CHAMBER (benchmark) wraps ManiSkill v3 and provides the six heterogeneity axes, the communication stack, the partner zoo, and the evaluation harness. CONCERTO (method) provides the safety stack and the ego-AHT training loop. Dependency direction is one-way: chamber → concerto.

flowchart LR
    subgraph CHAMBER["CHAMBER · benchmark"]
        direction TB
        ENVS["envs<br/>ManiSkill v3 wrappers<br/>AS · OM · CR"]
        COMM["comm<br/>fixed-format channel +<br/>URLLC degradation · CM"]
        PART["partners<br/>partner zoo · PF<br/>heuristic / frozen-RL / VLA"]
        EVAL["evaluation<br/>HRS · prereg · leaderboard"]
        BENCH["benchmarks<br/>Stage-0/1/2/3 spike runners"]
    end
    subgraph CONCERTO["CONCERTO · method"]
        direction TB
        SAFETY["safety<br/>exp CBF-QP + conformal +<br/>OSCBF + braking · SA"]
        TRAIN["training<br/>ego-AHT loop +<br/>deterministic seeding"]
        API["api<br/>public Protocols"]
    end
    ENVS --> BENCH
    COMM --> BENCH
    PART --> BENCH
    BENCH --> EVAL
    BENCH -- ego-policy --> TRAIN
    TRAIN -- filtered actions --> SAFETY
    SAFETY -. consumes .-> API
    CHAMBER -- "depends on (one-way)" --> CONCERTO

The axis labels on the diagram nodes (AS, OM, CR, CM, PF, SA) tie each module to the heterogeneity sub-axis it exercises; see the six heterogeneity axes for the per-axis pre-registered ≥20 pp gap rule.

Why this exists

Real factories already pair robots that were never trained together. A 500 Hz industrial arm next to a 50 Hz mobile base; a vision-only manipulator next to a force-feedback one; a vendor‑A controller next to a vendor‑B controller under binding ISO 10218-2:2025. At deployment time, your robot's teammate is opaque (no policy access), heterogeneous (different morphology and action frequency), and ad hoc (no prior joint training). Hospitals and warehouses are the same picture.

Most multi-robot benchmarks assume identical embodiments and shared training. The few that don't focus on planning or navigation, not on contact-rich physical manipulation. The intersection of Heterogeneity × Black-box partner × Safety × Manipulation is empty in the published literature. CHAMBER is built to fill it, and CONCERTO is the first method designed against this four-aspect contract; empirical validation is staged through CHAMBER spikes (Stage 1 → Stage 3) per ADR-007 §Decision.

How we sit relative to the closest prior work

Every prior precedent covers at most three of the four aspects. The table below lists the closest precedent for each pair of aspects; no published row hits all four.

Method	Heterogeneous	Black-box partner	Safety bound	Contact-rich manipulation
Liu 2024 RSS (LLM‑AHT)	✓	✓	–	–
COHERENT (LLM‑MR planning)	✓	✓	–	–
Huriot & Sibai 2025 (conformal CBF)	–	✓	✓	–
HetGPPO / HARL (heterogeneous MARL)	✓	–	–	–
Wang et al. 2017 (multi‑robot CBFs)	✓	–	✓	–
RoCoBench (multi‑robot manipulation)	✓	–	–	✓
SafeBimanual (safe bimanual manip.)	–	–	✓	✓
CONCERTO + CHAMBER	✓	✓	✓	✓

Reading the table. Heterogeneous here is the four-aspect literature-gap level; CHAMBER's six measurable sub-axes (AS, OM, CR, CM, PF, SA) decompose it further per ADR-007.

Read the table by columns to see what each aspect covers in isolation, and by rows to see what no single line of work has yet combined. Contact-rich manipulation appears with multi-robot coordination (RoCoBench) and with safety (SafeBimanual), but never with black-box ad-hoc partners under explicit safety assumptions at the same time. CONCERTO + CHAMBER occupy the four-aspect intersection at the design-contract level (ADRs, scaffold, smoke test); empirical validation across the six heterogeneity sub-axes is the staged Phase-0 spike protocol's job (Stage 1: AS + OM → Stage 2: CR + CM → Stage 3: PF + SA), with results landing on the leaderboard from M5 onward.

See adr/ADR-007 for the six-axis taxonomy that defines "heterogeneous" precisely, and the docs/explanation/why-aht.md page for the long-form positioning.

The six heterogeneity axes CHAMBER measures

Axis	Symbol	What it varies	Where the priors come from
Action space	AS	7‑DOF arm vs 2‑DOF mobile base on shared task	HARL, HetGPPO
Observation modality	OM	vision-only vs vision + force/torque + proprioception	Visual-tactile peg-in-hole literature
Control rate	CR	500 Hz arm vs 50 Hz base, chunk size held constant	RTC, A2C2, FAVLA
Communication	CM	latency 1–100 ms, jitter µs–10 ms, drop 10⁻⁶–10⁻²	3GPP R17, URLLC
Partner familiarity	PF	trained-with vs frozen-novel partner, mid-episode swap	FCP, MEP
Safety	SA	mixed-vendor force-limit / SIL-PL pairs, contact-rich	ISO 10218-2:2025

Every surviving axis is required to clear a pre-registered ≥20 pp homogeneous-vs-heterogeneous gap before it ships in the v1 benchmark. See adr/ADR-007 for the staged Phase‑0 spike protocol (Stage 1: AS + OM, Stage 2: CR + CM, Stage 3: PF + SA).
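As a rough illustration of the gap rule, the sketch below computes a homogeneous-vs-heterogeneous gap in percentage points and compares it to the 20 pp threshold. It assumes the gap is measured as the drop in task success rate between the two conditions; the function names and the example rates are hypothetical, and the authoritative definition lives in ADR-007 and the pre-registration files.

```python
def gap_pp(homogeneous_success: float, heterogeneous_success: float) -> float:
    """Drop in success rate from the homogeneous to the heterogeneous condition, in percentage points.

    Both inputs are success rates in [0, 1]. Using success rate as the metric is an assumption.
    """
    return 100.0 * (homogeneous_success - heterogeneous_success)


def axis_survives(
    homogeneous_success: float,
    heterogeneous_success: float,
    threshold_pp: float = 20.0,
) -> bool:
    """True when the measured gap clears the pre-registered threshold."""
    return gap_pp(homogeneous_success, heterogeneous_success) >= threshold_pp


# Illustrative numbers only; these are not measured results.
print(axis_survives(0.85, 0.60))  # gap of about 25 pp clears the threshold
print(axis_survives(0.85, 0.70))  # gap of about 15 pp does not
```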

Repository layout

src/
├── concerto/      # the METHOD  (cite this)
│   ├── safety/    #   exp CBF-QP + conformal overlay + OSCBF + braking fallback
│   ├── training/  #   ego-AHT training loop + deterministic seeding
│   ├── policies/  #   Phase-1 trained checkpoints
│   └── api/       #   public Protocols
└── chamber/       # the BENCHMARK  (run this)
    ├── envs/      #   ManiSkill v3 wrappers
    ├── comm/      #   fixed-format channel + URLLC degradation
    ├── partners/  #   partner zoo (heuristic / frozen-RL / VLA stubs)
    ├── tasks/     #   CHAMBER-Solo / Duo / Quartet (Phase 1+)
    ├── evaluation/#   HRS, pre-registration, leaderboard renderer
    └── benchmarks/#   Stage-0/1/2/3 spike runners

adr/               # 15 Architecture Decision Records (the design rationale)
docs/              # Diátaxis: tutorials / how-to / reference / explanation
tests/             # unit / property / integration / smoke / reproduction
spikes/            # pre-registration YAMLs + result archives

Leaderboard

Stage‑0 acceptance results; rendered by chamber-render-tables after each tagged spike. Stage 1 (AS + OM) rows land with M5; see Roadmap. The table below is a placeholder.

Method	Stage 0 success	Inter-robot collision	Force-limit violation	Conformal λ mean	Reference
MAPPO (homogeneous baseline)	pending	pending	pending	n/a	M5
HetGPPO + naive CBF	pending	pending	pending	n/a	M5
CONCERTO	pending	pending	pending	pending	M5

Submit a new entry: docs/how-to/submit-leaderboard.md.

Who this is for

Multi-robot RL researchers — CHAMBER is the first benchmark to score ad-hoc teamwork at the manipulation tier with a measurable heterogeneity-robustness score (HRS). Start with docs/tutorials/hello-spike.md.

Safe-control researchers — CONCERTO's safety module is a research-grade reference implementation of the exp CBF + conformal + OSCBF stack with a hard braking fallback (see Non-goals: it is not a certified safety product). The unresolved theoretical question (average-loss → per-step bound) is documented in adr/ADR-004.

Robotics practitioners and integrators — CHAMBER's communication profiles are anchored to 3GPP Release 17 URLLC and 5G-TSN industrial-trial data, and the safety axis references ISO 10218-2:2025 directly. See docs/explanation/threat-model.md.

Documentation

Full documentation: fsafaei.github.io/concerto

Tutorials — step-by-step walkthroughs.
How-tos — add a partner, add a safety filter, run a spike.
API reference — generated from docstrings.
ADR index — 15 design decisions with full rationale.
Glossary — HRS, AoI, OSCBF, FCP/MEP, all defined.
Literature — five-cluster bibliography (AHT/ZSC, safe control, conformal prediction, benchmarks, reproducibility).
Standards — ISO 10218-2:2025 + IEC 62061 + IEEE TSN + 3GPP R17 references, with the axis → standard → metric → report-table flowchart.
Evaluation — the multi-seed and rliable reporting contract for the leaderboard.

Roadmap

The project advances in three phases. Phase 0 (current) locks the design contract and runs the staged heterogeneity-axis spikes. Phase 1 ships the partner zoo and the populated leaderboard. Phase 2 expands tasks and adds the real-robot demo platform.

Now — Phase 0, design contract live, spikes about to start. 15 ADRs (13 Accepted, 2 RFC) under the status taxonomy in adr/ADR-INDEX.md; open follow-up work is tracked per-ADR via the footnote column. M1 (platform), M2 (comm), and M4b (training stack) are merged on main. The chamber-spike CLI runs the ego-AHT loop end-to-end against a Hydra config.

Next. Stage-1 spikes (AS + OM) — preregistered, launched, first leaderboard rows. arXiv design-report preprint (priority defence on the four-aspect framing). Stage-2 spikes (CR + CM).

Later. Stage-3 spikes (PF + SA) — possibly HIL for SA. Phase-1 leaderboard v1 (CONCERTO + 3 baselines on Tier-1 / Tier-2 tasks). Phase 2 (Tier-3 long-object tasks, real-robot demo platform).

Day-to-day progress: CHANGELOG.md and the issues board.

FAQ

How does CHAMBER differ from RoCoBench, SafeBimanual, or BiGym?

RoCoBench covers Heterogeneity × Manipulation on MuJoCo with multi-arm LLM-dialectic coordination but does not address black-box partners or formal safety bounds. SafeBimanual covers Safety × Manipulation on a single bimanual platform. BiGym is single-embodiment. CHAMBER targets the four-aspect intersection (H × B × S × M) at the substrate level — thin wrapper layers above ManiSkill v3, a fixed-format communication stack, and a partner zoo — rather than as a curated task set. See ADR-001 and ADR-005 for the simulator-base decision.

Is the safety guarantee per-step or asymptotic?

The conformal slack overlay (Huriot & Sibai 2025 Theorem 3) gives a distribution-free ε + o(1) long-term average-loss bound, not a per-step bound. For contact-rich manipulation where a single violation can be irreversible, the hard braking fallback (Wang‑Ames‑Egerstedt 2017 eq. 17) is the per-step backstop. Sharpening the average-loss bound to per-step is the project's headline open theoretical question; see ADR-004 Open Questions. The conformal layer's average-loss bound is an Accepted claim under the ADR status taxonomy with the per-step refinement flagged as Open work in ADR-004 (see also ADR-INDEX footnote a); promoting the layer to Validated is gated on the Stage-1 AS spike and the follow-up safety-stack refactor.
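To make the average-loss versus per-step distinction concrete, here is a minimal sketch of a generic online conformal update in the style of adaptive conformal inference. It is not Huriot & Sibai's algorithm and not the `concerto.safety` code; the scores are synthetic, and `run_online_conformal`, `eta`, and `lam0` are names chosen for this illustration.

```python
import random


def run_online_conformal(scores, epsilon: float, eta: float, lam0: float = 0.5):
    """Track a slack `lam` so that the long-run miss rate approaches `epsilon`.

    A "miss" at step t means the score exceeded the current slack.
    Returns (average miss rate, final slack).
    """
    lam = lam0
    misses = 0
    for s in scores:
        miss = 1 if s > lam else 0
        misses += miss
        # lam_{t+1} = lam_t + eta * (loss_t - epsilon):
        # widen the slack after a miss, tighten it slightly otherwise.
        lam += eta * (miss - epsilon)
    return misses / len(scores), lam


rng = random.Random(0)
T, epsilon, eta = 5000, 0.1, 0.05
scores = [rng.random() for _ in range(T)]  # synthetic scores in [0, 1)
avg_loss, final_lam = run_online_conformal(scores, epsilon, eta)
print(f"average loss = {avg_loss:.4f} (target {epsilon})")
```

Summing the update over all steps gives avg_loss - epsilon = (final_lam - lam0) / (eta * T), so the average is pinned near epsilon as T grows. The same identity says nothing about any individual step, where misses still occur, which is why a per-step backstop such as the braking fallback is needed.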

Can I plug in my own partner or safety filter?

Yes. Partners implement the FrozenPartner Protocol in chamber.partners.api; register with the @register_partner decorator from chamber.partners. See Add a partner. Safety filters implement the SafetyFilter Protocol in concerto.safety.api; see Add a filter.

When will the leaderboard be populated?

Stage-1 (AS + OM) rows land with M5. The remaining rows fill as the staged spikes complete; see Roadmap.

What's the relationship between CONCERTO and CHAMBER?

CONCERTO is the method (safety stack + ego-AHT training); CHAMBER is the benchmark (env wrappers + comm + partner zoo + evaluation). Two top-level packages in one wheel, with a one-way dependency: chamber → concerto. Canonical sentence: we evaluate CONCERTO on CHAMBER.

Is this reproducible bit-for-bit?

CPU runs are byte-identical under uv.lock + a root_seed via the determinism harness in concerto.training.seeding. GPU runs are deterministic up to the underlying CUDA non-determinism in PyTorch reductions; the rliable-style aggregate metrics defined in docs/reference/evaluation are the canonical way to compare across seeds.
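One common way to get reproducible runs from a single root seed is to derive an independent child seed per named random stream. The sketch below shows that idea with the standard library only; it is an assumption about the general technique, not the contents of `concerto.training.seeding`, and `derive_seed`, `make_rng`, and the stream names are hypothetical.

```python
import hashlib
import random


def derive_seed(root_seed: int, stream: str) -> int:
    """Hash (root_seed, stream name) into a 64-bit child seed."""
    digest = hashlib.sha256(f"{root_seed}:{stream}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_rng(root_seed: int, stream: str) -> random.Random:
    """Build a generator that depends only on the root seed and the stream name."""
    return random.Random(derive_seed(root_seed, stream))


def sample(root_seed: int, stream: str, n: int = 3):
    """Draw n floats from a freshly built generator for the given stream."""
    rng = make_rng(root_seed, stream)
    return [rng.random() for _ in range(n)]


# Same root seed and stream name reproduce the same draws on every run;
# a different stream name yields an unrelated sequence.
print(sample(0, "comm") == sample(0, "comm"))
print(sample(0, "comm") == sample(0, "partner"))
```

Keying streams by name means adding a new consumer of randomness does not shift the draws seen by existing ones.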

Why ManiSkill v3 and not Isaac Lab?

ADR-001's contingent rule was "extend the simulator if its abstractions admit the heterogeneity-axis controls without monkey-patching." ManiSkill v3 passes that test at ≈230 LOC of wrappers; Isaac Lab would have required a 3-month standalone build. Isaac Lab remains a viable secondary path if upstream API constraints force a migration — the env-adapter layer is intentionally thin so the swap is Type-2 reversible. See ADR-001 and ADR-005.

Non-goals

CHAMBER is not a navigation, planning, or generic-RL benchmark; the four-aspect intersection requires contact-rich physical manipulation. CONCERTO is not a certified safety product — it is a research-grade reference implementation of the exp CBF + conformal + OSCBF stack and is not a substitute for safety engineering in production deployments. The project does not ship pretrained partner checkpoints in Phase 0; the partner zoo construction lands in Phase 1.

Contributing

This is a research project, but it is open from the first commit. We welcome PRs.

Read CONTRIBUTING.md for the development flow.
Look at issues labelled good-first-issue.
Sign your commits (git commit -S). A DCO sign-off (the Signed-off-by: trailer, added with git commit -s) is also required.
External contributors: the CLA bot will guide you on first PR.
Every PR cites the ADR section it touches (e.g. ADR-004 §6.2). We treat the ADRs as the design contract; if your PR motivates a change to them, propose a new ADR rather than editing an Accepted one.

Code of Conduct: CODE_OF_CONDUCT.md. Security policy: SECURITY.md.

Stability & versioning

This project follows Semantic Versioning. Under 0.x, MINOR-version bumps may break the public API per SemVer §4. The public API surfaces are concerto.api, concerto.safety.api, and chamber.comm; everything else is implementation detail and subject to change without notice. The wire-format chamber.comm.SCHEMA_VERSION constant is the single source of truth for the fixed-format packet shape; bumping it is a breaking change and requires a new ADR.

Citing CONCERTO & CHAMBER

If you use CONCERTO or CHAMBER in your research, please cite the preprint. Until the preprint is on arXiv (target: 2026‑06), cite the archived software release via its Zenodo DOI:

@software{safaei2026concerto,
  author       = {Safaei, Farhad},
  title        = {{CONCERTO} and {CHAMBER}: Contact-rich Coordination
                  with Opaque, Heterogeneous Teammates},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.20128469},
  url          = {https://doi.org/10.5281/zenodo.20128469},
  note         = {arXiv preprint forthcoming},
}

Citation entries are also in CITATION.cff so GitHub renders a "Cite this repository" button.

Acknowledgments

CONCERTO stands on the shoulders of prior work. The safety stack composes Wang et al. 2017 (decentralised exponential CBFs), Huriot & Sibai 2025 (conformal CBFs), and Morton & Pavone 2025 (OSCBF). CHAMBER is a wrapper layer over ManiSkill v3 and depends on a fork of HARL for the training stack. Corrections, acknowledgements, and contributions are welcome via PRs and issues.

License

Apache 2.0. See LICENSE and NOTICE. The full Software Bill of Materials is at sbom.spdx.json and is regenerated on every release.

Project details

Maintainer: fsafaei
Version: 0.7.0, released Jun 11, 2026

Download files

File	Type	Size	Tags	Uploaded	Trusted Publishing	Uploaded via
concerto_multirobot-0.7.0.tar.gz	Source distribution	975.7 kB	Source	Jun 11, 2026	Yes	uv/0.11.20 (uv publish, CI, Ubuntu 24.04)
concerto_multirobot-0.7.0-py3-none-any.whl	Built distribution	277.4 kB	Python 3	Jun 11, 2026	Yes	uv/0.11.20 (uv publish, CI, Ubuntu 24.04)

File hashes

File	Algorithm	Hash digest
concerto_multirobot-0.7.0.tar.gz	SHA256	`835984f168461343d3b94fbadf20ca9685e98220a6988ecea3c5cfdd951fd47a`
concerto_multirobot-0.7.0.tar.gz	MD5	`906d7fd651794cf7415f60ae69831133`
concerto_multirobot-0.7.0.tar.gz	BLAKE2b-256	`a22eaf0bfec2d574d06fe4cb152473391d3ed70f9aa65c6405d9e03a9d34f98f`
concerto_multirobot-0.7.0-py3-none-any.whl	SHA256	`b2f23d4ae2bb770fb4b1c8e5c77c13686788418f9d2537a6890c3e01b9d0d1d9`
concerto_multirobot-0.7.0-py3-none-any.whl	MD5	`0d3e0472c03b1073b930f79aacaca178`
concerto_multirobot-0.7.0-py3-none-any.whl	BLAKE2b-256	`9cb821889ef56c6b7f15095284066c763008fe7ecded345e3d686fb6f27b3108`
