Multi-turn procedural code-execution RL environment with test-feedback rollouts and conformal coverage

verifiable-labs-code-humaneval-multiturn

Multi-turn procedural code-execution RL environment from the Verifiable Labs catalogue. Same problem distribution as code-humaneval, but the model gets up to 3 turns with visible-test feedback between them.

| Turn | What the model sees | What it returns |
|------|---------------------|-----------------|
| 1 | Function signature + docstring + visible test block | First implementation |
| 2 | Visible test pass/fail counts (no test source, no oracle) | Revised implementation |
| 3 | Same as turn 2; the final attempt is scored against visible ∪ hidden tests | Final implementation |
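The turn structure can be sketched as a small loop. This is a minimal illustration, not the package's API: `rollout`, `count_passes`, the `policy` callable, the feedback string format, the early stop once all visible tests pass, and the pass-fraction score are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical test-case shape: (positional args, expected return value).
TestCase = Tuple[tuple, object]


def count_passes(source: str, func_name: str, tests: List[TestCase]) -> int:
    """Execute candidate source and count the test cases it passes."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # a real environment would sandbox this call
        func = namespace[func_name]
    except Exception:
        return 0  # code that fails to load passes nothing
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed


def rollout(
    policy: Callable[[str, Optional[str]], str],
    prompt: str,
    func_name: str,
    visible: List[TestCase],
    hidden: List[TestCase],
    max_turns: int = 3,
) -> Tuple[int, float]:
    """Run up to max_turns attempts; return (turns used, pass fraction)."""
    feedback: Optional[str] = None
    source = ""
    turn = 0
    for turn in range(1, max_turns + 1):
        source = policy(prompt, feedback)
        n_pass = count_passes(source, func_name, visible)
        # Only counts are fed back: no test source, no expected values.
        feedback = f"visible tests: {n_pass} passed, {len(visible) - n_pass} failed"
        if n_pass == len(visible):
            break  # assumed early stop; the source only says "up to 3 turns"
    # The last attempt is scored against visible and hidden tests together.
    all_tests = visible + hidden
    return turn, count_passes(source, func_name, all_tests) / len(all_tests)
```

The hidden tests enter only in the final scoring step, so the policy never receives any signal derived from them.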

A turn-count penalty of 5% per extra turn (capped at 10%) keeps multi-turn from being a free win: a solution that uses all three turns scores 0.9× the equivalent single-turn reward. Hidden tests are never shown to the model (R10: the visible-test pass count is the only feedback signal).
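As a worked example of the penalty arithmetic, the sketch below applies the discount multiplicatively, which matches the 0.9× figure for three turns. The function name `apply_turn_penalty` and its parameters are hypothetical and not taken from the package.

```python
def apply_turn_penalty(
    base_reward: float,
    turns_used: int,
    per_turn: float = 0.05,  # 5% per extra turn
    cap: float = 0.10,       # penalty capped at 10%
) -> float:
    """Discount a single-turn-equivalent reward by the turn-count penalty."""
    if turns_used < 1:
        raise ValueError("turns_used must be at least 1")
    extra_turns = turns_used - 1
    penalty = min(per_turn * extra_turns, cap)
    return base_reward * (1.0 - penalty)


# One turn: no penalty. Two turns: 5% off. Three turns: 10% off (the cap).
for turns in (1, 2, 3):
    print(turns, apply_turn_penalty(1.0, turns))
```

With a three-turn limit the cap is reached exactly on the last turn, so it would only start to bind if the turn limit were raised.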

Install

pip install verifiable-labs-code-humaneval-multiturn

Source of truth + full docs: https://github.com/stelioszach03/verifiable-labs-envs


