Single-turn procedural code-execution RL environment with sandboxed pytest scoring and conformal coverage

Project description

verifiable-labs-code-humaneval

Single-turn procedural code-execution RL environment from the Verifiable Labs catalogue. Each instance carries a function signature, a docstring, and a small visible test set; the model returns Python source, which the env runs against a hidden test battery inside a sandboxed subprocess (D5 limits: 512 MB / 30 s wall / 20 s CPU / unshare -rn network isolation / 16-process fanout cap).
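
As a rough illustration of the limits listed above, the sketch below runs untrusted source in a child process with comparable caps, using only the Python standard library. It is not the package's sandbox: the names `run_limited`, `_cap`, and `_apply_limits` are hypothetical, mapping "512 MB" to an address-space limit is an assumption, and the optional `unshare -rn` prefix assumes a Linux host with util-linux and unprivileged user namespaces enabled.

```python
import resource
import subprocess
import sys
import tempfile
from pathlib import Path

# Values taken from the D5 limits quoted above.
MEM_BYTES = 512 * 1024 * 1024   # assumed to map to RLIMIT_AS
WALL_SECONDS = 30
CPU_SECONDS = 20
MAX_PROCS = 16


def _cap(kind, value):
    """Lower one rlimit to `value`, never exceeding the current hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def _apply_limits():
    # Runs in the child process between fork and exec.
    _cap(resource.RLIMIT_AS, MEM_BYTES)
    _cap(resource.RLIMIT_CPU, CPU_SECONDS)
    _cap(resource.RLIMIT_NPROC, MAX_PROCS)


def run_limited(source, isolate_network=False):
    """Run `source` in a limited subprocess.

    Returns (returncode, stdout, stderr), or None if the wall-clock
    timeout expires first.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(source)
        cmd = [sys.executable, "-I", str(script)]
        if isolate_network:
            # Hypothetical opt-in: new user + network namespace via util-linux.
            cmd = ["unshare", "-rn", *cmd]
        try:
            proc = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                timeout=WALL_SECONDS,
                preexec_fn=_apply_limits,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return None
        return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_limited("print(1 + 1)")` returns the child's exit code together with its captured output. The wall-clock limit is enforced by the parent through `timeout`, while the CPU, memory, and process caps are enforced by the kernel inside the child.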

| Component | Weight | What it rewards |
|---|---|---|
| format_valid | 0.10 | Output is parseable JSON containing a code field |
| parse_valid | 0.20 | Extracted code compiles via compile(..., "exec") |
| pass_rate | 0.70 | Fraction of (visible ∪ hidden) pytest cases that passed |
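
One way to read the component table is as a weighted sum. The sketch below assumes exactly that, and additionally assumes that `pass_rate` only counts when the extracted code compiles; neither detail is confirmed by the source. The function name `score_response` and its arguments are hypothetical, and the pass counts are supplied by the caller rather than produced by running pytest.

```python
import json

# Weights copied from the component table.
WEIGHTS = {"format_valid": 0.10, "parse_valid": 0.20, "pass_rate": 0.70}


def score_response(raw_output, passed, total):
    """Return (reward, components) for one model response.

    `passed` and `total` are the pytest case counts over the visible and
    hidden tests, measured elsewhere (for example in a sandboxed run).
    """
    components = {"format_valid": 0.0, "parse_valid": 0.0, "pass_rate": 0.0}

    # format_valid: output is parseable JSON containing a string "code" field.
    try:
        payload = json.loads(raw_output)
    except (TypeError, ValueError):
        payload = None
    code = payload.get("code") if isinstance(payload, dict) else None

    if isinstance(code, str):
        components["format_valid"] = 1.0
        # parse_valid: the extracted code compiles in "exec" mode.
        try:
            compile(code, "<candidate>", "exec")
        except (SyntaxError, ValueError):
            pass
        else:
            components["parse_valid"] = 1.0
            # Assumption: tests are only credited for code that compiles.
            if total > 0:
                components["pass_rate"] = passed / total

    reward = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return reward, components
```

Under these assumptions, a response that is not valid JSON scores zero, a well-formed response whose code has a syntax error earns only the `format_valid` weight, and a response that passes every case earns the full reward of 1.0.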

The environment ships 12 procedural templates spanning lists, strings, dicts, ints, trees, and graphs. Together they give EFFECTIVE_INSTANCES > 7e23, well above the contamination-resistance gate.

Install

pip install verifiable-labs-code-humaneval

Use

from verifiable_labs_code_humaneval import load_environment

env = load_environment(calibration_quantile=0.5)
inst = env.generate_instance(seed=42)
print(inst.prompt)

Source of truth + full docs: https://github.com/stelioszach03/verifiable-labs-envs

Download files

Source Distribution

verifiable_labs_code_humaneval-0.1.1.tar.gz (3.4 kB)

Built Distribution

verifiable_labs_code_humaneval-0.1.1-py3-none-any.whl

File details

Details for the file verifiable_labs_code_humaneval-0.1.1.tar.gz.

File hashes

Hashes for verifiable_labs_code_humaneval-0.1.1.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | e77c4e70785cf16c69c136b953952ce3b79c3036ca9df17d6154e61066b5a9d8 |
| MD5 | 2e1a2b4616f0aeb72d45ecf1f5bb7d8e |
| BLAKE2b-256 | d45318165c633491df98dcc46148a07ac3c4fb501cf2fa9bb417287812c37d67 |

File details

Details for the file verifiable_labs_code_humaneval-0.1.1-py3-none-any.whl.

File hashes

Hashes for verifiable_labs_code_humaneval-0.1.1-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5cb7b5e27feea2bf337190423e52a105cd1e9abdfea48583e6f9b4f485837812 |
| MD5 | 75e8c8c2ef54817490777a5d04914d4d |
| BLAKE2b-256 | 625909e6cfaeaccf22f56c14abd374a9341dd3d4d117ab0ec1cd87a9d6788cb6 |
