SDK contracts and clean promotion-gate CLI for Verifiable Labs: run modes, model-provider interface, typed schemas, and a property-tested formal-spec mirror.

vlabs-sdk

Verifiable Labs builds clean feedback and promotion gates for increasingly general AI agents.

SDK contracts for the Verifiable Labs platform: run configuration, the model-provider interface, and the typed schemas that evaluation contracts, score sets, gate outcomes, and assurance cards are built from.

  • pip package: vlabs-sdk
  • import package: vlabs_sdk
  • CLI: vlabs (clean promotion gate)

Install

pip install vlabs-sdk        # once published; until then:
pip install "vlabs-sdk @ git+https://github.com/verifiablelabs/vlabs-sdk@main"

Import the pieces you need:

from vlabs_sdk.providers.dummy_provider import DummyProvider
from vlabs_sdk.providers.base import ModelRequest
from vlabs_sdk.schemas import AssuranceCardV2, ScoreSet, TransferMetrics
from vlabs_sdk.run_config import default_config
from vlabs_sdk.formal_spec.clean_promotion_gate import accept_clean_update

Migrating from the legacy verifiable-labs-envs package? See MIGRATION.md.

clean-gate CLI

The vlabs CLI ships with the vlabs-sdk distribution, so no extra install is needed:

pip install vlabs-sdk
vlabs --help
vlabs clean-gate --old baseline.json --new candidate.json
# exit 0 = ACCEPT, exit 1 = REJECT (reasons printed)

The CLI is implemented in the bundled vlabs_prm_eval package (depends only on vlabs_sdk + typer); both import packages are included in the wheel.
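
The exit-code convention above makes the gate easy to drive from a script. The following is a minimal sketch, not part of the SDK: the helper names decision_from_exit_code and run_gate are hypothetical, treating any exit code other than 0 or 1 as an error is an assumption, and so is collecting the printed reasons from both stdout and stderr.

```python
import subprocess
from typing import Sequence, Tuple


def decision_from_exit_code(code: int) -> str:
    """Map the documented exit codes to a label (0 = ACCEPT, 1 = REJECT)."""
    if code == 0:
        return "ACCEPT"
    if code == 1:
        return "REJECT"
    # Assumption: any other code means the command itself failed to run cleanly.
    return "ERROR"


def run_gate(
    old: str,
    new: str,
    executable: Sequence[str] = ("vlabs",),
) -> Tuple[str, str]:
    """Run `vlabs clean-gate --old OLD --new NEW` and return (decision, output).

    `executable` is a parameter so the wrapper can be exercised without the
    real CLI on PATH.
    """
    cmd = [*executable, "clean-gate", "--old", old, "--new", new]
    # check=False: a REJECT is a normal outcome, not an exception.
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    # Assumption: reasons may be printed on either stream, so keep both.
    return decision_from_exit_code(proc.returncode), proc.stdout + proc.stderr
```

In a CI job, a call such as run_gate("baseline.json", "candidate.json") would return the decision and the printed reasons, which the job can log before failing the build on anything other than ACCEPT.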

What ships here

Surface | Path
Run config: modes evaluate_only / gate_only / improve_and_gate / substrate, privacy-preserving defaults | src/vlabs_sdk/run_config.py
Provider interface + dummy provider (validate_config / estimate_cost / run / dry_run) | src/vlabs_sdk/providers/
Schemas: EvaluationContract, ScoreSet, TransferMetrics, GateOutcome, AssuranceCard v2, split policy | src/vlabs_sdk/schemas/
Formal-spec math mirror (clean score, CleanVGS, generalization gap, 8-condition promotion gate) | src/vlabs_sdk/formal_spec/
vlabs clean-gate CLI (ACCEPT exit 0 / REJECT exit 1) | tools/vlabs-prm-eval/
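
To make the provider row concrete, here is a rough, self-contained sketch of what a provider with those four methods could look like. Only the method names (validate_config, estimate_cost, run, dry_run) come from the list above. The parameter and return types, SketchRequest, SketchProvider, and EchoProvider are all assumptions for illustration, and the real vlabs_sdk.providers.base interface may differ.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Protocol


@dataclass(frozen=True)
class SketchRequest:
    """Hypothetical request type; the real ModelRequest fields may differ."""

    prompt: str
    max_tokens: int = 64


class SketchProvider(Protocol):
    """Assumed shape of a provider; only the method names come from the README."""

    def validate_config(self, config: Mapping[str, Any]) -> None: ...

    def estimate_cost(self, request: SketchRequest) -> float: ...

    def run(self, request: SketchRequest) -> str: ...

    def dry_run(self, request: SketchRequest) -> str: ...


class EchoProvider:
    """Offline stand-in that never contacts a model, similar in spirit to a dummy provider."""

    def validate_config(self, config: Mapping[str, Any]) -> None:
        # Reject configs that do not name a model (an illustrative rule only).
        if "model" not in config:
            raise ValueError("config must name a model")

    def estimate_cost(self, request: SketchRequest) -> float:
        # Nothing is sent anywhere, so the cost is zero.
        return 0.0

    def run(self, request: SketchRequest) -> str:
        # Echo the prompt back, truncated to max_tokens characters.
        return request.prompt[: request.max_tokens]

    def dry_run(self, request: SketchRequest) -> str:
        # Describe what would be sent without producing any output text.
        return f"would send {len(request.prompt)} characters"
```

Splitting dry_run from run in this way lets a caller check configuration and cost before spending anything on a real provider.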

This repository is a mirror of the canonical monorepo with the import namespace remapped (see PROVENANCE.md); it becomes canonical at split-flip time.

Formal scope

Selected mathematical properties behind the contamination-resistant promotion gate are machine-verified in Lean 4. The implementation is property-tested against the formal specification.

License

Apache-2.0.

Download files

Source Distribution

vlabs_sdk-0.0.2.tar.gz (43.6 kB)

Uploaded Source

Built Distribution

vlabs_sdk-0.0.2-py3-none-any.whl (49.4 kB)

Uploaded Python 3

File details

Details for the file vlabs_sdk-0.0.2.tar.gz.

File metadata

  • Download URL: vlabs_sdk-0.0.2.tar.gz
  • Upload date:
  • Size: 43.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for vlabs_sdk-0.0.2.tar.gz
Algorithm | Hash digest
SHA256 | f4e839591021b189b0442a1830aeab0c821d61f5f5bdd9e222e21868ae73bb2c
MD5 | 02a415d64a2638670a8283d0f2489d35
BLAKE2b-256 | 758878d56030a75a949674658f878ab86d5d1a95bb4f685f0d09ede0b5170271
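
A downloaded file can be checked against the SHA256 digest listed here using only the Python standard library. This is a minimal sketch: the helper names sha256_of and matches are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for vlabs_sdk-0.0.2.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "f4e839591021b189b0442a1830aeab0c821d61f5f5bdd9e222e21868ae73bb2c"
)


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """True if the file's SHA256 equals the expected hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches("vlabs_sdk-0.0.2.tar.gz", EXPECTED_SDIST_SHA256) should be True for an unmodified copy of the source distribution.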

File details

Details for the file vlabs_sdk-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: vlabs_sdk-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 49.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for vlabs_sdk-0.0.2-py3-none-any.whl
Algorithm | Hash digest
SHA256 | b6d5836ac5319c24f1141db07d6cf5b8ad93fb5c4181e088149da5ac7241a516
MD5 | e3207d0cb1ad78529aab7a9a3b5624f4
BLAKE2b-256 | 687fa0207f434b5474b657b5c91c7f9a7c691ea2c8ebce89d23edb8e60431491
