Decision-contracts library: deterministic verdict/gate/failure/promotion logic for autoresearch loops.

autoresearch-core

A tiny, pure-Python decision-contracts library for autoresearch / agentic loops: a deterministic verdict (metric / comparator / target), failure classification, gates, and promotion record shapes — the disciplined decision core, with zero runtime dependencies and no I/O.

You bring the loop, the retrieval, the runner, and the storage; you bind them to the library's Protocols and call measure / decide / should_promote_dead_end at your decision points. The verdict logic is parity-tested against the GRD autoresearch loop.

Install

pip install autoresearch-core

Requires Python 3.11+. No runtime dependencies.

Quickstart

from autoresearch_core import (
    MetricSpec, ExperimentResult, measure, parse_metrics_line, should_promote_dead_end,
)

spec = MetricSpec(metric_key="recall_at_10", comparator=">=", target=0.8)

# An experiment prints a `__RESULT__` line on stdout; a captured example:
stdout = '__RESULT__ {"recall_at_10": 0.83}'
metrics = parse_metrics_line(stdout)        # -> {"recall_at_10": 0.83}
verdict = measure(spec, ExperimentResult(metrics=metrics, exit_code=0))

verdict.verdict          # "supported" | "refuted" | "inconclusive"  (deterministic)
verdict.evidence_level   # "deterministic"
should_promote_dead_end(verdict)            # True only for a deterministic refutation
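
To make the result contract concrete, here is a minimal standalone sketch of the idea behind the quickstart: find a `__RESULT__` line, parse its JSON payload, and compare one metric against a target. It does not import or reproduce the library's implementation. `parse_result_line` and `compare` are hypothetical names, and treating a missing metric or a nonzero exit code as "inconclusive" is an assumption about a sensible rule, not documented behavior.

```python
import json
import operator

# Hypothetical stand-ins for illustration; not the autoresearch_core implementation.
COMPARATORS = {
    ">=": operator.ge, ">": operator.gt,
    "<=": operator.le, "<": operator.lt, "==": operator.eq,
}
PREFIX = "__RESULT__"


def parse_result_line(stdout: str) -> dict:
    """Return the JSON payload of the last `__RESULT__` line, or {} if it is missing or malformed."""
    for line in reversed(stdout.splitlines()):
        line = line.strip()
        if line.startswith(PREFIX):
            try:
                payload = json.loads(line[len(PREFIX):])
            except json.JSONDecodeError:
                return {}
            return payload if isinstance(payload, dict) else {}
    return {}


def compare(metrics: dict, metric_key: str, comparator: str, target: float, exit_code: int = 0) -> str:
    """Assumed rule: a failed run or a missing metric is inconclusive, never refuted."""
    value = metrics.get(metric_key)
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if exit_code != 0 or not is_number:
        return "inconclusive"
    return "supported" if COMPARATORS[comparator](value, target) else "refuted"


stdout = 'epoch 3 done\n__RESULT__ {"recall_at_10": 0.83}\n'
metrics = parse_result_line(stdout)
print(compare(metrics, "recall_at_10", ">=", 0.8))
```

The same inputs always give the same answer, which is what makes a verdict of this kind safe to act on automatically.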

What it owns (and what it doesn't)

Owns — the decision discipline:

  • MetricSpec + the __RESULT__ result contract (parse_metrics_line, validate_metric_spec)
  • DeterministicVerdict / measure (metric vs target → supported / refuted / inconclusive)
  • failure classification (classify_run_failure → H2 / H3 / H4)
  • gates (resolve_gates, check_gate)
  • policy (decide, detect_plateau, should_promote_dead_end)
  • promotion record shapes (DeadEndRecord, KnowhowRecord, approach_hash, should_skip)
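
As an illustration of why a promotion record carries an approach hash, the sketch below shows one way a loop could avoid re-running an approach that has already been refuted. Everything here is hypothetical: the `DeadEnd` fields, `hash_approach`, and `already_refuted` are stand-ins written for this example, and the library's `DeadEndRecord`, `approach_hash`, and `should_skip` may be shaped differently.

```python
import hashlib
import json
from dataclasses import dataclass


# Hypothetical record shape; the library's DeadEndRecord fields may differ.
@dataclass(frozen=True)
class DeadEnd:
    approach_key: str   # stable hash of the approach that was refuted
    metric_key: str
    observed: float
    target: float


def hash_approach(approach: dict) -> str:
    """Hash a JSON-serializable approach description, independent of key order."""
    canonical = json.dumps(approach, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def already_refuted(approach: dict, dead_ends: list[DeadEnd]) -> bool:
    """True if an identical approach is already recorded as a dead end."""
    key = hash_approach(approach)
    return any(d.approach_key == key for d in dead_ends)


tried = {"retriever": "bm25", "top_k": 10}
dead_ends = [DeadEnd(hash_approach(tried), "recall_at_10", 0.61, 0.8)]

# Same approach with reordered keys, then a genuinely different one.
print(already_refuted({"top_k": 10, "retriever": "bm25"}, dead_ends))
print(already_refuted({"retriever": "bm25", "top_k": 20}, dead_ends))
```

Canonicalizing the description before hashing means two descriptions of the same approach collide on purpose, while any parameter change produces a new key.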

Doesn't own — bind these via ports.py Protocols to your own infra: Spawn, Retriever, KnowledgeGraph, ExperimentRunner, Store.
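
Binding your own infrastructure to a port relies on structural typing: any class with matching methods satisfies a `typing.Protocol`, with no inheritance required. The sketch below shows the pattern with a single port. The `retrieve(query, k)` signature is a guess made for illustration; the real signatures live in the library's ports.py and may differ. `InMemoryRetriever` and `gather_context` are hypothetical names.

```python
from typing import Protocol, runtime_checkable


# Hypothetical port shape; check ports.py for the real method names and signatures.
@runtime_checkable
class Retriever(Protocol):
    def retrieve(self, query: str, k: int) -> list[str]: ...


class InMemoryRetriever:
    """Toy adapter over a list of documents; satisfies the protocol structurally."""

    def __init__(self, docs: list[str]) -> None:
        self._docs = docs

    def retrieve(self, query: str, k: int) -> list[str]:
        # Keep documents that contain at least one query term, in original order.
        terms = query.lower().split()
        hits = [d for d in self._docs if any(t in d.lower() for t in terms)]
        return hits[:k]


def gather_context(retriever: Retriever, query: str) -> list[str]:
    """Loop code depends only on the protocol, not on a concrete backend."""
    return retriever.retrieve(query, k=2)


retriever = InMemoryRetriever(["BM25 baseline notes", "dense retrieval sweep", "unrelated memo"])
print(isinstance(retriever, Retriever))
print(gather_context(retriever, "retrieval baseline"))
```

Swapping the toy adapter for a real vector store or search service would leave `gather_context` unchanged, which is the point of keeping I/O behind ports.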

Verdict authority

DeterministicVerdict is the default and the reason this package exists. Other strategies (an LLM judge, an exit-code check) can be plugged in via the VerdictStrategy protocol, but only a deterministic refutation auto-promotes a dead-end — non-deterministic verdicts are advisory. Every verdict records its strategy and evidence_level, so the decision trail stays auditable.
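
The promotion rule described above can be sketched in a few lines. This is an illustration of the policy, not the library's code: `Verdict`, `Strategy`, `ExitCodeStrategy`, and `may_promote_dead_end` are hypothetical names, and the `"advisory"` evidence level is an assumed label for anything that is not deterministic.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical shapes; field and method names are assumptions, not the library's API.
@dataclass(frozen=True)
class Verdict:
    verdict: str          # "supported" | "refuted" | "inconclusive"
    strategy: str         # which strategy produced this verdict
    evidence_level: str   # "deterministic", or a weaker level such as "advisory"


class Strategy(Protocol):
    def judge(self, metrics: dict[str, float], exit_code: int) -> Verdict: ...


class ExitCodeStrategy:
    """Weak-evidence example: the exit status says little about the hypothesis itself."""

    def judge(self, metrics: dict[str, float], exit_code: int) -> Verdict:
        outcome = "supported" if exit_code == 0 else "refuted"
        return Verdict(outcome, strategy="exit_code", evidence_level="advisory")


def may_promote_dead_end(v: Verdict) -> bool:
    """Only a refutation backed by deterministic evidence is promoted automatically."""
    return v.verdict == "refuted" and v.evidence_level == "deterministic"


advisory = ExitCodeStrategy().judge({}, exit_code=1)
deterministic = Verdict("refuted", strategy="metric_threshold", evidence_level="deterministic")

print(may_promote_dead_end(advisory))
print(may_promote_dead_end(deterministic))
```

Because each verdict carries its own `strategy` and `evidence_level`, a reviewer can later see which refutations were promoted and on what grounds.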

Development

pip install -e ".[dev]"
pytest -q --cov=autoresearch_core

License

MIT © Cameleon X — see LICENSE.
