halftrace

Diagnostics for agent loops. Measure how fast each capability decays over long trajectories.

Agents fail in four distinct ways as trajectories get longer: they forget state, drift from instructions, repeat tool calls, and terminate prematurely. Each failure mode has its own halftrace, the trajectory length at which the behaviour is half-degraded. Existing eval frameworks measure task success at a single point and miss all four curves. halftrace measures them directly.

The halftrace concept

A halftrace is the trajectory length, in tool calls, at which a given agent capability is degraded by 50% relative to its baseline at low N.

Different capabilities decay at different rates. A model's instruction-following might have a halftrace of 30 tool calls while its state recall has a halftrace of 150, meaning instruction adherence falls off five times faster than memory.
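To make the definition concrete, here is a minimal sketch of how a halftrace could be estimated from a measured decay curve. It is not halftrace's API: the function name `estimate_halftrace`, its parameters, and the choice of linear interpolation between measured points are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def estimate_halftrace(
    lengths: Sequence[int],
    scores: Sequence[float],
    baseline: Optional[float] = None,
) -> Optional[float]:
    """Estimate the trajectory length at which a score falls to 50% of baseline.

    Hypothetical helper, not part of halftrace.
    lengths: trajectory lengths in tool calls, strictly increasing.
    scores: capability score at each length (higher is better).
    baseline: score at low N; defaults to the first measurement.
    Returns None if the score never reaches the half-baseline threshold.
    """
    if not lengths or len(lengths) != len(scores):
        raise ValueError("lengths and scores must be non-empty and equal in size")
    base = scores[0] if baseline is None else baseline
    threshold = 0.5 * base
    if scores[0] <= threshold:
        return float(lengths[0])
    for i in range(1, len(lengths)):
        prev_n, prev_s = lengths[i - 1], scores[i - 1]
        n, s = lengths[i], scores[i]
        if s <= threshold:
            # All earlier scores were above the threshold, so prev_s > s
            # and the linear interpolation below is well-defined.
            frac = (prev_s - threshold) / (prev_s - s)
            return prev_n + frac * (n - prev_n)
    return None


# Illustrative numbers only: the score crosses 0.5 halfway between N=30 and N=50.
print(estimate_halftrace([10, 30, 50], [1.0, 0.75, 0.25]))  # 40.0
```

A return value of `None` means the capability never lost half its baseline within the measured range, which is itself a useful result.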

The library measures one halftrace per probe for each (agent, model, task) combination:

Probe                   What it measures
state_amnesia           Retention of facts introduced earlier in the trajectory
instruction_decay       Adherence to system-prompt rules over time
tool_repetition         Avoidance of re-calling tools with the same arguments
premature_termination   Completing the task before declaring done
narration_substitution  Emitting tool calls rather than describing them
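As an illustration of what a probe like tool_repetition might compute, the sketch below scores a trajectory by the fraction of tool calls that repeat an earlier call with identical arguments. The trajectory representation (a list of `(tool_name, arguments)` pairs) and the function `repetition_rate` are assumptions for this example, not halftrace's actual data model or scoring rule.

```python
import json
from typing import Any, Iterable, Mapping, Tuple


def repetition_rate(calls: Iterable[Tuple[str, Mapping[str, Any]]]) -> float:
    """Fraction of tool calls that repeat an earlier (tool name, arguments) pair.

    Hypothetical helper, not part of halftrace.
    """
    seen = set()
    total = 0
    repeats = 0
    for name, args in calls:
        # Serialise arguments with sorted keys so key order does not affect equality.
        key = (name, json.dumps(args, sort_keys=True, default=str))
        total += 1
        if key in seen:
            repeats += 1
        else:
            seen.add(key)
    return repeats / total if total else 0.0


# Illustrative trajectory: the third call repeats the first one exactly.
trajectory = [
    ("search", {"q": "a"}),
    ("read", {"path": "x"}),
    ("search", {"q": "a"}),
    ("search", {"q": "b"}),
]
print(repetition_rate(trajectory))  # 0.25
```

Computing this rate over windows of increasing trajectory length would give the kind of decay curve from which a halftrace can be read off.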

What this is

A measurement library for agent trajectories. You bring the agent. halftrace instruments the trajectory and tells you which capability decays first.

The library was built to answer one question: when an agent fails on a long-horizon task, which failure mode caused it? The answer matters because the failure modes have completely different fixes (better prompting, better tools, better memory, or a different model), and you can't choose among them without measuring each mode separately.
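Once each probe has a halftrace, finding the capability that decays first is a matter of taking the smallest value. The sketch below assumes the results arrive as a plain mapping from probe name to halftrace; the function `first_to_decay` and that mapping are hypothetical. The values 30 and 150 come from the example above, and the others are made up for illustration.

```python
from typing import Mapping, Optional


def first_to_decay(halftraces: Mapping[str, Optional[float]]) -> Optional[str]:
    """Return the probe with the smallest halftrace.

    Hypothetical helper, not part of halftrace. Probes whose value is None
    (never half-degraded within the measured range) are ignored.
    """
    measured = {name: n for name, n in halftraces.items() if n is not None}
    if not measured:
        return None
    return min(measured, key=measured.get)


results = {
    "state_amnesia": 150,
    "instruction_decay": 30,
    "tool_repetition": None,      # illustrative: never half-degraded
    "premature_termination": 90,  # illustrative value
}
print(first_to_decay(results))  # instruction_decay
```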

What this isn't

  • Not a benchmark. No leaderboard, no canonical task set. You define the tasks.
  • Not an eval framework. If you want grading, scoring rubrics, or production observability, use Inspect, Braintrust, or LangSmith. halftrace is a diagnostic instrument that sits alongside them.
  • Not an agent framework. It doesn't build agents. It measures agents you've already built.

Install

Once 0.1.0 ships:

pip install halftrace

Optional extras:

pip install "halftrace[anthropic]"
pip install "halftrace[openai]"
pip install "halftrace[all]"

Requires Python 3.11+.

License

MIT.
