A framework for running coding agents in sandboxed environments and keeping full trajectories for analysis.

pier

Pier is a Harbor-compatible framework for evaluating coding agents in sandboxed environments. It reads Harbor's task format and runs trials against it.

pier run -p path/to/task --agent claude-code --env modal

Why pier

Pier is a fork of Harbor: we wanted a smaller, more opinionated base to build on. On top of Harbor, Pier adds:

Installed agents in air-gapped tasks (allow_internet = false). When the agent runs inside the sandbox (Claude Code, Codex, etc.), both the install step and the inference call need the network. Pier lets agents declare their install scripts and a network allowlist, which docker and modal environments honor when setting up the sandbox.
Augmented ATIF v1.7: strictly one step per API turn, strict separation of reasoning from agent messages, no fabricated assistant text, real upstream timestamps, and the fields peak_context_tokens, summarization_count, and llm_call_count.
A chat-style trajectory viewer (pier view).
pier critique run for inspecting completed trials with a fresh agent in a fresh sandbox.

What works today

Task format: Harbor-compatible.
Environments: docker, modal. Per-agent install specs and network allowlists are honored on both, so installed agents work under allow_internet = false.
Agents: nop, oracle, claude-code, codex, gemini-cli, opencode, mini-swe-agent. All emit augmented ATIF v1.7.
Datasets: local Harbor-format task directories via -p / --path.
CLI: pier run, pier job, pier view, pier critique run, pier check / pier analyze (vendored from Harbor).

Pier does not currently resolve or download Harbor registry datasets directly.

Install

uv tool install datacurve-pier
# or
pip install datacurve-pier

Run

export ANTHROPIC_API_KEY=...
pier run -p path/to/task --agent claude-code --env modal --env-file .env

Run a local dataset, or a deterministic random subset of it:

pier run -p path/to/dataset --agent claude-code --env modal
pier run -p path/to/dataset --n-tasks 10 --sample-seed 0
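
To illustrate what a deterministic random subset means in practice, here is a minimal Python sketch. It is not Pier's implementation: the function name sample_tasks and the assumption that each task is a subdirectory of the dataset directory are hypothetical. The point is only that sorting first and drawing from a seeded generator gives the same subset on every run.

```python
import random
from pathlib import Path


def sample_tasks(dataset_dir, n_tasks, seed):
    """Pick n_tasks task directories reproducibly (illustrative, not Pier's code)."""
    # Sort first so the result does not depend on filesystem listing order.
    tasks = sorted(p for p in Path(dataset_dir).iterdir() if p.is_dir())
    if n_tasks >= len(tasks):
        return tasks
    # A private Random instance keeps the draw independent of global RNG state.
    return random.Random(seed).sample(tasks, n_tasks)
```

Calling sample_tasks("path/to/dataset", 10, 0) twice returns the same ten directories, which is the property --sample-seed is meant to provide.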

To use a Harbor registry dataset, download it with Harbor first, then point Pier at it:

uv run --directory ~/code/harbor harbor download swebenchpro -o ~/code/pier/datasets
uv run pier run -p datasets/swebenchpro --n-tasks 10 --sample-seed 0

Trials land under jobs/<timestamp_or_name>/<trial_id>/. See pier run --help, pier job --help, pier critique --help, and pier view --help for everything else.
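
For post-processing, the jobs/<timestamp_or_name>/<trial_id>/ layout can be walked with a few lines of Python. This is a sketch that assumes only the two directory levels named above; list_trials is a hypothetical helper, and nothing is assumed about the files inside each trial directory.

```python
from pathlib import Path


def list_trials(jobs_root="jobs"):
    """Map each job directory name to the sorted names of its trial directories."""
    root = Path(jobs_root)
    if not root.is_dir():
        return {}
    return {
        job.name: sorted(t.name for t in job.iterdir() if t.is_dir())
        for job in sorted(root.iterdir())
        if job.is_dir()
    }
```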

Agent runtime configuration

Use agent.model_name for trial metadata, agent.env for runtime env vars, and agent-specific kwargs for tool config. Pier's network allowlist also reads URLs out of those configs (Codex config_toml, OpenCode opencode_config, mini-swe config_yaml), so any base URL you set is allowlisted without code changes.

A few things we've learned plumbing this through Respan and OpenRouter:

Claude Code routes through Respan's Anthropic-compatible endpoint. Plan mode is disabled by default (--disallowedTools EnterPlanMode).

- name: claude-code
  model_name: claude-opus-4-7
  env:
    ANTHROPIC_AUTH_TOKEN: ${RESPAN_API_KEY}
    ANTHROPIC_BASE_URL: https://endpoint.respan.ai/api/anthropic
    ANTHROPIC_CUSTOM_HEADERS: "X-Respan-Route-Provider: vertex_ai"
  kwargs:
    reasoning_effort: max

Codex needs a [model_providers.<name>] block with wire_api = "responses" (not WebSockets, which Codex defaults to and Respan doesn't speak).

- name: codex
  model_name: openai/gpt-5.5
  env:
    RESPAN_API_KEY: ${RESPAN_API_KEY}
  kwargs:
    config_toml: |
      model_provider = "respan"
      [model_providers.respan]
      name = "Respan Gateway"
      base_url = "https://endpoint.respan.ai/api/"
      wire_api = "responses"
      env_key = "RESPAN_API_KEY"
    reasoning_effort: xhigh

Gemini CLI:

- name: gemini-cli
  model_name: gemini/gemini-3.1-pro-preview
  env:
    GEMINI_API_KEY: ${RESPAN_API_KEY}
    GOOGLE_GENERATIVE_AI_API_KEY: ${RESPAN_API_KEY}
    GEMINI_API_BASE: https://endpoint.respan.ai/api/google/vertexai/v1beta
    GOOGLE_GEMINI_BASE_URL: https://endpoint.respan.ai/api/google/vertexai/

OpenCode uses opencode_config to add unknown providers or override known ones. To redirect Google to Respan, override just options.baseURL; to add a fully custom provider, use opencode_config.provider.<name> with the npm package, options, and models.

mini-swe-agent picks a native adapter from the model-name prefix: openai/... → litellm_response (OpenAI Responses end-to-end), openrouter/... → openrouter (BYOK costs from cost_details.upstream_inference_cost), everything else → LiteLLM auto.
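
The prefix rule above amounts to a small dispatch function. The sketch below is illustrative only: pick_adapter is a hypothetical name, and the fallback label "litellm_auto" is a placeholder for "LiteLLM auto", not an identifier taken from Pier or mini-swe-agent.

```python
def pick_adapter(model_name):
    """Choose an adapter label from the model-name prefix (illustrative sketch)."""
    if model_name.startswith("openai/"):
        return "litellm_response"  # OpenAI Responses end-to-end
    if model_name.startswith("openrouter/"):
        return "openrouter"  # BYOK costs come from the upstream cost field
    return "litellm_auto"  # placeholder label for the LiteLLM auto fallback


for name in ("openai/gpt-5.5", "openrouter/qwen/qwen3.6-plus", "gemini/gemini-3.1-pro-preview"):
    print(name, "->", pick_adapter(name))
```

Note that only the leading prefix matters: openrouter/qwen/qwen3.6-plus goes to the openrouter adapter even though the rest of the name contains another provider segment.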

For Gemini 3 via mini-swe-agent/LiteLLM, omitting reasoning_effort leaves the Gemini API at its default high/dynamic thinking level but does not request readable thought summaries. Set kwargs.reasoning_effort: high explicitly when you want LiteLLM to send includeThoughts and preserve the returned summaries as reasoning content.

- name: mini-swe-agent
  model_name: openrouter/qwen/qwen3.6-plus
  env:
    OPENROUTER_API_KEY: ${OPENROUTER_API_KEY}
  kwargs:
    set_cache_control: default_end

Release history

0.2.0 (May 20, 2026)

0.1.0 (May 20, 2026, this version)

Download files

Source Distribution

datacurve_pier-0.1.0.tar.gz (753.5 kB)

Uploaded May 20, 2026 Source

Built Distribution

datacurve_pier-0.1.0-py3-none-any.whl (824.9 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file datacurve_pier-0.1.0.tar.gz.

File metadata

Download URL: datacurve_pier-0.1.0.tar.gz
Upload date: May 20, 2026
Size: 753.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.27 (uv publish, Ubuntu 24.04 noble)

File hashes

Hashes for datacurve_pier-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fecad2e5a4723caecca5c7ce36695f3daee59897b4b3f733d632a0d1dceeac6a`
MD5	`6eab7d8ecd9278d5ad766da849ee82a6`
BLAKE2b-256	`23609fc6bef0517875c5eda5779783b88217e76a6c532f65256f4d75f6d18859`

File details

Details for the file datacurve_pier-0.1.0-py3-none-any.whl.

File metadata

Download URL: datacurve_pier-0.1.0-py3-none-any.whl
Upload date: May 20, 2026
Size: 824.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.27 (uv publish, Ubuntu 24.04 noble)

File hashes

Hashes for datacurve_pier-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`117f4687d02343d88a328faae81e12ad5525e821d33ad559094fa97eee8f237d`
MD5	`4026bc13f5d067008c304f045c729e3b`
BLAKE2b-256	`71b529d388006c99c27ba68e2cc78bcb51a887927469c2ac6766b2f19a96a44a`
