CLI for authoring and running SWEAP benchmark tasks

SWEAP CLI

Command-line tooling for authoring, validating, and evaluating SWEAP benchmark tasks. Each task is a self-contained bundle containing repository metadata, guardrail tests, and a golden patch that can be reproduced locally or inside Modal sandboxes.

Quick Start

# optional: create a virtual environment
python3 -m venv .venv
source .venv/bin/activate

pip install --upgrade pip
pip install sweap-cli

# scaffold a new task bundle
task init --repo https://github.com/example/project.git --commit deadbeef

# iterate locally until guardrails behave as expected
task validate

# run the Modal evaluation pipeline (baseline + model + patched verification)
task run --model codex

Required Credentials

  • SWEAP_API_URL and SWEAP_API_TOKEN for remote submissions and runs (request an API token from the SWEAP team).
  • OPENAI_API_KEY for Codex access (optional for local runs; mandatory for remote runs processed by our hosted worker).
  • modal CLI credentials (modal setup) if you plan to run Modal evaluations locally.
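A preflight check can catch missing credentials before a remote run is attempted. The sketch below is illustrative only: the variable names come from the list above, while the helper `missing_credentials` and its `remote` flag are hypothetical and not part of the SWEAP CLI.

```python
# Variable names are taken from the credentials list above; the split into
# "remote" and "Codex" groups is this sketch's reading of that list.
REMOTE_VARS = ("SWEAP_API_URL", "SWEAP_API_TOKEN")
CODEX_VARS = ("OPENAI_API_KEY",)


def missing_credentials(env, remote):
    """Return the names of required variables that are unset or empty.

    `env` is any mapping of variable names to values (for example os.environ).
    For local runs nothing is treated as mandatory, because the README
    describes OPENAI_API_KEY as optional in that case.
    """
    if not remote:
        return []
    required = list(REMOTE_VARS) + list(CODEX_VARS)
    return [name for name in required if not env.get(name)]
```

Calling `missing_credentials(os.environ, remote=True)` before `task submit` or a remote `task run` returns an empty list when everything is set, and otherwise names the variables that still need values.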

Add --runner node or --runner maven during task init to scaffold non-Python bundles. Use task validate --modal to reproduce validation inside Modal and task build to cache Modal environments for pytest bundles.
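When scaffolding is scripted, the `task init` invocation can be assembled as an argument list. This is a minimal sketch that uses only the flags shown in this README (`--repo`, `--commit`, `--runner`); the helper name `build_init_command` is hypothetical, and the sketch does not validate runner names because only `node` and `maven` are documented here.

```python
def build_init_command(repo, commit, runner=None):
    """Assemble the argv for `task init` from flags shown in this README."""
    command = ["task", "init", "--repo", repo, "--commit", commit]
    if runner is not None:
        # Only `node` and `maven` appear in the README; other values are
        # passed through unchanged and left for the CLI to accept or reject.
        command += ["--runner", runner]
    return command
```

The resulting list can be handed to `subprocess.run`, which avoids shell quoting problems with repository URLs.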

Core Commands

  • task init – scaffold manifests, guardrail directories, and dependency stubs.
  • task validate – run baseline vs. patched guardrails locally or in Modal.
  • task run – execute the full evaluation loop (baseline, model attempt, patched verification, optional full suite) locally or via the backend.
  • task submit – register/update tasks with the backend and upload bundle archives.
  • task build – prebuild Modal environments for pytest bundles.
  • task info / task fetch-bundle / task runs-get – inspect remote metadata, download bundles, and retrieve run artifacts.

See the CLI reference for detailed options.
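The validate-then-run loop from the Quick Start can be sketched as a small driver that stops at the first failing step. This assumes the CLI signals failure through a non-zero exit code, which this README does not state; `run_steps` and its `execute` parameter are hypothetical names introduced for illustration.

```python
import subprocess

# Commands copied from the Quick Start section.
STEPS = [
    ["task", "validate"],
    ["task", "run", "--model", "codex"],
]


def run_steps(steps, execute=None):
    """Run each command in order and stop after the first non-zero exit code.

    Returns a list of (argv, exit_code) pairs for the steps that were run.
    `execute` can be replaced with a stub so the flow is testable without
    the CLI installed.
    """
    if execute is None:
        def execute(argv):
            return subprocess.run(argv).returncode

    completed = []
    for argv in steps:
        code = execute(argv)
        completed.append((argv, code))
        if code != 0:  # assumption: non-zero means the step failed
            break
    return completed
```

Keeping the command runner injectable means the ordering logic can be checked in isolation, while the default still shells out to the real `task` command.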

Repository Highlights

  • src/sweap_cli/ – CLI entrypoint, runner implementations, Modal orchestration, and backend client.
  • api/ – FastAPI + Supabase service used for remote submissions and runs.
  • project-demo/ – sample task bundle used in integration tests.
  • docs/ – task workflows, architecture notes, and FAQs.

Need Help?

Download files

Source Distribution

sweap_cli-0.1.9.tar.gz (53.0 kB)

Uploaded Source

Built Distribution

sweap_cli-0.1.9-py3-none-any.whl (46.7 kB)

Uploaded Python 3

File details

Details for the file sweap_cli-0.1.9.tar.gz.

File metadata

  • Download URL: sweap_cli-0.1.9.tar.gz
  • Size: 53.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for sweap_cli-0.1.9.tar.gz
Algorithm    Hash digest
SHA256       5d67b2f257742b4ce8ed45f93ac545431fdd18a13c9628cccc96204a3a4af6cb
MD5          f37d2b368a0b4e44a87332fd22efb1c5
BLAKE2b-256  8f886e961f5569cea6515e54a176b97c0f2fd0a8d711d67908fc795e21671129
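
A downloaded archive can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is a generic verification routine, not something shipped with the SWEAP CLI; the function names are hypothetical, and the expected digest is copied from the listing for sweap_cli-0.1.9.tar.gz.

```python
import hashlib

# SHA256 digest published for sweap_cli-0.1.9.tar.gz in the listing above.
EXPECTED_SHA256 = "5d67b2f257742b4ce8ed45f93ac545431fdd18a13c9628cccc96204a3a4af6cb"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For the wheel, pass its own SHA256 value as `expected`. pip can also enforce hashes itself through its hash-checking mode when they are pinned in a requirements file.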

File details

Details for the file sweap_cli-0.1.9-py3-none-any.whl.

File metadata

  • Download URL: sweap_cli-0.1.9-py3-none-any.whl
  • Size: 46.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for sweap_cli-0.1.9-py3-none-any.whl
Algorithm    Hash digest
SHA256       40b7b987fcab3afbbc8f733e59294b3883bf1e4257fe7e9f0748d9a85a20df33
MD5          d6d2b122fcde9aacf3c5b423831b6fbe
BLAKE2b-256  7def08483c2f5b01663958abbd14ee7db7d77042d52b36b64e1c5ba8a7f872c1
