
Evaluate whether LM-based SWE-agents can reverse-engineer black-box software systems

ProgramBench

Can Language Models Rebuild Programs From Scratch?

Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior.
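To make the task concrete, here is a minimal sketch of what "reproduces the original program's behavior" could mean for a single invocation: run the reference binary and the rebuilt program with the same arguments and standard input, then compare the exit code and standard output. The helper names (`run_once`, `same_behavior`) and the decision to compare only stdout and the exit code are assumptions made for illustration; this is not ProgramBench's actual evaluation harness.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class Observation(NamedTuple):
    """What we record from one run (hypothetical; not ProgramBench's schema)."""
    returncode: int
    stdout: bytes


def run_once(cmd: Sequence[str], stdin: bytes = b"", timeout: float = 10.0) -> Observation:
    """Run a command with the given stdin and capture its exit code and stdout."""
    proc = subprocess.run(
        list(cmd),
        input=stdin,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # assumption: stderr is ignored in this sketch
        timeout=timeout,
        check=False,  # a non-zero exit code is data here, not an error
    )
    return Observation(proc.returncode, proc.stdout)


def same_behavior(
    reference: Sequence[str],
    candidate: Sequence[str],
    args: Sequence[str] = (),
    stdin: bytes = b"",
) -> bool:
    """True if both commands give the same exit code and stdout for one input."""
    return run_once([*reference, *args], stdin) == run_once([*candidate, *args], stdin)


if __name__ == "__main__":
    # Stand-ins for a reference binary and a from-scratch reimplementation.
    ref = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]
    cand = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(same_behavior(ref, cand, stdin=b"hello\n"))
```

A single matching input says little on its own; a realistic check would repeat this comparison over many argument and input combinations.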

Quickstart

We recommend uv for managing Python environments.

# Run without installing
uvx programbench --help

# Or install into a project
uv pip install programbench

# Or with pip
pip install programbench
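After installing, one quick sanity check is to ask Python's packaging metadata which version of the distribution is present. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `programbench` is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("programbench")
    if found is None:
        print("programbench is not installed in this environment")
    else:
        print(f"programbench {found} is installed")
```

Note that `uvx` runs the tool in an isolated environment of its own, so this check applies to the `uv pip install` and `pip install` paths.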

For development:

git clone https://github.com/facebookresearch/programbench.git
cd programbench
uv sync  # installs editable + dev dependencies

Note: For more details, please refer to the Usage Guide.

Citation

If our work was useful for you, please cite it:

@misc{yang2026programbenchlanguagemodelsrebuild,
    title={ProgramBench: Can Language Models Rebuild Programs From Scratch?},
    author={John Yang and Kilian Lieret and Jeffrey Ma and Parth Thakkar and Dmitrii Pedchenko and Sten Sootla and Emily McMilin and Pengcheng Yin and Rui Hou and Gabriel Synnaeve and Diyi Yang and Ofir Press},
    year={2026},
    eprint={2605.03546},
    archivePrefix={arXiv},
    primaryClass={cs.SE},
    url={https://arxiv.org/abs/2605.03546},
}

License

ProgramBench is licensed under the terms of the license found in LICENSE.
