Evaluate whether LM-based SWE-agents can reverse-engineer black-box software systems

ProgramBench

Can Language Models Rebuild Programs From Scratch?

Given only a compiled binary and its documentation, AI agents must architect and implement a complete codebase that reproduces the original program's behavior.
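To make "reproduces the original program's behavior" concrete, here is a minimal sketch of a behavioral comparison. It is not ProgramBench's actual evaluation harness: the helper names (`run`, `same_behavior`), the choice to compare only stdout and exit code, and the two Python one-liners standing in for a reference binary and a rebuilt candidate are all assumptions made for illustration.

```python
import subprocess
import sys
from typing import NamedTuple, Sequence


class RunResult(NamedTuple):
    stdout: bytes
    returncode: int


def run(cmd: Sequence[str], stdin: bytes = b"", timeout: float = 10.0) -> RunResult:
    """Run a command and capture the behavior we compare on (stdout, exit code)."""
    proc = subprocess.run(
        list(cmd),
        input=stdin,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return RunResult(proc.stdout, proc.returncode)


def same_behavior(reference_cmd, candidate_cmd, cases) -> bool:
    """Return True if both commands agree on every (args, stdin) test case."""
    for args, stdin in cases:
        expected = run([*reference_cmd, *args], stdin)
        actual = run([*candidate_cmd, *args], stdin)
        if expected != actual:
            return False
    return True


# Hypothetical stand-ins: both "programs" uppercase whatever arrives on stdin.
reference = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
candidate = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]
cases = [([], b"hello\n"), ([], b"")]

print(same_behavior(reference, candidate, cases))
```

In practice the reference command would point at the compiled binary and the candidate at the agent's rebuilt program; a real harness would likely also need to compare stderr, created files, and other side effects.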

Quickstart

We recommend uv for managing Python environments.

# Run without installing
uvx programbench --help

# Or install into a project
uv pip install programbench

# Or with pip
pip install programbench

For development:

git clone https://github.com/facebookresearch/programbench.git
cd programbench
uv sync  # installs editable + dev dependencies

Note: For more details, refer to the Usage Guide.

Citation

If our work was useful for you, please cite it:

@misc{yang2026programbenchlanguagemodelsrebuild,
    title={ProgramBench: Can Language Models Rebuild Programs From Scratch?},
    author={John Yang and Kilian Lieret and Jeffrey Ma and Parth Thakkar and Dmitrii Pedchenko and Sten Sootla and Emily McMilin and Pengcheng Yin and Rui Hou and Gabriel Synnaeve and Diyi Yang and Ofir Press},
    year={2026},
    eprint={2605.03546},
    archivePrefix={arXiv},
    primaryClass={cs.SE},
    url={https://arxiv.org/abs/2605.03546},
}

License

ProgramBench is licensed under the terms of the license found in LICENSE.

Download files

Source Distribution

programbench-1.0.2.tar.gz (3.3 MB)

Uploaded Source

Built Distribution

programbench-1.0.2-py3-none-any.whl (3.4 MB)

Uploaded Python 3

File details

Details for the file programbench-1.0.2.tar.gz.

File metadata

  • Download URL: programbench-1.0.2.tar.gz
  • Upload date:
  • Size: 3.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for programbench-1.0.2.tar.gz

  • SHA256: ce1d80e8e0a2f012b90b7b415cf4900f3062329df0c9b603bc92f686bdea163d
  • MD5: d69fc78d0c1c0daa377ba0d43ad7c772
  • BLAKE2b-256: f615092d5771877c8d7806890997faacee0aa07fd9bc41ed20294bc40b0e524a
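
A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch: the helper names (`sha256_of`, `matches`) are hypothetical, and it assumes the source distribution has been saved locally under its original file name.

```python
import hashlib

# SHA256 digest listed above for programbench-1.0.2.tar.gz.
EXPECTED_SDIST_SHA256 = "ce1d80e8e0a2f012b90b7b415cf4900f3062329df0c9b603bc92f686bdea163d"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("programbench-1.0.2.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact download in the current directory. The same approach works for the wheel with its own digest.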

File details

Details for the file programbench-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: programbench-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 3.4 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for programbench-1.0.2-py3-none-any.whl

  • SHA256: 32acee06a0b56812539b543efa73053be55695f961e180c6aacac1d3204a9ff5
  • MD5: e14e896e5d896b876da1bcc726e7c986
  • BLAKE2b-256: 64bf99a93cb295f34d638635967b112559323bc1fa253fc510ceede41ebe1353
