Skip to main content

A development and evaluation framework for using language models to generate libraries.

Project description

Commit0

Commit0 is a from scratch AI coding challenge. Can you create a library from commit 0?

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

  • Significant test coverage
  • Detailed specification and documentation
  • Lint and type checking

Commit0 is an interactive environment that makes it easy to design and test new agents. You can:

  • Efficiently run tests in isolated environments
  • Distribute testing and development across cloud systems
  • Track and log all changes made throughout.

To install Commit0, run:

pip install commit0

Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:

Setup

Use commit0 setup [OPTIONS] REPO_SPLIT to clone a repository split. Available options include:

Argument Type Description Default
repo_split str Split of repositories to clone
--dataset-name str Name of the Huggingface dataset wentingzhao/commit0_combined
--dataset-split str Split of the Huggingface dataset test
--base-dir str Base directory to clone repos to repos/
--commit0-dot-file-path str Storing path for stateful commit0 configs .commit0.yaml

Build

Use commit0 build [OPTIONS] to build the Commit0 split chosen in the Setup stage. Available options include:

Argument Type Description Default
--num-workers int Number of workers 8
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1

Get Tests

Use commit0 get-tests REPO_NAME to get tests for a Commit0 repository.

Argument Type Description Default
repo_name str Name of the repository to get tests for

Test

Use commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS] to run tests on a Commit0 repository. Available options include:

Argument Type Description Default
repo_or_repo_path str Directory of the repository to test
test_ids str Test IDs to run
--branch str Branch to test
--backend str Backend to use for testing modal
--timeout int Timeout for tests in seconds 1800
--num-cpus int Number of CPUs to use 1
--reference bool Test the reference commit False
--coverage bool Get coverage information False
--rebuild bool Rebuild an image False
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1
--stdin bool Read test names from stdin False

Evaluate

Use commit0 evaluate [OPTIONS] to evaluate the Commit0 split chosen in the Setup stage. Available options include:

Argument Type Description Default
--branch str Branch to evaluate
--backend str Backend to use for evaluation modal
--timeout int Timeout for evaluation in seconds 1800
--num-cpus int Number of CPUs to use 1
--num-workers int Number of workers to use 8
--reference bool Evaluate the reference commit False
--coverage bool Get coverage information False
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--rebuild bool Rebuild images False

Lint

Use commit0 lint [OPTIONS] REPO_OR_REPO_DIR to lint files in a repository. Available options include:

Argument Type Description Default
repo_or_repo_dir str Directory of the repository to test
--files List[Path] Files to lint (optional)
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1

Save

Use commit0 save [OPTIONS] OWNER BRANCH to save the Commit0 split to GitHub. Available options include:

Argument Type Description Default
owner str Owner of the repository
branch str Branch to save
--github-token str GitHub token for authentication
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml

Agent

Config

Use agent config [OPTIONS] AGENT_NAME to set up the configuration for an agent. Available options include:

Argument Type Description Default
agent_name str Agent to use, we only support aider for now. aider
--model-name str LLM model to use, check here for all supported models. claude-3-5-sonnet-20240620
--use-user-prompt bool Use a custom prompt instead of the default prompt. False
--user-prompt str The prompt sent to agent. See code for details.
--run-tests bool Run tests after code modifications for feedback. You need to set up docker or modal before running tests, refer to commit0 docs. False
--max-iteration int Maximum number of agent iterations. 3
--use-repo-info bool Include the repository information. False
--max-repo-info-length int Maximum length of the repository information to use. 10000
--use-unit-tests-info bool Include the unit tests information. False
--max-unit-tests-info-length int Maximum length of the unit tests information to use. 10000
--use-spec-info bool Include the spec information. False
--max-spec-info-length int Maximum length of the spec information to use. 10000
--use-lint-info bool Include the lint information. False
--max-lint-info-length int Maximum length of the lint information to use. 10000
--pre-commit-config-path str Path to the pre-commit config file. This is needed for running lint. .pre-commit-config.yaml
--agent-config-file str Path to write the agent config. .agent.yaml

Running

Use agent run [OPTIONS] BRANCH to execute an agent on a specific branch. Available options include:

Argument Type Description Default
branch str Branch to run the agent on, you can specific the name of the branch
--backend str Test backend to run the agent on, ignore this option if you are not adding run_tests option to agent. modal
--log-dir str Log directory to store the logs. logs/aider
--max-parallel-repos int Maximum number of repositories for agent to run in parallel. Running in sequential if set to 1. 1
--display-repo-progress-num int Number of repo progress displayed when running. 5

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

commit0-0.1.4.tar.gz (25.2 MB view details)

Uploaded Source

Built Distribution

commit0-0.1.4-py3-none-any.whl (1.0 MB view details)

Uploaded Python 3

File details

Details for the file commit0-0.1.4.tar.gz.

File metadata

  • Download URL: commit0-0.1.4.tar.gz
  • Upload date:
  • Size: 25.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.16

File hashes

Hashes for commit0-0.1.4.tar.gz
Algorithm Hash digest
SHA256 b4a90970d64ba5682c2b891a814d45ccde547214eff554e523754ff79cdc62c7
MD5 c075d6525f3fd850389ff9dd46b8f173
BLAKE2b-256 1356d1b2193e42482335a3654bce069c9fee5ca1d950b947fe7b7024609e38c6

See more details on using hashes here.

File details

Details for the file commit0-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: commit0-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.16

File hashes

Hashes for commit0-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 bfc40fa207441e2837cfafc06d794f3d40926f41eb303eb78c48a82745b64aa6
MD5 809b22e4f5cd7bf69f96b7b64cd67370
BLAKE2b-256 db042d06fba9eee1ca9f61f121932bd8257f01657721c107e753849d259d258a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page