Skip to main content

A development and evaluation framework for using language models to generate libraries.

Project description

Updates

Sep 28, 2024:

If you want to use agent with the OpenAI o1 models, please run these installation commands to update packages pip install git+https://github.com/wenting-zhao/aider.git.


Commit0

Commit0 is a from scratch AI coding challenge. Can you create a library from commit 0?

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

  • Significant test coverage
  • Detailed specification and documentation
  • Lint and type checking

Commit0 is an interactive environment that makes it easy to design and test new agents. You can:

  • Efficiently run tests in isolated environments
  • Distribute testing and development across cloud systems
  • Track and log all changes made throughout.

To install Commit0, run:

pip install commit0

Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:

Setup

Use commit0 setup [OPTIONS] REPO_SPLIT to clone a repository split. Available options include:

Argument Type Description Default
repo_split str Split of repositories to clone
--dataset-name str Name of the Huggingface dataset wentingzhao/commit0_combined
--dataset-split str Split of the Huggingface dataset test
--base-dir str Base directory to clone repos to repos/
--commit0-dot-file-path str Storing path for stateful commit0 configs .commit0.yaml

Build

Use commit0 build [OPTIONS] to build the Commit0 split chosen in the Setup stage. Available options include:

Argument Type Description Default
--num-workers int Number of workers 8
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1

Get Tests

Use commit0 get-tests REPO_NAME to get tests for a Commit0 repository.

Argument Type Description Default
repo_name str Name of the repository to get tests for

Test

Use commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS] to run tests on a Commit0 repository. Available options include:

Argument Type Description Default
repo_or_repo_path str Directory of the repository to test
test_ids str Test IDs to run
--branch str Branch to test
--backend str Backend to use for testing modal
--timeout int Timeout for tests in seconds 1800
--num-cpus int Number of CPUs to use 1
--reference bool Test the reference commit False
--coverage bool Get coverage information False
--rebuild bool Rebuild an image False
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1
--stdin bool Read test names from stdin False

Evaluate

Use commit0 evaluate [OPTIONS] to evaluate the Commit0 split chosen in the Setup stage. Available options include:

Argument Type Description Default
--branch str Branch to evaluate
--backend str Backend to use for evaluation modal
--timeout int Timeout for evaluation in seconds 1800
--num-cpus int Number of CPUs to use 1
--num-workers int Number of workers to use 8
--reference bool Evaluate the reference commit False
--coverage bool Get coverage information False
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--rebuild bool Rebuild images False

Lint

Use commit0 lint [OPTIONS] REPO_OR_REPO_DIR to lint files in a repository. Available options include:

Argument Type Description Default
repo_or_repo_dir str Directory of the repository to test
--files List[Path] Files to lint (optional)
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml
--verbose int Verbosity level (1 or 2) 1

Save

Use commit0 save [OPTIONS] OWNER BRANCH to save the Commit0 split to GitHub. Available options include:

Argument Type Description Default
owner str Owner of the repository
branch str Branch to save
--github-token str GitHub token for authentication
--commit0-dot-file-path str Path to the commit0 dot file .commit0.yaml

Agent

Config

Use agent config [OPTIONS] AGENT_NAME to set up the configuration for an agent. Available options include:

Argument Type Description Default
agent_name str Agent to use, we only support aider for now. aider
--model-name str LLM model to use, check here for all supported models. claude-3-5-sonnet-20240620
--use-user-prompt bool Use a custom prompt instead of the default prompt. False
--user-prompt str The prompt sent to agent. See code for details.
--run-tests bool Run tests after code modifications for feedback. You need to set up docker or modal before running tests, refer to commit0 docs. False
--max-iteration int Maximum number of agent iterations. 3
--use-repo-info bool Include the repository information. False
--max-repo-info-length int Maximum length of the repository information to use. 10000
--use-unit-tests-info bool Include the unit tests information. False
--max-unit-tests-info-length int Maximum length of the unit tests information to use. 10000
--use-spec-info bool Include the spec information. False
--max-spec-info-length int Maximum length of the spec information to use. 10000
--use-lint-info bool Include the lint information. False
--max-lint-info-length int Maximum length of the lint information to use. 10000
--pre-commit-config-path str Path to the pre-commit config file. This is needed for running lint. .pre-commit-config.yaml
--agent-config-file str Path to write the agent config. .agent.yaml

Running

Use agent run [OPTIONS] BRANCH to execute an agent on a specific branch. Available options include:

Argument Type Description Default
branch str Branch to run the agent on, you can specific the name of the branch
--backend str Test backend to run the agent on, ignore this option if you are not adding run_tests option to agent. modal
--log-dir str Log directory to store the logs. logs/aider
--max-parallel-repos int Maximum number of repositories for agent to run in parallel. Running in sequential if set to 1. 1
--display-repo-progress-num int Number of repo progress displayed when running. 5

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

commit0-0.1.5.tar.gz (25.2 MB view details)

Uploaded Source

Built Distribution

commit0-0.1.5-py3-none-any.whl (1.0 MB view details)

Uploaded Python 3

File details

Details for the file commit0-0.1.5.tar.gz.

File metadata

  • Download URL: commit0-0.1.5.tar.gz
  • Upload date:
  • Size: 25.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.16

File hashes

Hashes for commit0-0.1.5.tar.gz
Algorithm Hash digest
SHA256 3726c3d4e5965fbe942905af68914d8fe73c56ab9324baaa174d64778c5a0e74
MD5 2d97ccf23cfca9b6e7f594bd33d2d067
BLAKE2b-256 8b05ff65f06ec31ec9a89bf4d8056376216e7f916986ba0a1fe0d4b2e7812c77

See more details on using hashes here.

File details

Details for the file commit0-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: commit0-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.16

File hashes

Hashes for commit0-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 171037938d419860ebe471bf41dd1b8490dd439dd62179c5ad8c59869c046abe
MD5 26344b373b4e351dbe6c3c26fa06fa1a
BLAKE2b-256 830d5210169ee8645003a42c08c4ce03f6855015552353701c64ef4f200f826c

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page