
A development and evaluation framework for using language models to generate libraries.


Updates

Sep 28, 2024:

To use the agent with the OpenAI o1 models, update the packages by running:

pip install git+https://github.com/wenting-zhao/aider.git


Commit0

Commit0 is a from-scratch AI coding challenge. Can you create a library from commit 0?

The benchmark consists of 57 core Python libraries. The challenge is to rebuild these libraries and pass their unit tests. All libraries have:

  • Significant test coverage
  • Detailed specification and documentation
  • Lint and type checking

Commit0 is an interactive environment that makes it easy to design and test new agents. You can:

  • Efficiently run tests in isolated environments
  • Distribute testing and development across cloud systems
  • Track and log all changes made throughout

To install Commit0, run:

pip install commit0

Commit0 provides several commands to facilitate the process of cloning, building, testing, and evaluating repositories. Here's an overview of the available commands:

Setup

Use commit0 setup [OPTIONS] REPO_SPLIT to clone a repository split. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| repo_split | str | Split of repositories to clone | |
| --dataset-name | str | Name of the Huggingface dataset | wentingzhao/commit0_combined |
| --dataset-split | str | Split of the Huggingface dataset | test |
| --base-dir | str | Base directory to clone repos to | repos/ |
| --commit0-dot-file-path | str | Path where stateful commit0 configs are stored | .commit0.yaml |
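As a minimal sketch of how the options above map onto a command line, the following Python helper assembles the argument list for commit0 setup and emits only the options that differ from the documented defaults. The helper name build_setup_command, the split name my_split, and the directory work/repos are hypothetical placeholders, not part of Commit0; the command is only printed, not executed.

```python
import shlex

# Defaults copied from the Setup options table.
SETUP_DEFAULTS = {
    "--dataset-name": "wentingzhao/commit0_combined",
    "--dataset-split": "test",
    "--base-dir": "repos/",
    "--commit0-dot-file-path": ".commit0.yaml",
}


def build_setup_command(repo_split, **overrides):
    """Return the argv list for `commit0 setup` (hypothetical helper).

    Options are passed as keyword arguments, e.g. base_dir="work/repos".
    Only options that differ from the documented defaults are emitted.
    """
    argv = ["commit0", "setup"]
    for flag, default in SETUP_DEFAULTS.items():
        key = flag.lstrip("-").replace("-", "_")
        value = overrides.pop(key, default)
        if value != default:
            argv += [flag, str(value)]
    if overrides:
        raise TypeError(f"unknown options: {sorted(overrides)}")
    # Usage is `commit0 setup [OPTIONS] REPO_SPLIT`, so the split comes last.
    argv.append(repo_split)
    return argv


# "my_split" and "work/repos" are placeholders for illustration.
cmd = build_setup_command("my_split", base_dir="work/repos")
print(shlex.join(cmd))
```

Emitting only non-default options keeps the printed command short and makes it obvious which settings were changed for a given run.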

Build

Use commit0 build [OPTIONS] to build the Commit0 split chosen in the Setup stage. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| --num-workers | int | Number of workers | 8 |
| --commit0-dot-file-path | str | Path to the commit0 dot file | .commit0.yaml |
| --verbose | int | Verbosity level (1 or 2) | 1 |

Get Tests

Use commit0 get-tests REPO_NAME to get tests for a Commit0 repository.

| Argument | Type | Description | Default |
|---|---|---|---|
| repo_name | str | Name of the repository to get tests for | |

Test

Use commit0 test [OPTIONS] REPO_OR_REPO_PATH [TEST_IDS] to run tests on a Commit0 repository. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| repo_or_repo_path | str | Directory of the repository to test | |
| test_ids | str | Test IDs to run | |
| --branch | str | Branch to test | |
| --backend | str | Backend to use for testing | modal |
| --timeout | int | Timeout for tests in seconds | 1800 |
| --num-cpus | int | Number of CPUs to use | 1 |
| --reference | bool | Test the reference commit | False |
| --coverage | bool | Get coverage information | False |
| --rebuild | bool | Rebuild an image | False |
| --commit0-dot-file-path | str | Path to the commit0 dot file | .commit0.yaml |
| --verbose | int | Verbosity level (1 or 2) | 1 |
| --stdin | bool | Read test names from stdin | False |
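The --stdin option suggests piping test names into commit0 test instead of listing them as TEST_IDS. The sketch below prepares such an invocation without running it. It rests on several assumptions that the table does not confirm: boolean options are written as bare flags, test names are sent one per line, and the names look like pytest node IDs. The helper name, the repository path repos/my_repo, and the branch my_branch are hypothetical.

```python
def build_test_invocation(repo_path, test_ids, branch, backend="modal", timeout=1800):
    """Return (argv, stdin_text) for `commit0 test` fed via stdin (hypothetical helper)."""
    argv = [
        "commit0", "test",
        "--branch", branch,
        "--backend", backend,       # default taken from the options table
        "--timeout", str(timeout),  # seconds, default taken from the options table
        "--stdin",                  # assumption: boolean options are bare flags
        repo_path,
    ]
    # Assumption: one test name per line on stdin.
    stdin_text = "\n".join(test_ids) + "\n"
    return argv, stdin_text


argv, stdin_text = build_test_invocation(
    "repos/my_repo",                      # placeholder repository path
    ["tests/test_a.py::test_one",         # placeholder test IDs
     "tests/test_a.py::test_two"],
    branch="my_branch",                   # placeholder branch name
)
print(" ".join(argv))
print(stdin_text, end="")

# To actually run it, something like the following could be used:
#   import subprocess
#   subprocess.run(argv, input=stdin_text, text=True, check=True)
```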

Evaluate

Use commit0 evaluate [OPTIONS] to evaluate the Commit0 split chosen in the Setup stage. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| --branch | str | Branch to evaluate | |
| --backend | str | Backend to use for evaluation | modal |
| --timeout | int | Timeout for evaluation in seconds | 1800 |
| --num-cpus | int | Number of CPUs to use | 1 |
| --num-workers | int | Number of workers to use | 8 |
| --reference | bool | Evaluate the reference commit | False |
| --coverage | bool | Get coverage information | False |
| --commit0-dot-file-path | str | Path to the commit0 dot file | .commit0.yaml |
| --rebuild | bool | Rebuild images | False |
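For planning purposes, the --timeout and --num-workers defaults give a rough upper bound on how long a full evaluation could take. The sketch below assumes that the timeout applies to each repository separately and that each worker handles one repository at a time; neither is stated in the table, and image build and startup overhead are ignored. The function name is hypothetical.

```python
import math


def worst_case_seconds(num_repos, num_workers=8, timeout=1800):
    """Upper bound on wall-clock time if every repository hits the timeout.

    Assumes a per-repository timeout and one repository per worker at a time.
    """
    if num_repos < 0 or num_workers < 1 or timeout < 0:
        raise ValueError("num_repos and timeout must be >= 0, num_workers >= 1")
    rounds = math.ceil(num_repos / num_workers)
    return rounds * timeout


# 57 libraries with the default 8 workers and 1800 s timeout:
# ceil(57 / 8) = 8 rounds, 8 * 1800 = 14400 s, i.e. 4 hours.
print(worst_case_seconds(57))  # 14400
```

Under the same assumptions, raising --num-workers to 57 or more would bring the bound down to a single timeout period.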

Lint

Use commit0 lint [OPTIONS] REPO_OR_REPO_DIR to lint files in a repository. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| repo_or_repo_dir | str | Directory of the repository to lint | |
| --files | List[Path] | Files to lint (optional) | |
| --commit0-dot-file-path | str | Path to the commit0 dot file | .commit0.yaml |
| --verbose | int | Verbosity level (1 or 2) | 1 |

Save

Use commit0 save [OPTIONS] OWNER BRANCH to save the Commit0 split to GitHub. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| owner | str | Owner of the repository | |
| branch | str | Branch to save | |
| --github-token | str | GitHub token for authentication | |
| --commit0-dot-file-path | str | Path to the commit0 dot file | .commit0.yaml |

Agent

Config

Use agent config [OPTIONS] AGENT_NAME to set up the configuration for an agent. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| agent_name | str | Agent to use; only aider is supported for now. | aider |
| --model-name | str | LLM model to use, check here for all supported models. | claude-3-5-sonnet-20240620 |
| --use-user-prompt | bool | Use a custom prompt instead of the default prompt. | False |
| --user-prompt | str | The prompt sent to the agent. See code for details. | |
| --run-tests | bool | Run tests after code modifications for feedback. Set up docker or modal before running tests; refer to the commit0 docs. | False |
| --max-iteration | int | Maximum number of agent iterations. | 3 |
| --use-repo-info | bool | Include the repository information. | False |
| --max-repo-info-length | int | Maximum length of the repository information to use. | 10000 |
| --use-unit-tests-info | bool | Include the unit tests information. | False |
| --max-unit-tests-info-length | int | Maximum length of the unit tests information to use. | 10000 |
| --use-spec-info | bool | Include the spec information. | False |
| --max-spec-info-length | int | Maximum length of the spec information to use. | 10000 |
| --use-lint-info | bool | Include the lint information. | False |
| --max-lint-info-length | int | Maximum length of the lint information to use. | 10000 |
| --pre-commit-config-path | str | Path to the pre-commit config file. This is needed for running lint. | .pre-commit-config.yaml |
| --agent-config-file | str | Path to write the agent config. | .agent.yaml |
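The --use-*-info and --max-*-info-length options come in pairs: one switches an information source on, the other caps how much of it is used. The sketch below shows one plausible reading of that pairing. It is not the agent's actual code, and it assumes that lengths are counted in characters, which the table does not state. The names InfoSource and assemble_context are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InfoSource:
    enabled: bool = False     # mirrors --use-*-info (default False)
    max_length: int = 10000   # mirrors --max-*-info-length (default 10000)


def assemble_context(sources, texts):
    """Join the enabled info sources, each cut to its own max_length.

    Assumption: the limit counts characters, not tokens.
    """
    parts = []
    for name, source in sources.items():
        if source.enabled and name in texts:
            parts.append(texts[name][: source.max_length])
    return "\n\n".join(parts)


sources = {
    "repo": InfoSource(enabled=True, max_length=20),
    "unit_tests": InfoSource(),        # disabled, as by default
    "spec": InfoSource(enabled=True),  # enabled with the default limit
    "lint": InfoSource(),
}
texts = {
    "repo": "x" * 50,                  # longer than its limit, so it is cut
    "unit_tests": "test body",         # ignored because the source is disabled
    "spec": "short spec",
}
context = assemble_context(sources, texts)
print(len(context))
```

With every source disabled, as in the documented defaults, the assembled context in this sketch is empty.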

Running

Use agent run [OPTIONS] BRANCH to execute an agent on a specific branch. Available options include:

| Argument | Type | Description | Default |
|---|---|---|---|
| branch | str | Branch to run the agent on; you can specify the name of the branch. | |
| --backend | str | Test backend to run the agent on; ignore this option if you are not adding the run_tests option to the agent. | modal |
| --log-dir | str | Log directory to store the logs. | logs/aider |
| --max-parallel-repos | int | Maximum number of repositories for the agent to run in parallel. Runs sequentially if set to 1. | 1 |
| --display-repo-progress-num | int | Number of repos whose progress is displayed while running. | 5 |
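The behavior described for --max-parallel-repos (sequential when set to 1, parallel otherwise) can be illustrated with a small sketch. This is not how the agent is implemented; a thread pool is used here purely for illustration, and run_on_repos, run_one, and the repository names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_repos(repos, run_one, max_parallel_repos=1):
    """Apply run_one to every repo, in order of submission.

    With max_parallel_repos == 1 the repos are processed one after another;
    otherwise up to max_parallel_repos of them are processed at the same time.
    """
    if max_parallel_repos < 1:
        raise ValueError("max_parallel_repos must be at least 1")
    if max_parallel_repos == 1:
        return [run_one(repo) for repo in repos]
    with ThreadPoolExecutor(max_workers=max_parallel_repos) as pool:
        # pool.map yields results in the same order as the input list.
        return list(pool.map(run_one, repos))


def fake_agent(repo):
    """Stand-in for running the agent on one repository."""
    return f"done:{repo}"


repos = ["repo_a", "repo_b", "repo_c"]  # placeholder repository names
print(run_on_repos(repos, fake_agent))
print(run_on_repos(repos, fake_agent, max_parallel_repos=2))
```

Both calls return results in the same order, so downstream logging does not need to care whether the run was sequential or parallel.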
