Benchmarking the performance of agents far and wide, regardless of how they are set up and how they work

Project description

Auto-GPT Benchmark

A repository for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

Interface

| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
| --- | --- | --- | --- | --- |
| Write File | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Read File | :x: | :x: | tbd | :x: |
| Search File | :x: | :x: | tbd | :x: |

Code

| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
| --- | --- | --- | --- | --- |
| Debug Simple Typo With Guidance | :x: | :x: | tbd | :x: |
| Debug Simple Typo Without Guidance | :x: | :x: | tbd | :x: |
| Basic Code Generation | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Create Simple Web Server | :x: | :x: | tbd | :x: |

Memory

| Task | Auto-GPT |
| --- | --- |
| Basic Memory | :x: |
| Remember Multiple Ids | :x: |
| Remember Multiple Ids With Noise | :x: |
| Remember Multiple Phrases With Noise | :x: |

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agbenchmark-0.0.2.tar.gz (31.2 kB)

Uploaded Source

Built Distribution

agbenchmark-0.0.2-py3-none-any.whl (81.0 kB)

Uploaded Python 3

File details

Details for the file agbenchmark-0.0.2.tar.gz.

File metadata

  • Download URL: agbenchmark-0.0.2.tar.gz
  • Upload date:
  • Size: 31.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.8.17 Linux/5.15.0-1041-azure

File hashes

Hashes for agbenchmark-0.0.2.tar.gz
Algorithm Hash digest
SHA256 da9291db298ff0ebdd18b818545e84c932d724c3c5c9010b363cc496ceb5e7df
MD5 e0ea98cddcc75971a7caeb3a6c89a654
BLAKE2b-256 d6a59417ee6454a4eeb27abe301a01c22ea12763fd2764c3146d4d682c291b10
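
The digests listed above can be used to check a downloaded copy of the archive. The following is a minimal sketch, not part of agbenchmark: the function names `file_digests` and `verify` and the local file path are hypothetical, and it assumes the BLAKE2b-256 digest is BLAKE2b with a 32-byte output, which is how Python's `hashlib` expresses it.

```python
import hashlib
from pathlib import Path

# Digests copied from the "File hashes" table for agbenchmark-0.0.2.tar.gz.
EXPECTED = {
    "sha256": "da9291db298ff0ebdd18b818545e84c932d724c3c5c9010b363cc496ceb5e7df",
    "md5": "e0ea98cddcc75971a7caeb3a6c89a654",
    "blake2b-256": "d6a59417ee6454a4eeb27abe301a01c22ea12763fd2764c3146d4d682c291b10",
}


def file_digests(path, chunk_size=65536):
    """Return hex digests of the file at `path` for all three algorithms."""
    hashers = {
        "sha256": hashlib.sha256(),
        "md5": hashlib.md5(),
        # Assumption: "BLAKE2b-256" means BLAKE2b with a 32-byte digest.
        "blake2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as f:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def verify(path, expected):
    """Map each algorithm name in `expected` to True if its digest matches."""
    actual = file_digests(path)
    return {name: actual[name] == digest.lower() for name, digest in expected.items()}


if __name__ == "__main__":
    archive = Path("agbenchmark-0.0.2.tar.gz")  # hypothetical local download path
    if archive.exists():
        for name, ok in verify(archive, EXPECTED).items():
            print(f"{name}: {'OK' if ok else 'MISMATCH'}")
    else:
        print(f"{archive} not found; download it first")
```

The same functions should work for the wheel if `EXPECTED` is replaced with the digests listed for agbenchmark-0.0.2-py3-none-any.whl. A SHA256 match alone is normally sufficient; MD5 is included only because the page lists it.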

File details

Details for the file agbenchmark-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: agbenchmark-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 81.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.8.17 Linux/5.15.0-1041-azure

File hashes

Hashes for agbenchmark-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 87959e1e364d77fe325b1f3967232c1cac1f71c909b0af60578fe8a2445dd76f
MD5 4ace191a32846db0b909c4bc8928eb48
BLAKE2b-256 3c8d6e1e38aaa1a14eba74bf4c7b9c174f0ca9a7203b08c428c1a47b42bd4bd3
