Benchmarking the performance of agents far and wide, regardless of how they are set up and how they work

Project description

Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up and how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

:warning: These results are constantly evolving at the moment. We will publish an official benchmark result very soon.

Interface

| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
| --- | --- | --- | --- | --- |
| Write File | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Read File | :x: | :x: | tbd | :x: |
| Search File | :x: | :x: | tbd | :x: |

Code

| Task | Auto-GPT | gpt-engineer | mini-agi | smol-developer |
| --- | --- | --- | --- | --- |
| Debug Simple Typo With Guidance | :x: | :x: | tbd | :x: |
| Debug Simple Typo Without Guidance | :x: | :x: | tbd | :x: |
| Basic Code Generation | :x: | :white_check_mark: | tbd | :white_check_mark: |
| Create Simple Web Server | :x: | :x: | tbd | :x: |
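The Interface and Code results above can be tallied per agent. The following is a minimal sketch with the results transcribed by hand from the two tables; the `RESULTS` layout and the `row` and `tally` helpers are illustrative names and not part of agbenchmark. Tasks marked tbd are treated as not yet scored rather than as failures.

```python
AGENTS = ("Auto-GPT", "gpt-engineer", "mini-agi", "smol-developer")


def row(*outcomes):
    """Pair one table row's outcomes with the agent columns (hypothetical helper)."""
    return dict(zip(AGENTS, outcomes))


# Transcribed from the Interface and Code tables.
# True = pass, False = fail, None = "tbd" (not yet scored).
RESULTS = {
    "Write File": row(False, True, None, True),
    "Read File": row(False, False, None, False),
    "Search File": row(False, False, None, False),
    "Debug Simple Typo With Guidance": row(False, False, None, False),
    "Debug Simple Typo Without Guidance": row(False, False, None, False),
    "Basic Code Generation": row(False, True, None, True),
    "Create Simple Web Server": row(False, False, None, False),
}


def tally(results):
    """Return {agent: (passed, scored)}, skipping tasks marked tbd."""
    totals = {agent: (0, 0) for agent in AGENTS}
    for outcomes in results.values():
        for agent, outcome in outcomes.items():
            if outcome is None:
                continue  # tbd: not counted as scored
            passed, scored = totals[agent]
            totals[agent] = (passed + int(outcome), scored + 1)
    return totals


if __name__ == "__main__":
    for agent, (passed, scored) in tally(RESULTS).items():
        print(f"{agent}: {passed}/{scored} scored tasks passed")
```

Because the results are still changing, any such tally reflects only the snapshot shown in the tables.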

Memory

| Task | Auto-GPT |
| --- | --- |
| Basic Memory | :x: |
| Remember Multiple Ids | :x: |
| Remember Multiple Ids With Noise | :x: |
| Remember Multiple Phrases With Noise | :x: |

Download files

Source Distribution

agbenchmark-0.0.1.tar.gz (31.2 kB)

Uploaded Source

Built Distribution

agbenchmark-0.0.1-py3-none-any.whl (81.0 kB)

Uploaded Python 3

File details

Details for the file agbenchmark-0.0.1.tar.gz.

File metadata

  • Download URL: agbenchmark-0.0.1.tar.gz
  • Upload date:
  • Size: 31.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.8.17 Linux/5.15.0-1041-azure

File hashes

Hashes for agbenchmark-0.0.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7b52d3e6ee95a2ff3ec07ebb4aad78431193d70d11f0ed448bc3179761ec0707 |
| MD5 | d0faa111fcdaa6479008393f60343dbf |
| BLAKE2b-256 | bb3e31b94fba5498ba4ac403f031b875b8e2a2eba2f5d18c3b63f81384c35c00 |
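A downloaded file can be checked against the published digest before installing it. The sketch below uses only Python's standard `hashlib` module; the function names and the assumption that the archive sits in the current directory are illustrative, not something the project prescribes.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for agbenchmark-0.0.1.tar.gz.
EXPECTED_SHA256 = "7b52d3e6ee95a2ff3ec07ebb4aad78431193d70d11f0ed448bc3179761ec0707"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("agbenchmark-0.0.1.tar.gz")
    if archive.exists():
        print("hash OK" if matches_published_hash(archive) else "hash MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same check works for the wheel by passing its path and its own SHA256 digest as `expected`.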

File details

Details for the file agbenchmark-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: agbenchmark-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 81.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.8.17 Linux/5.15.0-1041-azure

File hashes

Hashes for agbenchmark-0.0.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 5a1df99a7ef90977b2f9cd46d781384170738953adc20b725fd1a3c72ab179c1 |
| MD5 | 1555ad98df1f0b8a205f9dd4326bb87c |
| BLAKE2b-256 | 4e50127458a4eb43c8a1db0a8928b08e5bb22c0c9fe6cef47e6371a489e840e7 |
