Opinionated AI model benchmark aggregator — install via cargo: cargo install pondus
pondus
Opinionated AI model benchmark aggregator.
What it does
Aggregates AI model benchmark data from 8 trusted sources into a unified JSON schema. Designed for AI agents (Claude Code, etc.) to consume programmatically. Caches results for 24h to avoid rate limiting.
Sources
| Source | Type | Data |
|---|---|---|
| Artificial Analysis | REST API | Speed, quality, pricing metrics |
| LM Arena (LMSYS) | Community JSON | ELO ratings from human preferences |
| SWE-bench | GitHub JSON | Code generation resolve rates |
| SWE-rebench | agent-browser scrape | Code generation resolve rates (rebench variant) |
| Aider | GitHub YAML | Polyglot coding benchmark pass rates |
| LiveBench | HuggingFace | Multi-domain benchmark scores |
| Terminal-Bench | HuggingFace | Terminal/CLI task completion |
| SEAL | agent-browser scrape | Scale AI evaluation scores |
Note: Sources marked "agent-browser scrape" require the `agent-browser` CLI to be installed separately. All other sources work out of the box.
Installation
cargo install pondus
Usage
pondus rank # rank all models (default command)
pondus # same as `pondus rank`
pondus rank --top 10 # top 10 only
pondus check claude-opus-4.6 # check one model across all sources
pondus compare gpt-5.2 claude-opus-4.6 # head-to-head comparison
pondus sources # show source status
pondus refresh # clear cache and re-fetch
Global Flags
| Flag | Description |
|---|---|
| `--format` | Output format: `json` or `table` |
| `--refresh` | Bypass cache for this run |
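Since the output is meant for programmatic consumers, a wrapper script might call the CLI and parse its JSON. The following is a minimal sketch, not part of pondus: `build_command` and `run_pondus` are hypothetical helper names, and placing the global flags after the subcommand is an assumption about how the CLI parses arguments.

```python
import json
import subprocess


def build_command(subcommand, *args, fmt="json", refresh=False):
    """Assemble an argv list for one pondus invocation.

    Assumption: global flags are accepted after the subcommand.
    """
    cmd = ["pondus", subcommand, *args, "--format", fmt]
    if refresh:
        cmd.append("--refresh")  # bypass the cache for this run
    return cmd


def run_pondus(subcommand, *args, refresh=False):
    """Run pondus and return its parsed JSON output.

    Requires the pondus binary on PATH; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    result = subprocess.run(
        build_command(subcommand, *args, refresh=refresh),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

For example, `run_pondus("rank", "--top", "10")` would execute `pondus rank --top 10 --format json` and hand back the decoded document.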
Configuration
Config location: ~/.config/pondus/config.toml
[cache]
ttl_hours = 24
[alias]
path = "models.toml" # relative to config dir, or absolute path
[sources.artificial_analysis]
api_key = "your-key" # optional, for AA source
[sources.agent_browser]
path = "agent-browser" # path to agent-browser CLI
Model Aliases
Different benchmarks use different naming conventions. models.toml maps canonical model names to source-specific variants:
[claude-opus-4_6]
canonical = "claude-opus-4.6"
aliases = [
"Claude Opus 4.6",
"claude-opus-4-6",
"anthropic/claude-opus-4.6",
"Opus 4.6",
]
When you run pondus check opus-4.6, pondus resolves the alias to the canonical name and matches it across all sources regardless of how each source names the model. PRs welcome to add new models.
Output Format
Default JSON output:
{
  "timestamp": "2026-02-27T10:30:00Z",
  "query": { "query_type": "rank" },
  "sources": [
    {
      "source": "arena",
      "status": "ok",
      "scores": [
        { "model": "gpt-5.2", "rank": 1, "metrics": { "elo": 1350 } }
      ]
    }
  ]
}
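A consumer will often want this document pivoted by model rather than by source. The sketch below shows one way to do that in Python, using only the fields visible in the sample above. `scores_by_model` is a hypothetical helper, and skipping any source whose `status` is not `"ok"` is an assumption, since other status values are not documented here.

```python
import json

# The sample document from above.
SAMPLE = """
{
  "timestamp": "2026-02-27T10:30:00Z",
  "query": { "query_type": "rank" },
  "sources": [
    {
      "source": "arena",
      "status": "ok",
      "scores": [
        { "model": "gpt-5.2", "rank": 1, "metrics": { "elo": 1350 } }
      ]
    }
  ]
}
"""


def scores_by_model(document):
    """Pivot the per-source score lists into {model: {source: values}}."""
    table = {}
    for source in document.get("sources", []):
        if source.get("status") != "ok":
            continue  # assumption: only "ok" sources carry usable scores
        for score in source.get("scores", []):
            per_source = {"rank": score.get("rank"), **score.get("metrics", {})}
            table.setdefault(score["model"], {})[source["source"]] = per_source
    return table


print(scores_by_model(json.loads(SAMPLE)))
```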
Contributing
- Add a model: Add an entry to `models.toml` with canonical name and known aliases
- Add a source: Implement the `Source` trait in `src/sources/`
PRs welcome.
License
MIT
Sister Tools
Part of a family of AI-augmented CLI tools.