Benchmark any LLM on any hardware. CLI for the llm-speed.com flywheel.
llm-speed-web
Source for llm-speed.com — the canonical, crowdsourced source of truth for how fast LLMs actually run, across hosted APIs, consumer GPUs, and prosumer rigs.
Install
One-liners
- pipx (recommended for local-LLM users): pipx install llm-speed
- uv: uv tool install llm-speed
- Homebrew: brew install llm-speed/tap/llm-speed (coming soon)
- Docker: docker run --rm -it llmspeed/llm-speed bench (coming soon)
- npm: npm install -g llm-speed (coming soon)
- Standalone binary: download from Releases (coming soon)
Optional backends
- MLX (Apple Silicon): pip install 'llm-speed[mlx]'
- vLLM (NVIDIA): pip install 'llm-speed[vllm]'
- ExLlamaV2 (NVIDIA): pip install 'llm-speed[exllamav2]'
- llama.cpp and ollama are detected as binaries on PATH; no extras needed.
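PATH-based backend detection amounts to checking for the binaries. A minimal sketch, assuming the common default binary names (llama.cpp ships several; `llama-cli` and `llama-server` are the usual ones) — the helper name is illustrative, not the project's actual API:

```python
import shutil

def detect_path_backends() -> dict:
    """Check PATH for inference backends that need no Python extras.

    Returns {binary_name: absolute_path_or_None}. Binary names here are
    the common defaults, not necessarily the exact list llm-speed probes.
    """
    return {name: shutil.which(name)
            for name in ("llama-cli", "llama-server", "ollama")}

found = {k: v for k, v in detect_path_backends().items() if v}
print(found or "no PATH backends detected")
```

`shutil.which` respects the same lookup rules as your shell, so a backend installed outside PATH (e.g. a local build dir) would need to be added to PATH to be picked up.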
See docs/RELEASE.md for the publishing runbook (release steps + rollback).
Status
Phase 0 (seeding). The CLI and website come next; see docs/ for the brief and plan.
Layout
docs/                              Strategic & design documents
  BRIEF.md                         Project brief — why, who, monetization, phases
  CLI.md                           llm-speed CLI requirements + portability strategy
  DATA_SOURCES.md                  Folklore source inventory + scrape policy
  MARKETING.md                     CLI launch & flywheel marketing strategy
db/
  schema.sql                       SQLite seed schema (mirrors the eventual Postgres prod schema)
  seed.sqlite                      Created on first run
seed/                              The seeding pipeline (Python)
  models.py                        Plain dataclasses (RawDocument, Claim)
  db.py                            Connection + insert helpers
  extractors.py                    Regex-based (model × hardware × backend × tok/s) extractor
  reddit/client.py                 Reddit (PRAW) — r/LocalLLaMA + neighbors, full sweep
  scrapers/hn.py                   Hacker News (Algolia API)
  scrapers/openrouter.py           OpenRouter API
  scrapers/localscore.py           LocalScore HTML + Next.js data
  scrapers/artificial_analysis.py  AA public pages (cross-reference)
  scrapers/github.py               GitHub issues/PRs in inference backends
  scrapers/blogs.py                Curated blog list
  run.py                           Top-level orchestrator
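The extractor's job — pairing free-text comments with tok/s figures — can be sketched with a pared-down Claim and one regex. This is illustrative only: the real extractors.py also matches model, hardware, and backend, and the actual Claim dataclass has more fields:

```python
import re
from dataclasses import dataclass

# Matches figures like "42 tok/s", "12.5 tokens/sec", "30 t/s".
TPS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:tok(?:ens)?|t)\s*/\s*s(?:ec)?",
                    re.IGNORECASE)

@dataclass
class Claim:
    decode_tps: float
    snippet: str  # surrounding text kept for provenance

def extract_tps_claims(text: str) -> list:
    """Pull every tok/s figure out of a comment, keeping a context snippet."""
    claims = []
    for m in TPS_RE.finditer(text):
        start = max(0, m.start() - 40)
        claims.append(Claim(decode_tps=float(m.group(1)),
                            snippet=text[start:m.end()].strip()))
    return claims

print(extract_tps_claims("RTX 4090 + llama.cpp: Qwen3-32B Q4 runs at 38.2 tok/s"))
```

Keeping the snippet alongside the number is what lets a later pass (human or LLM) re-check an ambiguous claim without re-fetching the source.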
Quick start
# 1. Install
python -m venv .venv && source .venv/bin/activate
pip install -e . # uses pyproject.toml
# 2. Set credentials (only what you have available; missing ones get skipped)
export REDDIT_CLIENT_ID=...
export REDDIT_CLIENT_SECRET=...
export REDDIT_USER_AGENT="llm-speed-seeder/0.1 (+https://llm-speed.com)"
# Reddit's API rules want a description + contact info. The project URL is
# the right contact for a project-operated bot — DO NOT put `by u/<handle>`
# here, since Reddit logs the user-agent on every request and that ties
# every scrape back to a personal handle.
export GITHUB_TOKEN=... # optional but strongly recommended
# 3. Smoke test
python -m seed.run --quick --only hn artificial_analysis blogs
# 4. Full sweep
python -m seed.run
# 5. Inspect what landed
python -m seed.run --stats
sqlite3 db/seed.sqlite \
'SELECT model_family, hardware_name, backend, AVG(decode_tps), COUNT(*)
FROM claims
WHERE confidence > 0.5
GROUP BY 1,2,3 ORDER BY 5 DESC LIMIT 30;'
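The "missing ones get skipped" behavior in step 2 might look like this under the hood. The env-var names come from the snippet above; the REQUIRED_ENV mapping and helper are illustrative, not the project's actual code:

```python
import os

# Which env vars each seeder needs before it can run at all.
REQUIRED_ENV = {
    "reddit": ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT"),
    "github": ("GITHUB_TOKEN",),
    "hn": (),  # Algolia's HN API needs no credentials
}

def runnable_seeders(env=os.environ) -> list:
    """Return seeders whose credentials are all present; the rest get skipped."""
    return [name for name, keys in REQUIRED_ENV.items()
            if all(env.get(k) for k in keys)]

print(runnable_seeders())
```

With an empty environment this yields only the credential-free seeders, which is why the smoke test in step 3 restricts itself to hn, artificial_analysis, and blogs.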
Per-seeder usage
Each module is also runnable on its own:
python -m seed.reddit.client --quick # auth smoke test
python -m seed.scrapers.hn --query "Qwen3 Coder tok/s"
python -m seed.scrapers.openrouter --no-endpoints
python -m seed.scrapers.localscore --max-tests 50
python -m seed.scrapers.github --repo ggerganov/llama.cpp
python -m seed.scrapers.blogs --urls-file my_urls.txt
python -m seed.scrapers.artificial_analysis
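Modules runnable via python -m follow the standard entry-point pattern. A minimal sketch of what one seeder's tail might look like — the --quick flag comes from the usage above, while sweep() and its return values are purely illustrative:

```python
import argparse

def sweep(quick: bool = False) -> int:
    """Illustrative stand-in for a seeder's sweep; returns documents fetched."""
    return 5 if quick else 100

def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Run this seeder on its own.")
    parser.add_argument("--quick", action="store_true",
                        help="small smoke-test sweep instead of a full one")
    args = parser.parse_args(argv)
    return sweep(quick=args.quick)

if __name__ == "__main__":  # what makes `python -m seed.scrapers.<name>` work
    print(main())
```

Accepting argv in main() (rather than always reading sys.argv) is what keeps each module testable as a plain function as well as runnable from the shell.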
Local CI (no GitHub Actions)
This project does its CI on the developer's machine — the lint / test /
smoke / API-roundtrip / build pipeline that used to run on every push
to GitHub Actions now lives at scripts/check.sh.
# Full check (~30s — lint, test, smoke, API roundtrip, build)
./scripts/check.sh
# Inner-loop fast pass (~10s — lint + test only)
./scripts/check.sh --quick
# Skip individual phases:
SKIP_BUILD=1 ./scripts/check.sh
SKIP_LINT=1 SKIP_BUILD=1 ./scripts/check.sh
To run it automatically on every git push (skippable per-push with
--no-verify), opt in to the bundled hook once per clone:
git config core.hooksPath .githooks
The pre-push hook runs --quick by default; CHECK_FULL=1 git push
runs the full check.
The previous .github/workflows/ci.yml was deleted; only manual /
release-event workflows remain (release.yml, daily-metrics.yml,
scheduled-seed.yml, reddit-poster.yml). None of them fire on push.
Design notes
- Idempotent. Re-running a seeder is safe; documents are uniqued by (source, source_id).
- Provenance preserved. Every claim links back to its source URL + author + scrape time. Folklore stays distinguishable from CLI-verified canonical results.
- Confidence-scored. Regex extraction caps at ~0.85; structured-API extraction reaches ~0.9; nothing in the seed phase counts as canonical (that's the CLI's job).
- Polite. All scrapers honor source-appropriate rate limits and identify themselves in User-Agent.
- No LLM calls in this phase. Heuristic extraction only. An LLM second-pass for ambiguous cases is a Phase 1 add-on (see docs/CLI.md).
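The idempotency guarantee reduces to a UNIQUE constraint plus INSERT OR IGNORE. A minimal sketch against an in-memory database — column names beyond source/source_id, and the sample row, are assumptions, not the real schema.sql:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE documents (
        id        INTEGER PRIMARY KEY,
        source    TEXT NOT NULL,
        source_id TEXT NOT NULL,
        body      TEXT,
        UNIQUE (source, source_id)  -- re-running a seeder can't duplicate
    )""")

def upsert_document(source: str, source_id: str, body: str) -> None:
    # INSERT OR IGNORE makes re-runs no-ops for already-seen documents.
    conn.execute(
        "INSERT OR IGNORE INTO documents (source, source_id, body) "
        "VALUES (?, ?, ?)",
        (source, source_id, body))

for _ in range(3):  # simulate three identical sweeps
    upsert_document("hn", "item-1234", "…38 tok/s on a 4090…")

print(conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0])  # 1
```

Pushing the dedup into the schema (rather than a Python-side seen-set) means it survives crashes mid-sweep and concurrent seeder runs for free.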
Next milestone
docs/CLI.md — design for the llm-speed benchmark CLI. The seeded folklore is
inventory; the CLI is the actual data-quality moat. Ship CLI in 2–4 weeks; if
adoption fails the kill criterion (see docs/MARKETING.md), stop before building
the website.