llm-speed-web

Benchmark any LLM on any hardware. CLI for the llm-speed.com flywheel.

Source for llm-speed.com — the canonical, crowdsourced source of truth for how fast LLMs actually run, across hosted APIs, consumer GPUs, and prosumer rigs.

Install

One-liners

  • pipx (recommended for local-LLM users): pipx install llm-speed
  • uv: uv tool install llm-speed
  • Homebrew: brew install llm-speed/tap/llm-speed (coming soon)
  • Docker: docker run --rm -it llmspeed/llm-speed bench (coming soon)
  • npm: npm install -g llm-speed (coming soon)
  • Standalone binary: download from Releases (coming soon)

Optional backends

  • MLX (Apple Silicon): pip install 'llm-speed[mlx]'
  • vLLM (NVIDIA): pip install 'llm-speed[vllm]'
  • ExLlamaV2 (NVIDIA): pip install 'llm-speed[exllamav2]'
  • llama.cpp and Ollama are detected as binaries on PATH; no extras needed (a detection sketch follows this list).
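
Detection here can be as simple as a standard-library PATH probe. A minimal sketch, assuming shutil.which and illustrative binary names (llama-server, llama-cli, ollama); the project's real detection logic may differ:

# Hypothetical sketch of binary-backend detection; names are assumptions.
import shutil

def detect_binary_backends() -> dict[str, str | None]:
    """Map each candidate binary to its resolved PATH entry (None if absent)."""
    return {name: shutil.which(name) for name in ("llama-server", "llama-cli", "ollama")}

if __name__ == "__main__":
    for name, path in detect_binary_backends().items():
        print(f"{name}: {path or 'not found'}")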

See docs/RELEASE.md for the publishing runbook (release steps + rollback).

Status

Phase 0 (seeding). The CLI and website come next; see docs/ for the brief and plan.

Layout

docs/                Strategic & design documents
  BRIEF.md           Project brief — why, who, monetization, phases
  CLI.md             llm-speed CLI requirements + portability strategy
  DATA_SOURCES.md    Folklore source inventory + scrape policy
  MARKETING.md       CLI launch & flywheel marketing strategy

db/
  schema.sql         SQLite seed schema (mirrors the eventual Postgres prod schema)
  seed.sqlite        Created on first run

seed/                The seeding pipeline (Python)
  models.py                        Plain dataclasses (RawDocument, Claim)
  db.py                            Connection + insert helpers
  extractors.py                    Regex-based (model × hardware × backend × tok/s) extractor
  reddit/client.py                 Reddit (PRAW): r/LocalLLaMA + neighbors, full sweep
  scrapers/hn.py                   Hacker News (Algolia API)
  scrapers/openrouter.py           OpenRouter API
  scrapers/localscore.py           LocalScore HTML + Next.js data
  scrapers/artificial_analysis.py  AA public pages (cross-reference)
  scrapers/github.py               GitHub issues/PRs in inference backends
  scrapers/blogs.py                Curated blog list
  run.py                           Top-level orchestrator
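
For orientation, a minimal sketch of the two dataclasses in seed/models.py and the flavor of regex in seed/extractors.py. Every field name and the pattern itself are illustrative assumptions; the real definitions live in the modules listed above.

# Hypothetical sketch; real fields and patterns live in seed/models.py
# and seed/extractors.py.
import re
from dataclasses import dataclass

@dataclass
class RawDocument:
    source: str        # e.g. "hn", "reddit"
    source_id: str     # unique ID within that source
    url: str
    author: str
    text: str

@dataclass
class Claim:
    model_family: str
    hardware_name: str
    backend: str
    decode_tps: float
    confidence: float  # capped below 1.0; seed data is never canonical

# A toy tok/s pattern: matches "38 tok/s", "38.5 tokens/s", "38 t/s".
TPS_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:tok(?:ens)?|t)/s", re.IGNORECASE)

def extract_tps(text: str) -> list[float]:
    return [float(m.group(1)) for m in TPS_RE.finditer(text)]

print(extract_tps("Qwen3 32B on a 4090 with llama.cpp: ~38 tok/s decode"))  # [38.0]

A real extractor also has to recover the model, hardware, and backend around each hit; the toy pattern above only finds the tok/s numbers.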

Quick start

# 1. Install
python -m venv .venv && source .venv/bin/activate
pip install -e .                       # uses pyproject.toml

# 2. Set credentials (set only what you have; seeders whose credentials are missing are skipped)
export REDDIT_CLIENT_ID=...
export REDDIT_CLIENT_SECRET=...
export REDDIT_USER_AGENT="llm-speed-seeder/0.1 (+https://llm-speed.com)"
# Reddit's API rules want a description + contact info. The project URL is
# the right contact for a project-operated bot — DO NOT put `by u/<handle>`
# here, since Reddit logs the user-agent on every request and that ties
# every scrape back to a personal handle.
export GITHUB_TOKEN=...                # optional but strongly recommended

# 3. Smoke test
python -m seed.run --quick --only hn artificial_analysis blogs

# 4. Full sweep
python -m seed.run

# 5. Inspect what landed
python -m seed.run --stats
sqlite3 db/seed.sqlite \
  'SELECT model_family, hardware_name, backend, AVG(decode_tps), COUNT(*)
   FROM claims
   WHERE confidence > 0.5
   GROUP BY 1,2,3 ORDER BY 5 DESC LIMIT 30;'

Per-seeder usage

Each module is also runnable on its own:

python -m seed.reddit.client --quick                      # auth smoke test
python -m seed.scrapers.hn --query "Qwen3 Coder tok/s"
python -m seed.scrapers.openrouter --no-endpoints
python -m seed.scrapers.localscore --max-tests 50
python -m seed.scrapers.github --repo ggerganov/llama.cpp
python -m seed.scrapers.blogs --urls-file my_urls.txt
python -m seed.scrapers.artificial_analysis
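
Each of these follows the standard python -m pattern: the module guards an argparse CLI behind if __name__ == "__main__". A generic sketch (the --query and --quick flags mirror the examples above; everything else is an illustrative assumption, not the modules' actual code):

# Hypothetical per-seeder entry point, e.g. seed/scrapers/hn.py;
# real flags and logic live in the modules themselves.
import argparse

def main() -> None:
    parser = argparse.ArgumentParser(description="Seed from Hacker News (Algolia API)")
    parser.add_argument("--query", default="tok/s", help="Algolia search query")
    parser.add_argument("--quick", action="store_true", help="small smoke-test sweep")
    args = parser.parse_args()
    print(f"would search Algolia for {args.query!r} (quick={args.quick})")

if __name__ == "__main__":
    main()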

Local CI (no GitHub Actions)

This project runs CI on the developer's machine: the lint / test / smoke / API-roundtrip / build pipeline that used to run on every push via GitHub Actions now lives in scripts/check.sh.

# Full check (~30s — lint, test, smoke, API roundtrip, build)
./scripts/check.sh

# Inner-loop fast pass (~10s — lint + test only)
./scripts/check.sh --quick

# Skip individual phases:
SKIP_BUILD=1 ./scripts/check.sh
SKIP_LINT=1 SKIP_BUILD=1 ./scripts/check.sh

To run it automatically on every git push (skippable per-push with --no-verify), opt in to the bundled hook once per clone:

git config core.hooksPath .githooks

The pre-push hook runs --quick by default; CHECK_FULL=1 git push runs the full check.

The previous .github/workflows/ci.yml was deleted; only manual / release-event workflows remain (release.yml, daily-metrics.yml, scheduled-seed.yml, reddit-poster.yml). None of them fire on push.

Design notes

  • Idempotent. Re-running a seeder is safe; documents are uniqued by (source, source_id). See the sketch after this list.
  • Provenance preserved. Every claim links back to its source URL + author + scrape time. Folklore stays distinguishable from CLI-verified canonical results.
  • Confidence-scored. Regex extraction caps at ~0.85; structured-API extraction reaches ~0.9; nothing in the seed phase counts as canonical (that's the CLI's job).
  • Polite. All scrapers honor source-appropriate rate limits and identify themselves in User-Agent.
  • No LLM calls in this phase. Heuristic extraction only. An LLM second-pass for ambiguous cases is a Phase 1 add-on (see docs/CLI.md).
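
The idempotency note above boils down to a UNIQUE constraint plus INSERT OR IGNORE. A minimal sketch with assumed table and column names; the real schema lives in db/schema.sql and the helpers in seed/db.py:

# Hypothetical sketch of the (source, source_id) dedup.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id        INTEGER PRIMARY KEY,
        source    TEXT NOT NULL,
        source_id TEXT NOT NULL,
        url       TEXT,
        UNIQUE (source, source_id)
    )
""")

def insert_document(source: str, source_id: str, url: str) -> None:
    # INSERT OR IGNORE makes re-runs a silent no-op for already-seen documents.
    conn.execute(
        "INSERT OR IGNORE INTO documents (source, source_id, url) VALUES (?, ?, ?)",
        (source, source_id, url),
    )

insert_document("hn", "12345", "https://news.ycombinator.com/item?id=12345")
insert_document("hn", "12345", "https://news.ycombinator.com/item?id=12345")  # ignored
print(conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0])  # 1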

Next milestone

docs/CLI.md holds the design for the llm-speed benchmark CLI. The seeded folklore is inventory; the CLI is the real data-quality moat. Ship the CLI in 2–4 weeks; if adoption falls below the kill criterion (see docs/MARKETING.md), stop before building the website.
