Python client + typed contract for the BenchHub benchmarking platform
BenchHub
BenchHub is an open-source benchmarking platform: pick a dataset, define metrics in Python, upload predictions, and see how your model ranks. Live at https://runbenchhub.com.
Originally built as a private dTOF SPAD pipeline benchmarking tool, then generalized into a public, multi-tenant web app.
Features
- OAuth sign-in (GitHub): no passwords; one-click account creation.
- Datasets and leaderboards are global, with no project namespace.
- Per-row visibility (`public`/`unlisted`/`private`) on datasets, leaderboards, and metric/visualization library entries.
- HuggingFace import: pull a structured HF dataset repo as a one-click alternative to a ZIP upload (see `scripts/seed_nyu_v2_curated.py` for an example workflow).
- User-defined metrics in Python: bring your own scoring code; the metric engine resolves dependencies and runs them per-sample or aggregated. Sandbox-isolated when `BENCHHUB_SANDBOX_METRICS=1`.
- Asynchronous processing with Celery (Redis broker).
- Per-user quotas: 50 MB storage, 5 datasets, 50 submissions / 24h by default. Free-tier safe to expose to the open internet.
- API tokens for programmatic uploads (`/settings/api_tokens`).
- Account deletion (GDPR right-to-be-forgotten) with cascading cleanup.
- Public landing page at `/`, `/leaderboards` for browsing the catalog, `/u/<id>` for public profile pages.
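To make the default quotas concrete, here is a minimal sketch of how such limits could be checked before accepting an upload. It is not BenchHub's actual enforcement code: the `Usage` class, the function names, and the reading of "50 MB" as binary megabytes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# Default limits quoted in the feature list. All names here are hypothetical.
MAX_STORAGE_BYTES = 50 * 1024 * 1024  # "50 MB", assuming binary megabytes
MAX_DATASETS = 5
MAX_SUBMISSIONS_PER_WINDOW = 50
WINDOW = timedelta(hours=24)


@dataclass
class Usage:
    """Hypothetical snapshot of one user's current resource usage."""
    storage_bytes: int = 0
    dataset_count: int = 0
    submission_times: list = field(default_factory=list)  # aware datetimes


def can_create_dataset(usage):
    """True if the user is still below the dataset-count limit."""
    return usage.dataset_count < MAX_DATASETS


def can_submit(usage, upload_bytes, now=None):
    """Return (allowed, reason) for a new submission of upload_bytes."""
    now = now or datetime.now(timezone.utc)
    if usage.storage_bytes + upload_bytes > MAX_STORAGE_BYTES:
        return False, "storage quota exceeded"
    # Only submissions inside the rolling 24-hour window count.
    recent = [t for t in usage.submission_times if now - t < WINDOW]
    if len(recent) >= MAX_SUBMISSIONS_PER_WINDOW:
        return False, "submission rate limit reached"
    return True, "ok"
```

A rolling window is used here rather than a calendar day; the source only says "50 submissions / 24h", so either reading is possible.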
Prerequisites
- Python 3.10+
- Redis (broker + result backend, default port 6379)
Installation
git clone <repository-url>
cd BenchHub
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
Running
Three terminals:
# 1. Redis
redis-server
# 2. Celery worker
celery -A app.celery worker --loglevel=info
# 3. Flask app
python app.py
Then open http://localhost:6060.
Data lives outside the repo at ~/.dtofbenchmarking/ (database + uploads).
Override with BENCHHUB_DATA_DIR=/some/path.
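As an illustration of the override, the following sketch resolves the data directory from the environment and falls back to the default location. The function name `resolve_data_dir` is hypothetical, and the app's real lookup logic may differ.

```python
import os
from pathlib import Path

# Default location stated above.
DEFAULT_DATA_DIR = Path.home() / ".dtofbenchmarking"


def resolve_data_dir(env=None):
    """Return the data directory, honouring BENCHHUB_DATA_DIR if it is set."""
    env = os.environ if env is None else env
    override = env.get("BENCHHUB_DATA_DIR")
    if override:
        # expanduser() lets values such as "~/benchhub-data" work too.
        return Path(override).expanduser()
    return DEFAULT_DATA_DIR
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.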
Tests
pytest
429 tests, ~3-4 seconds. Coverage gate is configured in pytest.ini.
Dataset / submission ZIP convention
Folders are auto-detected by prefix:
| Prefix | Type | Files |
|---|---|---|
| `metric_` | metric | `<sample>.txt` containing a float |
| `hist_` / `raw_histogram` / `hist` | histogram | `<sample>.npz` (bins, counts) |
| `raw_` | depth/map | `<sample>_<W>x<H>.npz` |
| (anything else) | image / scalar / json / text | by file extension |
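The prefix rules in the table can be sketched as a small classifier. This is an illustrative reading, not BenchHub's detection code: it assumes every entry in the Prefix column is matched with a case-sensitive `startswith`, and that histogram prefixes are checked before `raw_` so that `raw_histogram` is not mistaken for a depth map. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Checked before "raw_" because "raw_histogram" also starts with "raw_".
HISTOGRAM_PREFIXES = ("hist_", "raw_histogram", "hist")

# <sample>_<W>x<H>.npz, e.g. "scene_01_640x480.npz"
DEPTH_NAME = re.compile(r"^(?P<sample>.+)_(?P<w>\d+)x(?P<h>\d+)\.npz$")


def classify_folder(folder: str) -> Optional[str]:
    """Map a folder name to its type, or None to fall back to file extension."""
    if folder.startswith("metric_"):
        return "metric"
    if folder.startswith(HISTOGRAM_PREFIXES):
        return "histogram"
    if folder.startswith("raw_"):
        return "depth/map"
    return None


def parse_depth_filename(filename: str) -> Optional[Tuple[str, int, int]]:
    """Split a depth/map file name into (sample, width, height)."""
    match = DEPTH_NAME.match(filename)
    if match is None:
        return None
    return match["sample"], int(match["w"]), int(match["h"])
```

For example, `classify_folder("raw_depth")` returns `"depth/map"`, while `classify_folder("images")` returns `None` and leaves the decision to the file extension.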
git_info.json (or git.info) at the ZIP root attaches commit metadata
to the resulting dataset/submission row.
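A submission archive following this convention could be assembled roughly as below. This is a sketch under assumptions: the helper name `build_submission_zip` is hypothetical, and the keys inside `git_info.json` (such as `commit`) are illustrative, since the source does not specify its schema.

```python
import io
import json
import zipfile


def build_submission_zip(metrics, git_info):
    """Build an in-memory ZIP with metric_<name>/ folders and git_info.json.

    metrics:  {metric_name: {sample_name: float_value}}
    git_info: dict written to git_info.json at the ZIP root
              (key names are an assumption, not a documented schema)
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Commit metadata goes at the archive root.
        zf.writestr("git_info.json", json.dumps(git_info))
        for metric_name, per_sample in metrics.items():
            for sample, value in per_sample.items():
                # One <sample>.txt per sample, containing a single float.
                zf.writestr(f"metric_{metric_name}/{sample}.txt", f"{float(value)}\n")
    return buf.getvalue()
```

The returned bytes can be written to disk or uploaded directly, for example `open("submission.zip", "wb").write(build_submission_zip(...))`.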
DLP-safe code uploads
Some networks block .py uploads. The metric editor encodes user code
as BASE64:<...> client-side; the server decodes. Standalone helpers:
- `scripts/obfuscator.html`: portable browser tool
- `scripts/obfuscator_gui.py`: Tkinter GUI
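The encoding round trip can be sketched in a few lines. This assumes the payload is standard Base64 of the UTF-8 source with a literal `BASE64:` prefix, and that the angle brackets in `BASE64:<...>` are a placeholder rather than part of the format; the function names are hypothetical and the server's actual decoder may differ.

```python
import base64

PREFIX = "BASE64:"


def encode_code(source: str) -> str:
    """Wrap Python source as a BASE64:-prefixed ASCII string."""
    encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")
    return PREFIX + encoded


def decode_code(payload: str) -> str:
    """Recover the source; text without the prefix is returned unchanged."""
    if not payload.startswith(PREFIX):
        return payload
    raw = base64.b64decode(payload[len(PREFIX):], validate=True)
    return raw.decode("utf-8")
```

Using `validate=True` makes malformed payloads raise an error instead of being silently truncated, which is usually the safer choice for uploaded code.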
Deployment
The production app is self-hosted on a home Ubuntu 24.04 box (RTX 5090, 128 GB RAM, 8 TB) reachable at https://runbenchhub.com. gunicorn + celery
+ redis run directly under systemd; nginx + certbot terminate TLS; the
domain is on Cloudflare in DNS-only mode (no proxy) with
ddclient keeping the A record pointed at the home WAN IP.
Operational runbook: docs/SELFHOST_RUNBOOK.md
— code-push procedure, .env keys, log tailing, DDNS, TLS renewal,
rollback, and the breakages we've already hit.
Fly.io is deprecated: the app was destroyed after the cutover to the home
box. The Fly artifacts (fly.toml, Dockerfile, DEPLOY.md, …) are
archived under archive/fly/ in case a Fly deployment ever needs to be
reconstructed.
License
(Choose and add a license file — repository currently has no LICENSE.)