Scenario-based polyglot database benchmark platform
BenchForge
Research-grade, scenario-based database benchmark platform for DB researchers and engineers.
Compare different DB access stacks — driver, ORM, language — under identical workloads with statistical rigor suitable for academic publication (VLDB, SIGMOD, OSDI) and professional engineering evaluation.
Why BenchForge?
Most database benchmark scripts are one-off, ad-hoc, and produce results that cannot be reproduced or trusted. BenchForge addresses this by providing:
- Statistical rigor — Multi-iteration experiments with bootstrap confidence intervals, not single-run "eyeball" comparisons
- Reproducibility — Seed control, full environment capture, setup/teardown isolation, and versioned result schemas
- Publication quality — Reports designed for academic papers: ECDF plots, CI error bars, booktabs tables, colorblind-safe palette
- Apples-to-apples comparison — Run the exact same workload across different drivers, ORMs, or languages with bench compare
BenchForge is not a distributed load generator, a database provisioning tool, or a replacement for TPC benchmarks. It is a focused tool for comparing database access stacks under controlled conditions.
Key Features
- Multi-iteration experiments with seed control for reproducibility
- HDR histogram (in-house, zero-dep) for O(1) latency recording with configurable precision
- Cross-iteration statistics: mean, stdev, CV, 95% CI (bootstrap)
- Time-series collection in 1-second windows: throughput, errors, latency quantiles
- Publication-quality HTML reports: paper theme (Crimson Pro + Source Sans 3, booktabs tables), ECDF plots, CI error bars, time-series charts, Okabe-Ito colorblind-safe palette
- Environment capture: CPU, memory, OS, Python version, DB server config
- Setup/teardown queries per iteration for run isolation
- Warmup phase excluded from measurement
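The cross-iteration statistics listed above (mean, stdev, CV, 95% bootstrap CI) can be illustrated with a short sketch. This is not BenchForge's implementation: the function name `summarize_iterations`, the percentile-bootstrap method, and the resample count are assumptions made for illustration, and the sketch uses only the standard library rather than numpy so that it runs on its own.

```python
import random
import statistics


def summarize_iterations(values, n_resamples=2000, confidence=0.95, seed=42):
    """Summarize one metric (e.g. p99 latency) measured once per iteration.

    Hypothetical helper: the name and the percentile-bootstrap method are
    assumptions for illustration, not BenchForge's documented API.
    """
    if len(values) < 2:
        raise ValueError("need at least two iterations")
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(values)
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)

    # Percentile bootstrap: resample with replacement and keep each mean.
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = min(int((1 - alpha / 2) * n_resamples), n_resamples - 1)
    return {
        "mean": mean,
        "stdev": stdev,
        "cv": stdev / mean if mean else float("nan"),
        "ci_low": boot_means[lo_idx],
        "ci_high": boot_means[hi_idx],
    }


# Example: p99 latency in milliseconds from five iterations (made-up numbers).
summary = summarize_iterations([10.0, 12.0, 11.0, 13.0, 9.0])
print(summary)
```

With only a handful of iterations the bootstrap interval is coarse, which is one reason to raise the iteration count with -n when the interval is too wide to separate two stacks.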
Installation
From PyPI (recommended)
pip install benchforge
From source (development)
git clone https://github.com/yeongseon/benchforge.git
cd benchforge
pip install -e ".[dev]"
With pipx (isolated install)
pipx install benchforge
Dependencies
BenchForge requires Python 3.10+ and depends on the following packages:
| Package | Purpose |
|---|---|
| psycopg[binary] | PostgreSQL driver (psycopg3) |
| sqlalchemy | SQLAlchemy Core/ORM driver |
| pydantic | Scenario schema validation |
| pyyaml | YAML scenario loading |
| typer + rich | CLI interface |
| jinja2 + plotly | HTML report generation |
| numpy | Bootstrap CI computation |
Quick Start
# 1. Start PostgreSQL
docker compose up -d
# 2. Install BenchForge
pip install -e ".[dev]"
# 3. Run a benchmark (5 iterations, seed=42)
bench run scenarios/basic.yaml -v
# 4. Override iterations/seed from CLI
bench run scenarios/basic.yaml -n 10 --seed 123
# 5. Compare two runs
bench compare reports/run1.json reports/run2.json
# 6. Generate HTML report
bench report reports/run1.json
For a detailed walkthrough, see docs/quickstart.md.
Example Scenarios
BenchForge ships with ready-to-use example scenarios in examples/:
| Scenario | File | Description |
|---|---|---|
| OLTP Point Lookups | oltp_point_lookups.yaml | Single-row SELECT by PK — measures point-query latency and driver overhead |
| Analytical Aggregation | analytical_aggregation.yaml | GROUP BY over 500K rows — full-table scans, aggregation, OLAP-style queries |
| Connection Pool Stress | connection_pool_stress.yaml | 32-worker concurrency stress — connection overhead and latency degradation |
| Mixed Read/Write | mixed_read_write.yaml | Banking-style OLTP — interleaved SELECTs, UPDATEs, and INSERTs |
| Index vs Seq Scan | index_scan_vs_seq_scan.yaml | Selectivity impact on query planner — index scan vs sequential scan paths |
Run any example:
bench run examples/oltp_point_lookups.yaml -v
bench run examples/mixed_read_write.yaml -n 3 --seed 7
Scenario Format
name: basic-select
description: "Basic point SELECT benchmark: psycopg vs SQLAlchemy"

setup:
  queries:
    - "CREATE TABLE IF NOT EXISTS users (id SERIAL PRIMARY KEY, name VARCHAR(100))"
    - "INSERT INTO users (name) SELECT 'user_' || i FROM generate_series(1, 1000) AS i ON CONFLICT DO NOTHING"

teardown:
  queries:
    - "TRUNCATE TABLE users"

steps:
  - name: point-select
    query: "SELECT * FROM users WHERE id = %(id)s"
    params:
      id: "random_int(1, 1000)"

load:
  concurrency: 4
  duration: 10

warmup:
  duration: 3

experiment:
  iterations: 5
  seed: 42
  pause_between: 2.0

targets:
  - name: psycopg-raw
    stack_id: python+psycopg
    driver: psycopg
    dsn: "postgresql://postgres:postgres@localhost:5432/benchflow"
  - name: sqlalchemy-core
    stack_id: python+sqlalchemy
    driver: sqlalchemy
    dsn: "postgresql+psycopg://postgres:postgres@localhost:5432/benchflow"
For the complete DSL specification, see docs/scenario-reference.md.
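To make the interaction between params generators and experiment.seed more concrete, here is a minimal sketch of how a value such as "random_int(1, 1000)" might be resolved into a concrete parameter on each execution. The function `resolve_params`, the regular expression, and the assumption that both bounds are inclusive are hypothetical; the scenario reference is the authority on the actual generator semantics.

```python
import random
import re

# Matches the generator syntax shown in the scenario: random_int(low, high).
_RANDOM_INT = re.compile(r"^random_int\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)$")


def resolve_params(params, rng):
    """Return concrete query parameters for one execution of a step.

    Hypothetical helper, not BenchForge's API. Bounds are assumed inclusive.
    """
    resolved = {}
    for key, spec in params.items():
        match = _RANDOM_INT.match(spec) if isinstance(spec, str) else None
        if match:
            low, high = int(match.group(1)), int(match.group(2))
            resolved[key] = rng.randint(low, high)  # inclusive on both ends
        else:
            resolved[key] = spec  # literal values pass through unchanged
    return resolved


step_params = {"id": "random_int(1, 1000)"}

rng = random.Random(42)  # same seed -> same parameter sequence
first_run = [resolve_params(step_params, rng)["id"] for _ in range(5)]

rng = random.Random(42)
second_run = [resolve_params(step_params, rng)["id"] for _ in range(5)]

assert first_run == second_run  # the seed makes the workload repeatable
```

Seeding a dedicated random.Random instance, rather than the module-level generator, keeps the parameter stream independent of any other code that draws random numbers.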
Architecture
Controller (Python Core)
+-- Scenario Engine      YAML DSL -> Pydantic models + ExperimentConfig
+-- Threaded Runner      barrier-sync, perf_counter_ns, GC control, multi-iteration
+-- HDR Histogram        O(1) record, log-bucket, mergeable across threads
+-- Metrics Aggregator   histogram percentiles, bootstrap CI, cross-iteration stats
+-- Report Generator     publication-quality HTML (paper + dark themes)

Workers (per-thread lifecycle)
+-- PsycopgWorker        raw psycopg3, one connection per thread
+-- SQLAlchemyWorker     SQLAlchemy Core, shared engine, param translation
For a detailed architecture walkthrough, see docs/architecture.md.
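The Threaded Runner's keywords (barrier-sync, perf_counter_ns, GC control) describe a common measurement pattern, sketched below. This is an illustration under assumptions, not the project's runner: `run_workers` and its parameters are hypothetical names, and the task is a stand-in for a real database call.

```python
import gc
import threading
import time


def run_workers(task, concurrency=4, duration_s=1.0):
    """Run `task` in `concurrency` threads; return per-thread latencies in ns.

    Hypothetical sketch of a barrier-synchronized runner, not BenchForge's code.
    """
    barrier = threading.Barrier(concurrency + 1)  # workers + coordinator
    deadline_ns = [0]  # set by the coordinator just before release
    per_thread = [[] for _ in range(concurrency)]

    def worker(idx):
        samples = per_thread[idx]  # each thread owns its list, so no locking
        barrier.wait()  # block until every worker is ready
        while time.perf_counter_ns() < deadline_ns[0]:
            start = time.perf_counter_ns()
            task()
            samples.append(time.perf_counter_ns() - start)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(concurrency)]
    gc_was_enabled = gc.isenabled()
    gc.disable()  # keep collector pauses out of the measured window
    try:
        for t in threads:
            t.start()
        deadline_ns[0] = time.perf_counter_ns() + int(duration_s * 1e9)
        barrier.wait()  # release all workers at once
        for t in threads:
            t.join()
    finally:
        if gc_was_enabled:
            gc.enable()
    return per_thread


# Stand-in workload; a real worker would execute a query on its own connection.
latencies = run_workers(lambda: sum(range(100)), concurrency=2, duration_s=0.2)
print([len(samples) for samples in latencies])
```

The barrier ensures no thread starts timing while others are still being created, and the monotonic nanosecond clock avoids the float rounding and wall-clock adjustments that affect time.time().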
Project Structure
benchflow/
  benchflow/
    core/
      runner/runner.py          # Multi-iteration threaded benchmark execution
      scenario/schema.py        # Pydantic scenario models + ExperimentConfig
      scenario/loader.py        # YAML loading
      metrics/aggregator.py     # Latency stats, bootstrap CI, cross-iteration aggregation
      metrics/histogram.py      # HDR-style log-bucket histogram
      report/html.py            # Publication-quality HTML report generator
      result.py                 # Versioned result JSON schema (v2)
    cli/main.py                 # Typer CLI (run/compare/report)
    workers/
      protocol.py               # Worker ABC + registry
      python/
        psycopg_worker.py
        sqlalchemy_worker.py
  scenarios/basic.yaml
  examples/                     # Ready-to-use benchmark scenarios
  docs/                         # Comprehensive documentation
  tests/
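The "HDR-style log-bucket histogram" in metrics/histogram.py is described only by its properties: O(1) recording, configurable precision, and mergeability across threads. The sketch below shows one way those properties can be obtained. The class `LogBucketHistogram`, its bucket layout, and its default range are assumptions for illustration and may differ from the in-house implementation.

```python
import math


class LogBucketHistogram:
    """Fixed-size histogram with logarithmically spaced buckets.

    Hypothetical sketch; the layout is an assumption, not BenchForge's code.
    Values are latencies in nanoseconds.
    """

    def __init__(self, min_value=1_000, max_value=60_000_000_000,
                 buckets_per_decade=100):
        self.min_value = min_value
        self.scale = buckets_per_decade  # higher means finer precision
        decades = math.log10(max_value / min_value)
        self.counts = [0] * (math.ceil(decades * self.scale) + 1)
        self.total = 0

    def _index(self, value):
        if value <= self.min_value:
            return 0
        idx = int(math.log10(value / self.min_value) * self.scale)
        return min(idx, len(self.counts) - 1)  # clamp overflow to last bucket

    def record(self, value):
        """O(1): one logarithm and one increment, independent of sample count."""
        self.counts[self._index(value)] += 1
        self.total += 1

    def merge(self, other):
        """Add another histogram with the same layout, e.g. from another thread."""
        same_layout = (
            len(other.counts) == len(self.counts)
            and other.min_value == self.min_value
            and other.scale == self.scale
        )
        if not same_layout:
            raise ValueError("incompatible bucket layouts")
        for i, count in enumerate(other.counts):
            self.counts[i] += count
        self.total += other.total

    def percentile(self, p):
        """Return the upper bound of the bucket holding the p-th percentile."""
        if self.total == 0:
            raise ValueError("empty histogram")
        target = max(1, math.ceil(self.total * p / 100.0))
        seen = 0
        for i, count in enumerate(self.counts):
            seen += count
            if seen >= target:
                return self.min_value * 10 ** ((i + 1) / self.scale)
        return self.min_value * 10 ** (len(self.counts) / self.scale)


# Two per-thread histograms merged into one, then queried.
a, b = LogBucketHistogram(), LogBucketHistogram()
for us in range(1, 1001):
    (a if us % 2 else b).record(us * 1_000)  # 1..1000 microseconds, in ns
a.merge(b)
print(a.total, round(a.percentile(50)), round(a.percentile(99)))
```

With 100 buckets per decade each bucket spans a factor of about 1.023, so reported percentiles sit within roughly 2.3% of the true value while memory stays fixed regardless of how many samples are recorded.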
CLI Reference
bench run <scenario.yaml> [OPTIONS]
  -o, --output          Output JSON path
  -n, --iterations      Override iteration count
  --seed                Override random seed
  --capture-db-info     Capture DB server config via introspect()
  -v, --verbose         Enable verbose logging

bench compare <baseline.json> <contender.json> [OPTIONS]
  -o, --output          Output comparison JSON

bench report <result.json> [OPTIONS]
  -o, --output          Output HTML path
CLI Stability Note: The bench run, bench compare, and bench report commands are considered stable as of v0.1.0. Subcommand names and core flags (-o, -n, --seed, -v) will follow semantic versioning — breaking changes only in major versions.
Documentation
| Document | Description |
|---|---|
| Quick Start | Install, run, report, compare — step by step |
| Concepts | Scenarios, steps, targets, workers, iterations, result schema |
| Methodology | Clock sources, HDR histograms, bootstrap CI, time-series |
| Reproducibility | Pre/during/post benchmark checklists, pitfalls |
| Scenario Reference | Complete DSL specification with every field documented |
| Architecture | System overview, components, execution flow, extension points |
Contributing
See CONTRIBUTING.md for development setup, testing, code style, and guidelines for adding scenarios and workers.
Citing BenchForge
If you use BenchForge in your research, please cite it:
@software{choe2026benchflow,
  title  = {BenchForge: Research-Grade Database Benchmark Platform},
  author = {Choe, Yeongseon},
  year   = {2026},
  url    = {https://github.com/yeongseon/benchforge},
}
See CITATION.cff for machine-readable citation metadata.
License
MIT