A tool for running TPC-H benchmarks and analyzing results.

tpch_runner: TPC-H Benchmark Tool

tpch_runner is a database-agnostic tool for running and analyzing the TPC-H benchmark across multiple databases. It handles setup, execution, and result analysis, making TPC-H testing of database systems much more efficient.

Features

  • CLI-driven TPC-H benchmarking:
    • Manage database connections.
    • Generate and load TPC-H test data.
    • Prepare databases (e.g., table creation, optimization, and data reloading).
    • Run individual queries or full TPC-H Powertests.
  • Comprehensive Result Analysis:
    • Manage, validate, and compare benchmark results.
    • Generate charts to visualize Powertest results.
    • Bundle a small dataset for verifying database setup and TPC-H compliance.
  • Multi-database support:
    • Built-in support for MySQL, PostgreSQL, DuckDB, and RapidsDB.
    • Open architecture to easily integrate additional databases.

Installation

Getting started with tpch_runner is quick and simple. Just clone the repository and install it in editable mode:

git clone https://github.com/your-repo/tpch_runner.git
cd tpch_runner
pip install -e .

Important Notes

  • Test Data Generation: tpch_runner supports TPC-H test data generation but does not include dbgen or qgen. You need to manually add the following files (the compiled dbgen and qgen binaries, plus the dists.dss distribution file) to the tpch_runner/tpch/tool directory:
    • dbgen
    • qgen
    • dists.dss
  • Line Delimiters:
    • The official TPC-H dbgen ends every row with a trailing | before the newline (|\n), which some databases (e.g., PostgreSQL) may not accept when loading data. You can either strip the trailing delimiters from the generated files or use a TPC-H dbgen variant that does not emit them.
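
If you choose to strip the trailing delimiters yourself, a small script is usually enough. The following is a minimal sketch, not part of tpch_runner: the function names `strip_trailing_delimiter` and `clean_tbl_file` are hypothetical, and it assumes the generated files are plain text with one row per line.

```python
from pathlib import Path


def strip_trailing_delimiter(line: str, delimiter: str = "|") -> str:
    """Remove a single trailing delimiter that sits just before the line ending."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # keep the original line ending untouched
    if delimiter and body.endswith(delimiter):
        body = body[: -len(delimiter)]
    return body + ending


def clean_tbl_file(src: Path, dst: Path, delimiter: str = "|") -> int:
    """Copy src to dst without trailing delimiters; return the number of lines written.

    Hypothetical helper: tpch_runner does not ship this function.
    """
    count = 0
    # newline="" preserves the line endings exactly as dbgen wrote them
    with src.open("r", encoding="utf-8", newline="") as fin, \
            dst.open("w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(strip_trailing_delimiter(line, delimiter))
            count += 1
    return count
```

For example, `clean_tbl_file(Path("nation.tbl"), Path("nation.clean.tbl"))` writes a copy of the file in which each row no longer ends with `|`. Only the final delimiter on each line is removed, so empty trailing fields inside the data are left alone.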

Getting Started

To use tpch_runner, simply run the runner CLI tool. Use -h or --help for detailed help on any command:

$ runner -h
Usage: runner [OPTIONS] COMMAND [ARGS]...

Typical Benchmark Workflow

  1. Set up a database connection.
  2. Prepare the TPC-H database (create tables, generate and load data, optimize).
  3. Run individual queries or a full TPC-H Powertest.
  4. Analyze benchmark results.
  5. Compare results across different runs or databases.
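
Steps 2 and 3 can be scripted by chaining the CLI calls. The sketch below is an illustration only: the helper names `workflow_commands` and `run_workflow` are hypothetical, the subcommands and flags are copied from the example commands in this README, and it assumes that `-m` passes the field delimiter to the loader and that the connection alias already exists. Check `runner -h` before relying on any of this.

```python
import shlex
import subprocess


def workflow_commands(alias: str, delimiter: str = ",") -> list[list[str]]:
    """Build the CLI calls for table creation, data load, and a Powertest.

    Flags mirror the README examples; the meaning of -m is an assumption.
    """
    return [
        ["runner", "db", "create", "-a", alias],
        ["runner", "db", "load", "-a", alias, "-m", delimiter],
        ["runner", "run", "powertest", "-a", alias],
    ]


def run_workflow(alias: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in workflow_commands(alias):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the workflow at the first failing step
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_workflow("duck")  # dry run: only prints the commands
```

Keeping the command list separate from the execution step makes it easy to review what would run before touching a database.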

Example Commands

  • Add a Database Connection:
$ runner db add -H localhost -t mysql -u root -W -d tpch -a mysql2 -p 3306
Enter database password:
[INFO] Added database connection.
  • Create Tables:
runner db create -a my1
  • Load Data:
runner db load -a duck -m ','
  • Run a Single Query:
runner run query -a duck 15 --no-report
  • Run a TPC-H Powertest:
runner run powertest -a duck
  • Result Analysis:
$ runner power list
+------+----------+---------------------+-----------+---------------+---------+
|   ID | DB       | Date                | Success   |   Runtime (s) | Scale   |
|------+----------+---------------------+-----------+---------------+---------|
|    2 | mysql    | 2025-01-19 21:48:23 | True      |        0.0492 | small   |
|   10 | rapidsdb | 2025-01-27 19:44:22 | True      |        5.5694 | small   |
|   17 | duckdb   | 2025-01-30 20:21:50 | True      |        0.8701 | small   |
|   20 | pg       | 2025-01-30 23:53:07 | True      |       14.8139 | 1       |
+------+----------+---------------------+-----------+---------------+---------+
  • Validate Test Results:
runner power validate 18
  • Compare Two Test Results:
runner power compare -s 11 -d 20
  • Generate Comparison Charts:
runner power multi 2 16 18
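
The table printed by `runner power list` can also be post-processed outside the tool. The sketch below parses text in the layout shown above into dictionaries and ranks runs of one scale by runtime. It is an assumption-laden illustration: `parse_power_list` and `runs_by_runtime` are hypothetical helpers, and the parser relies on the table layout staying exactly as in the sample output.

```python
SAMPLE = """\
+------+----------+---------------------+-----------+---------------+---------+
|   ID | DB       | Date                | Success   |   Runtime (s) | Scale   |
|------+----------+---------------------+-----------+---------------+---------|
|    2 | mysql    | 2025-01-19 21:48:23 | True      |        0.0492 | small   |
|   10 | rapidsdb | 2025-01-27 19:44:22 | True      |        5.5694 | small   |
|   17 | duckdb   | 2025-01-30 20:21:50 | True      |        0.8701 | small   |
|   20 | pg       | 2025-01-30 23:53:07 | True      |       14.8139 | 1       |
+------+----------+---------------------+-----------+---------------+---------+
"""


def parse_power_list(text: str) -> list[dict[str, str]]:
    """Parse the table text into one dict per result row, keyed by column header."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        # Keep only "| cell | cell |" lines; skip borders and the header rule
        if not (line.startswith("|") and line.endswith("|")) or line.startswith("|-"):
            continue
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def runs_by_runtime(rows: list[dict[str, str]], scale: str) -> list[dict[str, str]]:
    """Return the rows for one scale, fastest first."""
    same_scale = [r for r in rows if r["Scale"] == scale]
    return sorted(same_scale, key=lambda r: float(r["Runtime (s)"]))


if __name__ == "__main__":
    for run in runs_by_runtime(parse_power_list(SAMPLE), "small"):
        print(run["ID"], run["DB"], run["Runtime (s)"])
```

Filtering by scale before sorting matters because runtimes measured at different scale factors are not directly comparable.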

Result Analysis

tpch_runner provides a variety of ways to analyze and visualize your benchmarking results:

  • Manage Powertest and query results.
  • View test result details.
  • Validate results against known good answers.
  • Compare results from different databases or test runs.
  • Generate line and bar charts for visualizing Powertest performance.
  • Create multi-result comparison charts.

[Figure: Powertest bar chart]

[Figure: multi-result comparison line chart]

Supported Databases

  • MySQL
  • PostgreSQL
  • RapidsDB
  • DuckDB

tpch_runner's open architecture makes it straightforward to integrate additional databases.


For more details, refer to the documentation or run runner -h for CLI usage guidance.
