A tool for running TPC-H benchmarks and analyzing results.

tpch_runner: TPC-H Benchmark Tool

tpch_runner is a database-agnostic tool for running and analyzing the TPC-H benchmark across multiple databases. It simplifies setup, execution, and result analysis, making TPC-H testing of database systems much more efficient.

Features

  • CLI-driven TPC-H benchmarking:
    • Manage database connections.
    • Generate and load TPC-H test data.
    • Prepare databases (e.g., table creation, optimization, and data reloading).
    • Run individual queries or full TPC-H Powertests.
  • Comprehensive Result Analysis:
    • Manage, validate, and compare benchmark results.
    • Generate charts to visualize Powertest results.
    • Use the bundled small dataset to verify database setup and TPC-H compliance.
  • Multi-database support:
    • Built-in support for MySQL, PostgreSQL, DuckDB, and RapidsDB.
    • Open architecture to easily integrate additional databases.

Installation

Getting started with tpch_runner is quick and simple. Either clone the repository and install it in editable mode, or install the published package from PyPI:

# Option 1: install from a clone of the GitHub repository
git clone https://github.com/your-repo/tpch_runner.git
cd tpch_runner
pip install -e .

# Option 2: install the package from PyPI
pip install tpch_runner

Important Notes

  • Test Data Generation: tpch_runner supports TPC-H test data generation but does not include dbgen or qgen. You need to manually add the following files (the compiled dbgen and qgen binaries, plus dists.dss) to the tpch_runner/tpch/tool directory:
    • dbgen
    • qgen
    • dists.dss
  • Line Delimiters:
    • The official TPC-H dbgen ends every row with a trailing | before the newline (|\n), which some databases (e.g., PostgreSQL) may not accept. You can either remove these trailing delimiters manually or use a TPC-H dbgen variant that avoids them.
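
One way to remove the trailing delimiters is a short preprocessing script run over the generated .tbl files. The sketch below is not part of tpch_runner; the function names strip_trailing_delimiter and clean_tbl_file are hypothetical, and it assumes one row per line in dbgen's usual pipe-separated format.

```python
from pathlib import Path


def strip_trailing_delimiter(line: str, delimiter: str = "|") -> str:
    """Remove one trailing delimiter that dbgen writes before the newline."""
    body = line.rstrip("\r\n")
    newline = line[len(body):]  # preserve the original line ending, if any
    if body.endswith(delimiter):
        body = body[: -len(delimiter)]
    return body + newline


def clean_tbl_file(src: Path, dst: Path, delimiter: str = "|") -> int:
    """Write a copy of src without trailing delimiters; return lines written."""
    count = 0
    # newline="" keeps line endings untouched on both read and write.
    with src.open("r", encoding="utf-8", newline="") as fin, \
         dst.open("w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(strip_trailing_delimiter(line, delimiter))
            count += 1
    return count
```

Only a single trailing delimiter is removed per row, so an empty last column (two pipes before the newline) still keeps its field separator.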

Getting Started

To use tpch_runner, simply run the runner CLI tool. Use -h or --help for detailed help on any command:

$ runner -h
Usage: runner [OPTIONS] COMMAND [ARGS]...

Typical Benchmark Workflow

  1. Set up a database connection.
  2. Prepare the TPC-H database (create tables, generate and load data, optimize).
  3. Run individual queries or a full TPC-H Powertest.
  4. Analyze benchmark results.
  5. Compare results across different runs or databases.
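
Steps 2 and 3 of this workflow can be scripted by chaining the runner subcommands that appear in this README's example commands. The following is a minimal sketch, not a tpch_runner API: build_workflow and run_workflow are hypothetical helper names, the connection alias is assumed to exist already (step 1 prompts for a password, so it is left interactive), and runner is assumed to exit with a non-zero status on failure.

```python
import subprocess
from typing import List, Optional


def build_workflow(alias: str, delimiter: Optional[str] = None) -> List[List[str]]:
    """Return the runner commands for create -> load -> powertest, in order."""
    load_cmd = ["runner", "db", "load", "-a", alias]
    if delimiter is not None:
        load_cmd += ["-m", delimiter]  # -m as used in this README's load example
    return [
        ["runner", "db", "create", "-a", alias],
        load_cmd,
        ["runner", "run", "powertest", "-a", alias],
    ]


def run_workflow(alias: str, delimiter: Optional[str] = None) -> None:
    """Run each step in order and stop at the first failing command."""
    for cmd in build_workflow(alias, delimiter):
        print("+", " ".join(cmd))
        # Assumption: runner signals failure through a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling run_workflow("duck", ",") would issue the same three commands as the DuckDB examples in this README; keeping command construction separate from execution makes the sequence easy to inspect before anything is run.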

Example Commands

  • Add a Database Connection:
$ runner db add -H localhost -t mysql -u root -W -d tpch -a mysql2 -p 3306
Enter database password:
[INFO] Added database connection.
  • Create Tables:
runner db create -a my1
  • Load Data:
runner db load -a duck -m ','
  • Run a Single Query:
runner run query -a duck 15 --no-report
  • Run a TPC-H Powertest:
runner run powertest -a duck
  • Result Analysis:
$ runner power list
+------+----------+---------------------+-----------+---------------+---------+
|   ID | DB       | Date                | Success   |   Runtime (s) | Scale   |
|------+----------+---------------------+-----------+---------------+---------|
|    2 | mysql    | 2025-01-19 21:48:23 | True      |        0.0492 | small   |
|   10 | rapidsdb | 2025-01-27 19:44:22 | True      |        5.5694 | small   |
|   17 | duckdb   | 2025-01-30 20:21:50 | True      |        0.8701 | small   |
|   20 | pg       | 2025-01-30 23:53:07 | True      |       14.8139 | 1       |
+------+----------+---------------------+-----------+---------------+---------+
  • Validate Test Results:
runner power validate 18
  • Compare Two Test Results:
runner power compare -s 11 -d 20
  • Generate Comparison Charts:
runner power multi 2 16 18
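
When choosing which result IDs to validate or compare, it can help to read the runner power list output programmatically. The sketch below parses the table layout shown in the sample output above; it is an illustration only, the helper names parse_power_list and fastest_run are hypothetical, and the exact table format is an assumption that may differ between versions.

```python
from typing import Dict, List


def parse_power_list(text: str) -> List[Dict[str, str]]:
    """Parse the ASCII table printed by `runner power list` into row dicts."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        # Skip borders such as +-----+ and |-----+-----|; keep header and data lines.
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if not header:
            header = cells  # first non-border line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def fastest_run(rows: List[Dict[str, str]], scale: str) -> Dict[str, str]:
    """Return the successful row with the lowest runtime at a given scale."""
    candidates = [r for r in rows if r["Scale"] == scale and r["Success"] == "True"]
    # min() raises ValueError if no run matches the requested scale.
    return min(candidates, key=lambda r: float(r["Runtime (s)"]))
```

All values are kept as strings, so an ID picked this way can be passed straight to commands such as runner power validate or runner power compare.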

Result Analysis

tpch_runner provides a variety of ways to analyze and visualize your benchmarking results:

  • Manage Powertest and query results.
  • View test result details.
  • Validate results against known good answers.
  • Compare results from different databases or test runs.
  • Generate line and bar charts for visualizing Powertest performance.
  • Create multi-result comparison charts.

[Figure: bar chart of Powertest results]

[Figure: line chart comparing multiple results]

Supported Databases

  • MySQL
  • PostgreSQL
  • RapidsDB
  • DuckDB

Integrating additional databases is straightforward thanks to tpch_runner's open architecture.


For more details, refer to the documentation or run runner -h for CLI usage guidance.
