Tools for Running Benchmarks

Python {benchrun}

{benchrun} is a Python package to run macrobenchmarks, deliberately designed to work well with the larger conbench ecosystem.

Installation

{benchrun} is not [yet] on a package archive like PyPI; you can install it from GitHub with

pip install benchrun@git+https://github.com/conbench/conbench.git@main#subdirectory=benchrun/python

Writing benchmarks

Iteration

The code to run for a benchmark is contained in a class inheriting from the abstract Iteration class. At a minimum, users must override the name attribute and the run() method (the code to time), but may also override the setup(), before_each(), after_each(), and teardown() methods: the *_each() methods run before/after each iteration, while setup() and teardown() run once before/after all iterations. A simple implementation might look like

import time

from benchrun import Iteration

class MyIteration(Iteration):
    name = "my-iteration"

    def before_each(self, case: dict) -> None:
        # use the `env` dict attribute to pass data between stages
        self.env = {"success": False}

    def run(self, case: dict) -> None:
        # code to time goes here
        time.sleep(case["sleep_seconds"])
        self.env["success"] = True

    def after_each(self, case: dict) -> None:
        # confirm the timed code succeeded, then reset `env`
        assert self.env["success"]
        self.env = {}

CaseList

An Iteration's methods are parameterized with case, a dict whose keys are the benchmark's parameters and whose values are scalar arguments. Cases are managed with an instance of CaseList, a class that takes a params dict; it is shaped like a case dict, except each value is a list of valid arguments rather than a single scalar. CaseList will populate a case_list attribute containing the grid of specified cases to be run:

from benchrun import CaseList

case_list = CaseList(params={"x": [1, 2], "y": ["a", "b", "c"]})
case_list.case_list
#> [{'x': 1, 'y': 'a'},
#>  {'x': 1, 'y': 'b'},
#>  {'x': 1, 'y': 'c'},
#>  {'x': 2, 'y': 'a'},
#>  {'x': 2, 'y': 'b'},
#>  {'x': 2, 'y': 'c'}]

CaseList contains an overridable filter_cases() method that can be used to remove invalid combinations of parameters, e.g. if an x of 2 with a y of "b" is not viable:

from typing import Any, Dict, List

class MyCaseList(CaseList):
    def filter_cases(self, case_list: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        filtered_case_list = []
        for case in case_list:
            if not (case["x"] == 2 and case["y"] == "b"):
                filtered_case_list.append(case)

        return filtered_case_list

my_case_list = MyCaseList(params={"x": [1, 2], "y": ["a", "b", "c"]})
my_case_list.case_list
#> [{'x': 1, 'y': 'a'},
#>  {'x': 1, 'y': 'b'},
#>  {'x': 1, 'y': 'c'},
#>  {'x': 2, 'y': 'a'},
#>  {'x': 2, 'y': 'c'}]

If there are so many restrictions that it is simpler to specify which cases are viable than which are not, the case_list parameter of filter_cases() can be completely ignored and a manually-generated list can be returned.
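
For instance, a minimal sketch of that approach (the hard-coded cases here are illustrative):

from typing import Any, Dict, List

from benchrun import CaseList

class ManualCaseList(CaseList):
    def filter_cases(self, case_list: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # ignore the generated grid entirely and return the viable cases by hand
        return [
            {"x": 1, "y": "a"},
            {"x": 2, "y": "c"},
        ]

manual_case_list = ManualCaseList(params={"x": [1, 2], "y": ["a", "b", "c"]})
manual_case_list.case_list
#> [{'x': 1, 'y': 'a'}, {'x': 2, 'y': 'c'}]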

Benchmark

A Benchmark in {benchrun} consists of an Iteration instance, a CaseList instance, and potentially a bit more metadata about how to run it, such as whether to drop disk caches beforehand.

from benchrun import Benchmark

my_iteration = MyIteration()
my_benchmark = Benchmark(iteration=my_iteration, case_list=my_case_list)

This class has a run() method to run all cases and a run_case() method to run a single one.
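
For example, a quick sketch (assuming run_case() accepts the case dict to run, which is not spelled out above):

my_benchmark.run()                              # run every case in my_case_list
my_benchmark.run_case(case={"x": 1, "y": "a"})  # run a single case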

BenchmarkList

A BenchmarkList is a lightweight class to tie together all the instances of Benchmark that should be run together (e.g. all the benchmarks for a package).

from benchrun import BenchmarkList

my_benchmark_list = BenchmarkList(benchmarks=[my_benchmark])

The class has a __call__() method that will run all benchmarks in its list, taking care that they all use the same run_id so they will all appear together on conbench.
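
Calling the instance might then look like this sketch (that the call returns the collected results is an assumption here, though it is consistent with the CallableAdapter integration below):

# run every benchmark in the list under a single shared run_id
results = my_benchmark_list()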

Running benchmarks and sending results to conbench

BenchmarkList is designed to work seamlessly with {benchadapt}'s CallableAdapter class:

from benchadapt.adapters import CallableAdapter

my_adapter = CallableAdapter(callable=my_benchmark_list)

Like all adapters, it then has a run() method to run all the benchmarks it contains (handling generic metadata appropriately for you), a post_results() method that will send the results to a conbench server, and a __call__() method that will do both. These are the methods that should be called in whatever CI or automated build system will be used for running benchmarks.
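
A CI entry point might be as small as this sketch (it assumes the conbench server connection is already configured for the adapter, e.g. via environment variables):

# run all benchmarks and send the results to conbench in one go
my_adapter()

# or, separately:
my_adapter.run()           # run all benchmarks, collecting results
my_adapter.post_results()  # send the collected results to the server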

Setting more metadata

{benchrun} and {benchadapt} make an effort to handle as much metadata for you as possible (e.g. machine info), but you will still need to specify some metadata yourself, e.g. build flags used in compilation, or run_reason (often something like commit or merge). To see what actually gets sent to conbench, see the documentation for benchadapt.BenchmarkResult.
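
As a hedged sketch, one way this might look, assuming the adapter constructor accepts a result_fields_override dict to stamp onto every result (an assumption to verify against the benchadapt docs):

from benchadapt.adapters import CallableAdapter

my_adapter = CallableAdapter(
    callable=my_benchmark_list,
    # `result_fields_override` is an assumed parameter name; check the benchadapt docs
    result_fields_override={"run_reason": "merge"},
)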
