
Project description

Benchmarkit


Benchmark and analyze functions' time execution and results over the course of development.

Features

  • No boilerplate code
  • Saves history and additional info
  • Saves function output and parameters to benchmark data science tasks
  • Easy to analyze results
  • Disables the garbage collector during benchmarking (see the sketch below)
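
Disabling the garbage collector while timing is a standard way to keep collection pauses out of the measurements. The following is only a minimal sketch of that general technique using the standard library, not Benchmarkit's actual implementation:

import gc
import time

def time_call(func, num_iters=100):
    """Return the mean wall-clock time of func over num_iters calls, with the GC paused."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # keep GC pauses out of the measurement
    try:
        start = time.perf_counter()
        for _ in range(num_iters):
            func()
        return (time.perf_counter() - start) / num_iters
    finally:
        if gc_was_enabled:
            gc.enable()  # restore the collector to its previous state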

Motivation

  • I need to benchmark execution time of my function
  • I don't want to memorize and write boilerplate code
  • I want to compare results with previous runs before some changes were introduced
  • I don't want to manually write down results somewhere
  • I want to know exact commits of my previous runs months ago
  • I want to benchmark accuracy, precision, recall of my models and keep track of hyperparameters

Installation

pip install benchmarkit

Usage

Benchmark execution times

Put the @benchmark decorator over a function containing the code that should be timed:

from benchmarkit import benchmark, benchmark_run

N = 10000
seq_list = list(range(N))
seq_set = set(range(N))

SAVE_PATH = '/tmp/benchmark_time.jsonl'


@benchmark(num_iters=100, save_params=True, save_output=False)
def search_in_list(num_items=N):
    return num_items - 1 in seq_list


@benchmark(num_iters=100, save_params=True, save_output=False)
def search_in_set(num_items=N):
    return num_items - 1 in seq_set

  • num_iters - how many times to repeat the benchmarked function. Default 1
  • save_params - save the parameters passed to the benchmarked function in the file with benchmark results. In the example above num_items will be saved. Default False
  • save_output - save the benchmarked function's output; the function should return a dict {'name': value}. Default False. See the example below on how to benchmark model results.
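
As an aside, the list-vs-set membership gap that the two decorated functions above measure can be cross-checked with the standard library's timeit, independently of Benchmarkit:

import timeit

N = 10000
seq_list = list(range(N))
seq_set = set(range(N))

# Membership in a list is O(N), while in a set it is O(1) on average;
# this is the gap the benchmark above makes visible.
list_time = timeit.timeit(lambda: (N - 1) in seq_list, number=100)
set_time = timeit.timeit(lambda: (N - 1) in seq_set, number=100)
print(f'list: {list_time:.6f}s  set: {set_time:.6f}s')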

Run benchmark:

benchmark_results = benchmark_run(
    [search_in_list, search_in_set],
    SAVE_PATH,
    comment='initial benchmark search',
    rows_limit=10,
    extra_fields=['num_items'],
    metric='mean_time',
    bigger_is_better=False,
)

  • functions - a function or a list of functions decorated with @benchmark
  • save_file - path to the file where results are saved
  • comment - comment to save alongside the results
  • rows_limit - limit the number of table rows in the console output. Default 10
  • extra_fields - extra fields to include in the console output
  • metric - metric used for comparison. Default mean_time
  • bigger_is_better - whether a bigger metric value indicates a better result. Should be False for time benchmarks and True for model accuracy. Default False

Prints to the terminal and returns a list of dictionaries with data for the last run.

Benchmark time output1
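
The returned list can also be post-processed in Python. A minimal sketch, with the caveat that field names such as 'name' and 'mean_time' are assumptions for illustration, not a documented schema:

# benchmark_results comes from the benchmark_run call above
for entry in benchmark_results:
    # field names are assumed here; adjust to the actual saved schema
    print(entry.get('name'), entry.get('mean_time'))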

Change N=1000000 and rerun:

Benchmark time output2

The same can be run from the command line:

benchmark_run test_data/time/benchmark_functions.py --save_dir /tmp/ --comment "million items" --extra_fields num_items
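
Here test_data/time/benchmark_functions.py is a module containing @benchmark-decorated functions that the CLI picks up. Presumably it looks much like the earlier example; a hedged sketch, with N bumped to match the "million items" comment:

# benchmark_functions.py (illustrative sketch, not the actual file from the repository)
from benchmarkit import benchmark

N = 1000000
seq_list = list(range(N))
seq_set = set(range(N))

@benchmark(num_iters=100, save_params=True)
def search_in_list(num_items=N):
    return num_items - 1 in seq_list

@benchmark(num_iters=100, save_params=True)
def search_in_set(num_items=N):
    return num_items - 1 in seq_set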

Benchmark model results

from benchmarkit import benchmark, benchmark_run
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

MODEL_BENCHMARK_SAVE_FILE = '/tmp/benchmark_model.jsonl'

x, y = load_iris(return_X_y=True)

@benchmark(save_params=True, save_output=True)
def log_regression(C=1.0, fit_intercept=True):
    clf = LogisticRegression(
        random_state=0, 
        solver='lbfgs', 
        multi_class='multinomial', 
        C=C,
        fit_intercept=fit_intercept,
    )
    clf.fit(x, y)
    score = clf.score(x, y)
    return {'score': score}

model_benchmark_results = benchmark_run(
    log_regression,
    MODEL_BENCHMARK_SAVE_FILE,
    comment='baseline model',
    extra_fields=['C', 'fit_intercept'],
    metric='score',
    bigger_is_better=True,
)

Benchmark model1

Change the hyperparameter to C=0.5 and rerun. Output:

Benchmark model2

The same can be run from the command line:

benchmark_run file_with_benchmark.py --save_dir /tmp/ --comment "stronger regularization" --extra_fields C fit_intercept --metric score --bigger_is_better
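
Because save_output only requires the benchmarked function to return a dict of {'name': value} pairs, several metrics can be tracked at once (accuracy, precision, recall and so on, as mentioned in the Motivation section). A sketch building on the example above; the function name and metric choices here are illustrative:

from benchmarkit import benchmark
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

x, y = load_iris(return_X_y=True)

@benchmark(save_params=True, save_output=True)
def log_regression_full(C=1.0, fit_intercept=True):
    # Same model as above; every key in the returned dict ends up in the saved record
    clf = LogisticRegression(
        random_state=0,
        solver='lbfgs',
        multi_class='multinomial',
        C=C,
        fit_intercept=fit_intercept,
    )
    clf.fit(x, y)
    preds = clf.predict(x)
    return {
        'score': clf.score(x, y),
        'precision': precision_score(y, preds, average='macro'),
        'recall': recall_score(y, preds, average='macro'),
    }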

Analyze results from the file

from benchmarkit import benchmark_analyze

SAVE_PATH = '/tmp/benchmark_time.jsonl'

benchmark_df = benchmark_analyze(
    SAVE_PATH,
    func_name=None, 
    rows_limit=10,
    metric='mean_time',
    bigger_is_better=False,
    extra_fields=['num_items'],
)

  • input_path - path to a .jsonl file or a directory of .jsonl files with benchmark results
  • func_name - display statistics for a particular function. If None, all functions stored in the file are displayed. Default None
  • rows_limit - limit the number of table rows in the console output. Default 10
  • metric - metric used for comparison. Default mean_time
  • bigger_is_better - whether a bigger metric value indicates a better result. Should be False for time benchmarks and True for model accuracy. Default False
  • extra_fields - extra fields to include in the console output

Prints to the terminal and returns a pandas DataFrame.

Benchmark analyze
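
The returned DataFrame can be filtered and sorted like any other. A short sketch, with the caveat that column names such as 'mean_time' and 'num_items' are assumptions based on the chosen metric and extra_fields:

# benchmark_df comes from the benchmark_analyze call above
# column names are assumed here; adjust to the actual saved schema
fastest = benchmark_df.sort_values('mean_time').head(5)
print(fastest[['mean_time', 'num_items']])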

The same can be run from the command line:

benchmark_analyze /tmp/benchmark_time.jsonl --extra_fields num_items

Other examples

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

benchmarkit-0.0.3.tar.gz (7.6 kB view details)

Uploaded Source

Built Distribution

benchmarkit-0.0.3-py3-none-any.whl (9.8 kB view details)

Uploaded Python 3

File details

Details for the file benchmarkit-0.0.3.tar.gz.

File metadata

  • Download URL: benchmarkit-0.0.3.tar.gz
  • Upload date:
  • Size: 7.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.6.5

File hashes

Hashes for benchmarkit-0.0.3.tar.gz

  • SHA256: dc769c51869efb923cc35a96cbca16fa6cc7aaababee9112494f49f8f663332d
  • MD5: f717147888011b29a12171e8babef648
  • BLAKE2b-256: 2704e2c0c1a51a2d71a217a347056d7c68d56027a8b98b70a1857f6cd8abb5a7

See more details on using hashes.

File details

Details for the file benchmarkit-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: benchmarkit-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 9.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/3.6.5

File hashes

Hashes for benchmarkit-0.0.3-py3-none-any.whl

  • SHA256: c56048a178cee300fc3be90a39bb3cbd4654936ccac30f4fd056add34097ef69
  • MD5: d43d0ab97e19aac9bf50f69350ff3037
  • BLAKE2b-256: a2df1b5ad6a2606b5131ae5cd07ce29625d5b34ed02cdcf11caede664e430454

See more details on using hashes.
