timeit for multiple functions with better reporting
Project description
Tired of writing the same code again and again when comparing the runtime of more than one function? timethese helps with this type of micro-benchmarking. It runs timeit (or, more precisely, repeat) on multiple functions and prints a report.
In one sentence: timethese is timeit for multiple functions with better reporting.
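For context, the kind of repetitive code meant above might look like the following standard-library-only sketch. The helper `best_time` is a hypothetical name used for illustration and is not part of timethese.

```python
import timeit

xs = range(10)


def map_hex():
    list(map(hex, xs))


def list_compr_hex():
    [hex(x) for x in xs]


def best_time(fn, number=10000, repeat=3):
    """Return the fastest total time in seconds over `repeat` runs of `number` calls."""
    # hypothetical helper, not part of timethese
    return min(timeit.repeat(fn, number=number, repeat=repeat))


# Time every candidate and print them from fastest to slowest
results = {fn.__name__: best_time(fn) for fn in (map_hex, list_compr_hex)}
for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:>16}: {seconds:.4f} s for 10000 calls")
```

Every new comparison needs the same loop, sorting and formatting again, which is the part timethese takes over.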
Free software: MIT License
Installation
pip install timethese
You can also install the in-development version with:
pip install https://github.com/jwbargsten/python-timethese/archive/master.zip
Usage
Microbenchmark
timethese has a three-step approach:
1. define the functions you want to compare
2. feed them to cmpthese as a list or dict (see below)
3. format the results, aka pretty print
Let’s have a look:
    from timethese import cmpthese, pprint_cmp, timethese

    xs = range(10)


    # 1. DEFINE FUNCTIONS

    def map_hex():
        list(map(hex, xs))


    def list_compr_hex():
        list([hex(x) for x in xs])


    def map_lambda():
        list(map(lambda x: x + 2, xs))


    def map_lambda_fn():
        fn = lambda x: x + 2
        list(map(fn, xs))


    def list_compr_nofn():
        list([x + 2 for x in xs])


    # 2. FEED THE FUNCTIONS TO CMPTHESE

    # AS DICT:
    cmp_res_dict = cmpthese(
        10000,
        {
            "map_hex": map_hex,
            "list_compr_hex": list_compr_hex,
            "map_lambda": map_lambda,
            "map_lambda_fn": map_lambda_fn,
            "list_compr_nofn": list_compr_nofn,
        },
        repeat=3,
    )

    # OR AS LIST:
    cmp_res_list = cmpthese(
        10000,
        [map_hex, list_compr_hex, map_lambda, map_lambda_fn, list_compr_nofn],
        repeat=3,
    )

    # 3. PRETTY PRINT THE RESULTS
    print(pprint_cmp(cmp_res_dict))
    print(pprint_cmp(cmp_res_list))
What do you get if you run this?
Depending on the runtime of the supplied functions, the report shows either the rate (unit: 1/s) or the seconds per iteration (s/iter).
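The two units are reciprocals of each other. Here is a minimal sketch of how such a switch could work. The function name `format_speed` and the threshold of one call per second are assumptions for illustration; the exact rule timethese applies is not stated here.

```python
def format_speed(iterations, seconds):
    """Format a timing as a rate (calls per second) or as seconds per call."""
    rate = iterations / seconds
    if rate >= 1:  # assumed threshold, chosen for illustration only
        return f"{rate:.0f}/s"
    return f"{seconds / iterations:.3g} s/iter"


print(format_speed(10000, 0.00722))  # fast function: shown as a rate
print(format_speed(2, 5.0))          # slow function: shown as s/iter
```

Showing slow functions as s/iter avoids rates like 0.4/s, which are harder to read than 2.5 s/iter.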
For dict something like:
                          Rate  list_compr_nofn  map_hex  map_lambda  map_lambda_fn  list_compr_hex
    list_compr_nofn  1385057/s                .      43%         47%            48%             88%
            map_hex   969501/s             -30%        .          3%             4%             31%
         map_lambda   940257/s             -32%      -3%           .             1%             27%
      map_lambda_fn   935508/s             -32%      -4%         -1%              .             27%
     list_compr_hex   738367/s             -47%     -24%        -21%           -21%               .
For list something like:
                            Rate  4.list_compr_nofn  0.map_hex  2.map_lambda  3.map_lambda_fn  1.list_compr_hex
    4.list_compr_nofn  1360009/s                  .        31%           42%              46%               78%
            0.map_hex  1037581/s               -24%          .            9%              11%               36%
         2.map_lambda   955513/s               -30%        -8%             .               2%               25%
      3.map_lambda_fn   933666/s               -31%       -10%           -2%                .               22%
     1.list_compr_hex   763397/s               -44%       -26%          -20%             -18%                 .
(The function names are taken from fn.__name__ and prefixed with the list index.)
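Each cell reads as "the row function is this much faster (positive) or slower (negative) than the column function". The sketch below assumes the percentage is the rate ratio minus one, rounded to a whole percent; that formula is inferred from the numbers, not taken from the timethese source, but it reproduces the percentages in the dict report above. `percent_faster` is a hypothetical helper name.

```python
# Rates (calls per second) copied from the dict report above
rates = {
    "list_compr_nofn": 1385057,
    "map_hex": 969501,
    "map_lambda": 940257,
    "map_lambda_fn": 935508,
    "list_compr_hex": 738367,
}


def percent_faster(row_rate, col_rate):
    """Percentage by which the row is faster (+) or slower (-) than the column."""
    # assumed formula, inferred from the report
    return round((row_rate / col_rate - 1) * 100)


# Nested dict: matrix[row][col], skipping the diagonal (shown as "." in the report)
matrix = {
    row: {col: percent_faster(r, c) for col, c in rates.items() if col != row}
    for row, r in rates.items()
}

print(matrix["list_compr_nofn"]["map_hex"])  # 43
print(matrix["map_hex"]["list_compr_nofn"])  # -30
```

Note that the matrix is not antisymmetric: being 43% faster in one direction corresponds to being 30% slower in the other, because each cell uses a different baseline.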
Timing
The package also exposes the function timethese, which cmpthese uses internally. To get the timings directly, you can run:
    from timethese import timethese

    xs = range(10)


    def map_hex():
        list(map(hex, xs))


    def list_compr_hex():
        list([hex(x) for x in xs])


    def map_lambda():
        list(map(lambda x: x + 2, xs))


    def map_lambda_fn():
        fn = lambda x: x + 2
        list(map(fn, xs))


    def list_compr_nofn():
        list([x + 2 for x in xs])


    timings_dict = timethese(
        10000,
        {
            "map_hex": map_hex,
            "list_compr_hex": list_compr_hex,
            "map_lambda": map_lambda,
            "map_lambda_fn": map_lambda_fn,
            "list_compr_nofn": list_compr_nofn,
        },
        repeat=3,
    )

    timings_list = timethese(
        10000,
        [map_hex, list_compr_hex, map_lambda, map_lambda_fn, list_compr_nofn],
        repeat=3,
    )

    # if you want, you can create a pandas df from it
    import pandas as pd

    timings_df = pd.DataFrame(timings_dict.values())
    print(timings_df)

    # BEWARE: if you pass a list to timethese, you have to skip the .values() call
    timings_df = pd.DataFrame(timings_list)
    print(timings_df)
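When functions are passed as a list, the comparison report labels them by list index and fn.__name__. A hypothetical helper that builds such labels could look like the sketch below. `label_functions` is not part of timethese, and the library's internals may handle list input differently.

```python
def label_functions(funcs):
    """Return a {label: function} dict from either a dict or a list of functions."""
    # hypothetical helper, shown only to illustrate the labelling scheme
    if isinstance(funcs, dict):
        return dict(funcs)
    # List input: label each entry as "<index>.<function name>"
    return {f"{i}.{fn.__name__}": fn for i, fn in enumerate(funcs)}


def map_hex():
    list(map(hex, range(10)))


def list_compr_hex():
    [hex(x) for x in range(10)]


print(list(label_functions([map_hex, list_compr_hex])))
# ['0.map_hex', '1.list_compr_hex']
```

The index prefix keeps labels unique even when the list contains several lambdas, which all share the name `<lambda>`.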
Timing functions with decorators
timethese also provides decorators to time single functions:
    import time
    import timethese


    @timethese.print_time
    def calculate_something():
        time.sleep(1)


    calculate_something()
Four decorators are provided, two for general use
timethese.print_time
timethese.log_time(logger, level=logging.INFO)
and two for pandas DataFrames (they also print the shape of the resulting DataFrame), which is useful with df.pipe(...)
timethese.log_time_df(logger, level=logging.INFO)
timethese.print_time_df
For example, to log execution times of pipe operations on pandas DataFrames, you could write:
    import time
    import logging
    import timethese
    import numpy as np
    import pandas as pd

    logging.basicConfig(level=logging.DEBUG)
    logger = logging.getLogger(__name__)


    @timethese.log_time_df(logger, logging.DEBUG)
    def sum_by_group(df):
        time.sleep(1)  # introduce some artificial delay
        return df.groupby("A").sum()


    df = pd.DataFrame({"A": np.arange(100) % 2, "B": np.random.normal(size=100)})
    res = df.pipe(sum_by_group)
See the function documentation in the source code for better examples.
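To give a rough idea of how a timing decorator of this kind can be built, here is a standard-library-only sketch. The name `log_elapsed`, the message format and the shape handling are assumptions for illustration and do not reproduce the timethese implementation.

```python
import functools
import logging
import time


def log_elapsed(logger, level=logging.INFO):
    """Decorator factory: log how long each call to the wrapped function takes."""
    # hypothetical decorator, not the timethese implementation

    def decorator(fn):
        @functools.wraps(fn)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed = time.perf_counter() - start
            # DataFrame-like results expose .shape; other results are logged without it
            shape = getattr(result, "shape", None)
            suffix = "" if shape is None else f", shape={shape}"
            logger.log(level, "%s took %.3f s%s", fn.__name__, elapsed, suffix)
            return result

        return wrapper

    return decorator


logger = logging.getLogger("timing_sketch")


@log_elapsed(logger, logging.DEBUG)
def add(a, b):
    return a + b


print(add(1, 2))  # 3
```

Because the wrapper returns the original result unchanged, a decorated function can still be passed to df.pipe(...) like any other function.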
Development
To run all the tests, run:
tox
Note: to combine the coverage data from all the tox environments, run:
Windows:
    set PYTEST_ADDOPTS=--cov-append
    tox
Other:
    PYTEST_ADDOPTS=--cov-append tox
See also
The idea came from Perl’s Benchmark.pm, which I used a lot in the Good Ol’ Days.
Changelog
0.0.7 (2020-05-31)
Improved documentation and fixed typos
0.0.6 (2020-05-31)
Improved documentation
Fixed setup.py install dependencies so that the Travis tests pass again
0.0.5 (2020-05-31)
Added better documentation
Now using NumPy documentation format for function def doc
Fixed typos in pyproject.toml
0.0.4 (2020-05-30)
Fixed code to be compatible with Python 3.5
Fixed travis stuff
Added decorators to time specific functions for pandas.DataFrame.pipe arguments
0.0.3 (2020-05-27)
First release on PyPI.