# AlgBench: A Python utility to run benchmarks for the empirical evaluation of algorithms
There are a number of challenges when benchmarking (long-running) algorithms:

- Saving all relevant information requires a lot of boilerplate code, and it is easy to forget something.
- If you add further instances or want to compare an additional parameter, you have to check which data is already available in order to skip existing entries. The same applies if you need to interrupt and resume the benchmark.
- Simply piping the results into a file can create a huge amount of data that no longer fits into memory.
- Proper benchmarks often take days or even weeks to run, so parallelization (e.g., with Slurm) becomes necessary, which in turn requires a thread-safe database.
- Many file formats and databases are difficult to access or impossible to repair once corrupted.
AlgBench tries to ease your life by:

- saving a lot of information and data (function arguments, return values, runtime, environment information, stdout, etc.) automatically with a single line of code
- remembering which function arguments have already been run and skipping those
- providing a compressible database to save memory, and saving highly redundant information, e.g., about the environment, only once
- providing an NFS-compatible parallel database and compatibility with distribution libraries such as slurminade
- using a simple format based on JSON and Zip to allow simple parsing and even repairing broken databases by hand
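The last point is what keeps the data recoverable: a database that is essentially JSON documents inside a Zip archive can be inspected, or even salvaged, with nothing but the Python standard library. A conceptual sketch (toy file names, not AlgBench's actual layout):

```python
import json
import os
import tempfile
import zipfile

# build a toy "database": JSON documents stored inside a zip archive
path = os.path.join(tempfile.mkdtemp(), "toy_db.zip")
with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("entry_0.json", json.dumps({"args": {"x": 1}, "result": 2}))
    zf.writestr("entry_1.json", json.dumps({"args": {"x": 2}, "result": 4}))

# inspect (or salvage) it by hand: list the members and parse each document
with zipfile.ZipFile(path) as zf:
    entries = [json.loads(zf.read(name)) for name in zf.namelist()]

print(entries)
```

Even if one member of such an archive is corrupted, the remaining JSON documents stay individually readable.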
There is a predecessor project called AeMeasure. AeMeasure made saving the data easy, but it required more boilerplate code, and reading the data was more difficult and less efficient.
## Installation

You can install AlgBench using pip:

```bash
pip install -U algbench
```
## Usage

There is one important class, `Benchmark`, to run the benchmark, and two important functions, `describe` and `read_as_pandas`, to analyze the results.
`Benchmark(path: str)` is the heart of the library. It allows you to run, save, and load a benchmark. The method `add` will run a configuration if it is not already in the database. You can also split this into `check` and `run`, which may be advisable if you want to distribute the execution.

```python
from algbench import Benchmark

benchmark = Benchmark("./test_benchmark")


def f(x, _test=2, default="default"):
    print(x)  # here you would run your algorithm
    return {"r1": x, "r2": "test"}


benchmark.add(f, 1, _test=None)
benchmark.add(f, 2)
benchmark.add(f, 3, _test=None)

benchmark.compress()

for entry in benchmark:
    print(entry["parameters"], entry["data"])
```

The following methods are thread-safe:

- `exists`
- `run`
- `add`
- `insert`
- `front`
- `__iter__`

Do not call any of the other methods while the benchmark is running; doing so could lead to data loss.
You can find an example for graph coloring. The important parts are shown below.
## Running a benchmark

```python
from _utils import InstanceDb

import networkx as nx

from algbench import Benchmark

benchmark = Benchmark("03_benchmark_data")
instances = InstanceDb("./01_instances.zip")


def load_instance_and_run(instance_name: str, alg_params):
    # load the instance outside the actual measurement
    g = instances[instance_name]

    def eval_greedy_alg(instance_name: str, alg_params, _instance: nx.Graph):
        # arguments starting with `_` are not saved
        coloring = nx.coloring.greedy_color(_instance, **alg_params)
        return {  # the returned values are saved to the database
            "num_vertices": _instance.number_of_nodes(),
            "num_edges": _instance.number_of_edges(),
            "coloring": coloring,
            "n_colors": max(coloring.values()) + 1,
        }

    benchmark.add(eval_greedy_alg, instance_name, alg_params, g)


alg_params_to_evaluate = [
    {"strategy": "largest_first", "interchange": True},
    {"strategy": "largest_first", "interchange": False},
    {"strategy": "random_sequential", "interchange": True},
    {"strategy": "random_sequential", "interchange": False},
    {"strategy": "smallest_last", "interchange": True},
    {"strategy": "smallest_last", "interchange": False},
    {"strategy": "independent_set"},
    {"strategy": "connected_sequential_bfs", "interchange": True},
    {"strategy": "connected_sequential_bfs", "interchange": False},
    {"strategy": "connected_sequential_dfs", "interchange": True},
    {"strategy": "connected_sequential_dfs", "interchange": False},
    {"strategy": "saturation_largest_first"},
]

if __name__ == "__main__":
    for instance_name in instances:
        print(instance_name)
        for conf in alg_params_to_evaluate:
            load_instance_and_run(instance_name, conf)
    benchmark.compress()
```
## Analyzing the data

```python
from algbench import describe, read_as_pandas, Benchmark

describe("./03_benchmark_data/")
```

Output:

```
result:
| num_vertices: 68
| num_edges: 697
| coloring:
|| 0: 7
|| 1: 8
|| 2: 2
|| 3: 5
|| 4: 3
|| 5: 7
|| 6: 7
|| 7: 6
|| 8: 5
|| 9: 4
|| 10: 5
|| 11: 4
|| 12: 0
|| 13: 6
|| 14: 0
|| 15: 3
|| 16: 5
|| 17: 5
|| 18: 7
|| 19: 0
|| ...
| n_colors: 9
timestamp: 2023-05-25T21:58:39.201553
runtime: 0.002952098846435547
stdout:
stderr:
env_fingerprint: 53ad3b5b29d082d7e2bca6881ec9fe35fe441ae1
args_fingerprint: 10ce65b7a61d5ecbfcb1f4e390d72122f7a1f6ec
parameters:
| func: eval_greedy_alg
| args:
|| instance_name: graph_0
|| alg_params:
||| strategy: largest_first
||| interchange: True
argv: ['02_run_benchmark.py']
env:
| hostname: workstation-r7
| python_version: 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0]
| python: /home/krupke/anaconda3/envs/mo310/bin/python3
| cwd: /home/krupke/Repositories/AlgBench/examples/graph_coloring
| environment: [{'name': 'virtualenv', 'path': '/home/krupke/.local/lib/python3.10/site-pack...
| git_revision: 5357426feb4b49174c313ffa33e2cadf6a83e226
| python_file: /home/krupke/Repositories/AlgBench/examples/graph_coloring/02_run_benchmark.py
```
```python
# we can also see the raw data of the first entry using `front`
Benchmark("./03_benchmark_data/").front()
```

Output:

```
{'result': {'num_vertices': 68, 'num_edges': 697, 'coloring': {'0': 7, '1': 8, '2': 2, '3': 5, '4': 3, '5': 7, '6': 7, '7': 6, '8': 5, '9': 4, '10': 5, '11': 4, '12': 0, '13': 6, '14': 0, '15': 3, '16': 5, '17': 5, '18': 7, '19': 0, '20': 2, '21': 3, ...}, 'n_colors': 9}, 'timestamp': '2023-05-25T21:58:39.201553', 'runtime': 0.002952098846435547, 'stdout': '', 'stderr': '', 'env_fingerprint': '53ad3b5b29d082d7e2bca6881ec9fe35fe441ae1', 'args_fingerprint': '10ce65b7a61d5ecbfcb1f4e390d72122f7a1f6ec', 'parameters': {'func': 'eval_greedy_alg', 'args': {'instance_name': 'graph_0', 'alg_params': {'strategy': 'largest_first', 'interchange': True}}}, 'argv': ['02_run_benchmark.py'], 'env': {'hostname': 'workstation-r7', 'python_version': '3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0]', 'python': '/home/krupke/anaconda3/envs/mo310/bin/python3', 'cwd': '/home/krupke/Repositories/AlgBench/examples/graph_coloring', 'environment': [{'name': 'virtualenv', 'path': '/home/krupke/.local/lib/python3.10/site-packages', 'version': '20.14.1'}, {'name': 'cfgv', 'path': '/home/krupke/.local/lib/python3.10/site-packages', 'version': '3.3.1'}, ...], 'git_revision': '5357426feb4b49174c313ffa33e2cadf6a83e226', 'python_file': '/home/krupke/Repositories/AlgBench/examples/graph_coloring/02_run_benchmark.py'}}
```
```python
# we can extract a full pandas table using `read_as_pandas`
t = read_as_pandas(
    "./03_benchmark_data/",
    lambda result: {
        "instance": result["parameters"]["args"]["instance_name"],
        "strategy": result["parameters"]["args"]["alg_params"]["strategy"],
        "interchange": result["parameters"]["args"]["alg_params"].get(
            "interchange", None
        ),
        "colors": result["result"]["n_colors"],
        "runtime": result["runtime"],
        "num_vertices": result["result"]["num_vertices"],
        "num_edges": result["result"]["num_edges"],
    },
)
print(t)
```
Output:

```
      instance                  strategy interchange  colors   runtime  ...
0      graph_0             largest_first        True       9  0.002952
1      graph_0             largest_first       False      10  0.000183
2      graph_0         random_sequential        True       9  0.003562
3      graph_0         random_sequential       False      12  0.000173
4      graph_0             smallest_last        True       9  0.003813
...        ...                       ...         ...     ...       ...
5995  graph_499  connected_sequential_bfs        True       3  0.000216
5996  graph_499  connected_sequential_bfs       False       3  0.000132
5997  graph_499  connected_sequential_dfs        True       3  0.000231
5998  graph_499  connected_sequential_dfs       False       4  0.000132
5999  graph_499  saturation_largest_first        None       3  0.000202

[6000 rows x 7 columns]
```
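Such a table can then be summarized with standard pandas operations. A minimal sketch on a small hand-made stand-in table (not the actual benchmark data), comparing strategies by their average color count and runtime:

```python
import pandas as pd

# a small stand-in for the table produced by `read_as_pandas`
t = pd.DataFrame(
    {
        "instance": ["graph_0", "graph_0", "graph_1", "graph_1"],
        "strategy": [
            "largest_first",
            "random_sequential",
            "largest_first",
            "random_sequential",
        ],
        "colors": [9, 12, 7, 8],
        "runtime": [0.0029, 0.0002, 0.0018, 0.0001],
    }
)

# average number of colors and runtime per strategy
summary = t.groupby("strategy")[["colors", "runtime"]].mean()
print(summary)
```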
## Which information is saved?

The following information is saved automatically:
- function name
- all arguments that do not begin with "_"
- the returned values
- runtime
- current date and time
- hostname
- Python version
- Python binary path
- current working directory
- stdout and stderr
- all installed modules and their versions
- git revision
- path of the python file
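Most of this environment information can be gathered from the Python standard library alone; the following sketch illustrates the idea (it is not AlgBench's actual implementation):

```python
import datetime
import os
import platform
import socket
import sys

# collect environment information similar to what AlgBench records
env_info = {
    "hostname": socket.gethostname(),
    "python_version": sys.version,
    "python": sys.executable,
    "cwd": os.getcwd(),
    "timestamp": datetime.datetime.now().isoformat(),
    "argv": sys.argv,
    "platform": platform.platform(),
}
print(env_info)
```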
## On doing good empirical evaluations of algorithms

To get a feeling for the interesting instances and parameters, or generally for where to look deeper, you should first perform an explorative study. For such an explorative study, select some random parameters and instances and just look at the numbers. Iteratively change the parameters and instances until you know what to evaluate properly. At that point, you can state some research questions and design corresponding workhorse studies to answer them.
Here are some general hints:

- Create a separate folder for every study.
- Add a README.md to each folder that describes the study. At the very least, describe in a sentence who created the study, when, and in which context.
- Have separate, numbered files for preparing, running, processing, checking, and evaluating the study.
- Extract a simplified pandas table from the database with only the important data (e.g., stdout or environment information is only necessary for debugging and does not need to be shared for evaluation). You can save pandas tables as `.json.zip` such that they are small and can simply be added to your Git repository, even when the full data is too large.
- The file for checking the generated data should also describe it.
- Use a separate Jupyter notebook for each family of plots you want to generate.
- Save the plots into files whose names you can easily trace back to the generating notebook. You will probably copy them into a paper, and half a year later, when you receive the reviews and want to make changes, you will have to find the code that generated them.
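The `.json.zip` round trip mentioned above works directly with pandas, which (in recent versions) infers zip compression from the file extension. A sketch with a hypothetical file name:

```python
import os
import tempfile

import pandas as pd

# a tiny example table standing in for the simplified benchmark results
t = pd.DataFrame({"instance": ["graph_0", "graph_1"], "colors": [9, 7]})

# pandas infers zip compression from the ".json.zip" extension
path = os.path.join(tempfile.mkdtemp(), "simplified_results.json.zip")
t.to_json(path, orient="records")

# the round trip restores the same table
t2 = pd.read_json(path, orient="records")
print(t2)
```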
## Using Git LFS for the data

The data are large binary files. Use Git LFS to add them to your repository more efficiently. You can find a guide on how to install Git LFS.

Run

```bash
git lfs install
```

to set up Git LFS and

```bash
git lfs track "*.zip"
```

to manage all zips via LFS.

Alternatively, you can also edit `.gitattributes` by hand:

```
*.zip filter=lfs diff=lfs merge=lfs -text
```

Finally, add `.gitattributes` to Git via

```bash
git add .gitattributes
```
## Version History

- 1.0.0: Changed the database layout, making it more efficient (breaking change!).
- 0.2.0: Changed the database slightly to contain metadata, added more caching, and saved some more information.
- 0.1.3: Fixed a bug in the arg fingerprint set.
- 0.1.2: Fixed a bug with empty rows in the pandas table.
- 0.1.1: Fixed a bug with `delete_if`.
- 0.1.0: First complete version.