lbench - Benchmarking Tool for Python Projects

Benchmarking tools for LSDB.
lbench is a benchmarking tool built on top of pytest and pytest-benchmark, designed to make it easy to write, run, and analyze benchmarks for Python projects. It provides automatic result logging, cProfile profiling and flamegraphs, Dask performance reporting, memory tracking, a Jupyter notebook magic, and a dashboard for visualizing and comparing benchmark results over time.
Installation
```bash
git clone https://github.com/lincc-frameworks/lsdb-benchmarking.git
cd lsdb-benchmarking
pip install -e .
```
Writing Benchmarks
Basic Benchmarks with the lbench Fixture
The lbench fixture extends pytest-benchmark with automatic cProfile profiling:
```python
def test_my_function(lbench):
    def benchmark_func():
        result = my_function()

    lbench(benchmark_func)
```
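Setup done in the test body stays outside the timed rounds; assuming lbench keeps pytest-benchmark's timing semantics, a sketch with an (illustrative) NumPy workload:

```python
import numpy as np  # illustrative dependency, not required by lbench

def test_sort_large_array(lbench):
    # Setup: runs once, outside the timed rounds.
    data = np.random.default_rng(42).random(1_000_000)

    def benchmark_func():
        np.sort(data)  # only this callable is timed

    lbench(benchmark_func)
```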
Add @pytest.mark.lbench_memory to also track peak memory usage with memray:
```python
import pytest

@pytest.mark.lbench_memory
def test_my_function(lbench):
    lbench(my_function)
```
Dask Benchmarks with lbench_dask Fixture
```python
def test_my_dask_function(lbench_dask):
    def benchmark_func():
        result = my_dask_dataframe.compute()

    lbench_dask(benchmark_func)
```
The lbench_dask fixture automatically collects Dask task stream information, generates a Dask performance
report, and samples memory usage during execution. Use lbench_dask_collection to also record the Dask graph
size and node count:
```python
import lsdb

def test_collection(lbench_dask_collection):
    catalog = lsdb.read_hats(...)
    lbench_dask_collection(catalog)
```
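For a self-contained Dask benchmark that doesn't depend on an LSDB catalog, a minimal sketch (the DataFrame workload is purely illustrative):

```python
import dask.dataframe as dd
import pandas as pd

def test_groupby_mean(lbench_dask):
    # Illustrative workload: a small partitioned DataFrame.
    pdf = pd.DataFrame({"key": [0, 1] * 500, "value": range(1000)})
    ddf = dd.from_pandas(pdf, npartitions=4)

    def benchmark_func():
        ddf.groupby("key")["value"].mean().compute()

    lbench_dask(benchmark_func)
```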
Running Benchmarks
```bash
pytest --lbench benchmarks/
```
This creates a timestamped result directory, runs all benchmarks, and saves:
- `pytest-benchmark.json` - timing stats and extra metrics
- `cprofile_*.prof` - cProfile data for each benchmark
- `dask_performance_report_*.html` - Dask performance reports (Dask benchmarks only)
- `memray_*.bin` - memory tracking data (memory-marked benchmarks only)
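The timing log can also be read programmatically. A minimal sketch, assuming `pytest-benchmark.json` follows the standard pytest-benchmark layout and that timestamped run directories sort chronologically by name:

```python
import json
from pathlib import Path

# Pick the most recent timestamped run directory (default root shown here).
latest = sorted(Path("lbench_results").iterdir())[-1]

data = json.loads((latest / "pytest-benchmark.json").read_text())
for bench in data["benchmarks"]:
    print(bench["name"], bench["stats"]["mean"])
```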
Configuring the Result Directory
```bash
# Via flag
pytest --lbench --lbench-root=/path/to/results benchmarks/

# Via environment variable
export LBENCH_ROOT=/path/to/results
pytest --lbench benchmarks/
```
If neither is set, results are saved to ./lbench_results.
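The same run can also be driven from Python, for example in a CI script. A sketch using pytest's public `pytest.main` entry point (paths are illustrative):

```python
import os

import pytest

# Equivalent to `export LBENCH_ROOT=...` followed by `pytest --lbench benchmarks/`.
os.environ["LBENCH_ROOT"] = "/path/to/results"
pytest.main(["--lbench", "benchmarks/"])
```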
Running Benchmarks in Jupyter Notebooks
lbench provides a %%lbench cell magic that produces the same JSON log format as the pytest runner,
so notebook results appear alongside pytest results in the dashboard.
Load the extension once per notebook:
```
%load_ext lbench.notebook
```
Then use the cell magic in any cell:

```
%%lbench
my_expensive_function()
```
With options:
```
%%lbench --rounds 10 --warmup --memory --profile --name my_benchmark
my_expensive_function()
```
Available options:
| Option | Short | Description |
|---|---|---|
| `--rounds N` | `-r` | Number of timed rounds (default: 5) |
| `--warmup` | `-w` | Run one un-timed warmup round first |
| `--memory` | `-m` | Track peak memory with memray |
| `--profile` | `-p` | Capture a cProfile `.prof` file |
| `--dask` | `-d` | Collect Dask metrics (task stream, memory, performance report) |
| `--collection VAR` | | Also record graph size/length from a Dask collection variable |
| `--name NAME` | `-n` | Name for this benchmark entry |
Dask benchmarks in notebooks
```
%%lbench --dask --rounds 3
my_collection.compute()
```

With graph stats from a named variable:

```
%%lbench --dask --collection src_catalog --name catalog_scan
src_catalog.compute()
```
Results within a notebook session are accumulated into a single timestamped run directory. Call
lbench.notebook.magic.reset_session() to start a fresh run directory mid-session.
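For example (a minimal sketch mirroring the dotted path above):

```python
from lbench.notebook import magic

# Subsequent %%lbench cells will log into a fresh timestamped run directory.
magic.reset_session()
```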
Viewing Results with the Dashboard
```bash
lbench dash
lbench dash --port 8051
```
Or from a notebook:
```python
from lbench.dashboard.app import run_dashboard

run_dashboard(port=8050)
```
Calling run_dashboard() again restarts the server with the new settings.
Dashboard Features
Run browser (sidebar)
- Lists all runs in chronological order
- Filter runs by date range with the date picker
- Rename runs with the pencil icon
Benchmark tables
- Per-benchmark cards showing timing stats, memory usage, and Dask metrics
- Links to open flamegraphs (cProfile) and Dask performance reports directly in the browser (the underlying `.prof` files can also be read offline; see the sketch below)
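A minimal offline-inspection sketch using the standard-library `pstats` module (the run-directory and file names are illustrative):

```python
import pstats

# Load one of the cprofile_*.prof files saved by a run.
stats = pstats.Stats("lbench_results/20240101T000000/cprofile_test_my_function.prof")
stats.sort_stats("cumulative").print_stats(10)  # top 10 entries by cumulative time
```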
Trend plots
- Click "Plot series" to switch to the trend view
- Select one or more benchmarks and a metric to plot performance over time
- Error bars show standard deviation where available
- Respects the active date filter
Example
```python
import pytest

@pytest.mark.parametrize("size", [1000, 10000, 100000])
def test_dataframe_operation(size, lbench):
    import pandas as pd

    df = pd.DataFrame({'A': range(size), 'B': range(size)})

    def benchmark_func():
        result = df['A'] + df['B']

    lbench(benchmark_func)
```
Run it with:

```bash
pytest --lbench tests/test_dataframe_operation.py
```