lbench - Benchmarking Tool for Python Projects

Benchmarking tools for LSDB.

lbench is a benchmarking tool built on top of pytest and pytest-benchmark, designed to make it easy to write, run, and analyze benchmarks for Python projects. It provides automatic result logging, cProfile profiling and flamegraphs, Dask performance reporting, memory tracking, a Jupyter notebook magic, and a dashboard for visualizing and comparing benchmark results over time.


Installation

git clone https://github.com/lincc-frameworks/lsdb-benchmarking.git
cd lsdb-benchmarking
pip install -e .
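
The distribution is published on PyPI under the name lf_bench, so released versions can presumably also be installed without cloning:

pip install lf_bench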

Writing Benchmarks

Basic Benchmarks with lbench Fixture

The lbench fixture extends pytest-benchmark with automatic cProfile profiling:

def test_my_function(lbench):
    def benchmark_func():
        return my_function()

    lbench(benchmark_func)

Add @pytest.mark.lbench_memory to also track peak memory usage with memray:

@pytest.mark.lbench_memory
def test_my_function(lbench):
    lbench(my_function)

Dask Benchmarks with lbench_dask Fixture

def test_my_dask_function(lbench_dask):
    def benchmark_func():
        return my_dask_dataframe.compute()

    lbench_dask(benchmark_func)

The lbench_dask fixture automatically collects Dask task stream information, generates a Dask performance report, and samples memory usage during execution. Use lbench_dask_collection to also record the Dask graph size and node count:

def test_collection(lbench_dask_collection):
    import lsdb

    catalog = lsdb.read_hats(...)
    lbench_dask_collection(catalog)

Running Benchmarks

pytest --lbench benchmarks/

This creates a timestamped result directory, runs all benchmarks, and saves:

  • pytest-benchmark.json — timing stats and extra metrics
  • cprofile_*.prof — cProfile data for each benchmark
  • dask_performance_report_*.html — Dask performance reports (Dask benchmarks only)
  • memray_*.bin — memory tracking data (memory-marked benchmarks only)
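
A typical run directory might look like the sketch below; the timestamp format and benchmark names are illustrative, not prescribed by lbench:

lbench_results/
└── 2025-01-15_10-30-00/
    ├── pytest-benchmark.json
    ├── cprofile_test_my_function.prof
    ├── dask_performance_report_test_my_dask_function.html
    └── memray_test_my_function.bin

The .prof files are standard cProfile output, so in addition to the dashboard's flamegraphs they can be inspected with Python's built-in pstats module:

import pstats

# File name is illustrative; pick one from your own run directory.
stats = pstats.Stats("cprofile_test_my_function.prof")
# Show the ten most expensive calls by cumulative time.
stats.sort_stats("cumulative").print_stats(10)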

Configuring the Result Directory

# Via flag
pytest --lbench --lbench-root=/path/to/results benchmarks/

# Via environment variable
export LBENCH_ROOT=/path/to/results
pytest --lbench benchmarks/

If neither is set, results are saved to ./lbench_results.
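
Since LBENCH_ROOT is an ordinary environment variable, standard shell syntax also lets you set it for a single run:

LBENCH_ROOT=/path/to/results pytest --lbench benchmarks/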

Running Benchmarks in Jupyter Notebooks

lbench provides a %%lbench cell magic that produces the same JSON log format as the pytest runner, so notebook results appear alongside pytest results in the dashboard.

Load the extension once per notebook:

%load_ext lbench.notebook

Then use the cell magic at the top of any cell:

%%lbench
my_expensive_function()

With options:

%%lbench --rounds 10 --warmup --memory --profile --name my_benchmark
my_expensive_function()

Available options:

Option            Short   Description
--rounds N        -r      Number of timed rounds (default: 5)
--warmup          -w      Run one untimed warmup round first
--memory          -m      Track peak memory with memray
--profile         -p      Capture a cProfile .prof file
--dask            -d      Collect Dask metrics (task stream, memory, performance report)
--collection VAR          Also record graph size/length from a Dask collection variable
--name NAME       -n      Name for this benchmark entry
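
The short flags should compose the same way as the long ones; for example, the invocation above can presumably be shortened to:

%%lbench -r 10 -w -m -p -n my_benchmark
my_expensive_function()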

Dask Benchmarks in Notebooks

%%lbench --dask --rounds 3
my_collection.compute()

# With graph stats from a named variable:
%%lbench --dask --collection src_catalog --name catalog_scan
src_catalog.compute()

Results within a notebook session are accumulated into a single timestamped run directory. Call lbench.notebook.magic.reset_session() to start a fresh run directory mid-session.
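
For example, in a notebook cell (assuming lbench.notebook.magic is importable, as the dotted path above suggests):

from lbench.notebook import magic

# Subsequent %%lbench cells log into a fresh timestamped run directory.
magic.reset_session()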

Viewing Results with the Dashboard

lbench dash
lbench dash --port 8051

Or from a notebook:

from lbench.dashboard.app import run_dashboard

run_dashboard(port=8050)

Calling run_dashboard() again will restart the server with the new settings.
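
For example, a running dashboard can be moved to another port by calling it again (run_dashboard and its port parameter are from the snippet above):

from lbench.dashboard.app import run_dashboard

run_dashboard(port=8050)  # start the dashboard
run_dashboard(port=8051)  # restart the server on the new port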

Dashboard Features

Run browser (sidebar)

  • Lists all runs in chronological order
  • Filter runs by date range with the date picker
  • Rename runs with the pencil icon

Benchmark tables

  • Per-benchmark cards showing timing stats, memory usage, and Dask metrics
  • Links to open flamegraphs (cProfile) and Dask performance reports directly in the browser

Trend plots

  • Click "Plot series" to switch to the trend view
  • Select one or more benchmarks and a metric to plot performance over time
  • Error bars show standard deviation where available
  • Respects the active date filter

Example

import pytest


@pytest.mark.parametrize("size", [1000, 10000, 100000])
def test_dataframe_operation(size, lbench):
    import pandas as pd

    df = pd.DataFrame({'A': range(size), 'B': range(size)})

    def benchmark_func():
        return df['A'] + df['B']

    lbench(benchmark_func)

Run the example with:

pytest --lbench tests/test_dataframe_operation.py
