Rewrite chained expressions on xarray datasets to improve performance

XREXPR: Xarray Expression Rewriter

Imagine you have an xarray dataset that you want to do some analysis on. You might write something like this:

%%timeit 
ds.mean(dim="lat").mean(dim="lon").isel(time=0).compute()

193 ms ± 49.6 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)

However, it would be a lot faster if you instead wrote:

ds.isel(time=0).mean(dim="lat").mean(dim="lon").compute()

925 μs ± 401 μs per loop (mean ± std. dev. of 5 runs, 5 loops each)

In this instance, just reordering the operations makes a ~200x performance difference (193 ms versus 925 μs). The two expressions are equivalent, as the check below confirms, but unfortunately xarray can't automatically reorder them for us (yet?).

from xarray.testing import assert_equal
assert_equal(
    ds.isel(time=0).mean(dim="lat").mean(dim="lon"),
    ds.mean(dim="lat").mean(dim="lon").isel(time=0),
)

# Does not raise an AssertionError
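
The speed-up comes from doing less work: selecting a single time step first means the two means only ever see one (lat, lon) slice. As a rough illustration, the sketch below uses plain nested lists as a stand-in for one data variable and counts how many values each ordering reads. It is not xarray, and the array sizes and the mean_lat_lon helper are made up for the example.

```python
from statistics import fmean

# Hypothetical stand-in for one data variable with dims (time, lat, lon).
N_TIME, N_LAT, N_LON = 50, 6, 8
data = [
    [[float(t * 100 + la * 10 + lo) for lo in range(N_LON)] for la in range(N_LAT)]
    for t in range(N_TIME)
]

# How many values each ordering has to read.
reads = {"reduce_first": 0, "select_first": 0}


def mean_lat_lon(field, key):
    """Mean over lat, then over lon, of one (lat, lon) slice; counts values read."""
    n_lat, n_lon = len(field), len(field[0])
    reads[key] += n_lat * n_lon
    lat_means = [fmean(field[la][lo] for la in range(n_lat)) for lo in range(n_lon)]
    return fmean(lat_means)


# Like ds.mean(dim="lat").mean(dim="lon").isel(time=0):
# reduce every time step, then keep only the first result.
reduce_first = [mean_lat_lon(field, "reduce_first") for field in data][0]

# Like ds.isel(time=0).mean(dim="lat").mean(dim="lon"):
# keep only the first time step, then reduce it.
select_first = mean_lat_lon(data[0], "select_first")

print(reduce_first == select_first)  # same answer either way
print(reads)                         # but very different amounts of work
```

In this toy version the amount of work differs by exactly the length of the time dimension. Real timings depend on much more than that (chunking, I/O, overhead per call), which is why the measured ratio above is not simply the number of time steps.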

That's where xrexpr comes in. It takes a function of the form

def func(ds: xr.Dataset) -> xr.Dataset:
    return ds.operation1().operation2()...

and reorders the operations (hopefully safely 🤞) to optimize the performance of the expression.
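
How xrexpr does this internally is not described here, so the following is only a sketch of the general idea, not the package's implementation. It uses the standard-library ast module (Python 3.9+ for ast.unparse) to split a method chain into its calls and move an isel in front of any mean that acts on a different dimension. The helper names split_chain, dims_of, can_swap and hoist_isel are hypothetical, and the sketch deliberately refuses to touch anything it cannot read statically.

```python
import ast


def split_chain(node):
    """Split `ds.a(...).b(...)` into its root expression and a list of calls."""
    calls = []
    while isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
        calls.append(node)
        node = node.func.value
    calls.reverse()  # restore source order: the first call applied comes first
    return node, calls


def dims_of(call):
    """Dimension names a call touches, or None if they cannot be read statically."""
    if call.args or any(kw.arg is None for kw in call.keywords):
        return None  # positional arguments or **kwargs: too dynamic for this sketch
    if call.func.attr == "isel":
        # isel(time=0) -> {"time"}; every keyword is treated as a dimension,
        # which is conservative for options such as drop=True.
        return {kw.arg for kw in call.keywords}
    if call.func.attr == "mean" and len(call.keywords) == 1:
        kw = call.keywords[0]
        value = kw.value
        if kw.arg == "dim" and isinstance(value, ast.Constant) and isinstance(value.value, str):
            return {value.value}  # mean(dim="lat") -> {"lat"}
    return None


def can_swap(prev, cur):
    """True if `cur` (an isel) may move in front of `prev` (a mean)."""
    if prev.func.attr != "mean" or cur.func.attr != "isel":
        return False
    prev_dims, cur_dims = dims_of(prev), dims_of(cur)
    return prev_dims is not None and cur_dims is not None and not (prev_dims & cur_dims)


def hoist_isel(source):
    """Return `source` with isel calls moved ahead of means on other dimensions."""
    root, calls = split_chain(ast.parse(source, mode="eval").body)
    # Insertion-sort style pass: bubble each call leftwards while it is allowed.
    for i in range(len(calls)):
        j = i
        while j > 0 and can_swap(calls[j - 1], calls[j]):
            calls[j - 1], calls[j] = calls[j], calls[j - 1]
            j -= 1
    # Re-link the calls into a chain in their new order.
    expr = root
    for call in calls:
        call.func.value = expr
        expr = call
    return ast.unparse(expr)


print(hoist_isel('ds.mean(dim="lat").mean(dim="lon").isel(time=0)'))
print(hoist_isel('ds.mean(dim="time").isel(time=0)'))  # same dimension: left alone
```

The dimension check is what keeps the rewrite safe in this sketch: taking the mean over time and then selecting along time is not the same as selecting first, so that chain is returned unchanged.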

>>> import xarray as xr
>>> from xrexpr import peek_rewritten_expr, rewrite_expr

>>> def slow_func(ds: xr.Dataset) -> xr.Dataset:
...     return ds.mean(dim="lat").mean(dim="lon").isel(time=0)

>>> peek_rewritten_expr(slow_func)
"""
def func(ds: xr.Dataset) -> xr.Dataset:
    return ds.isel(time=0).mean(dim="lat").mean(dim="lon")
"""
%%timeit
func(ds)

925 μs ± 401 μs per loop (mean ± std. dev. of 5 runs, 5 loops each)

%%timeit
rewritten_func = rewrite_expr(slow_func)
rewritten_func(ds)

2.43 ms ± 546 μs per loop (mean ± std. dev. of 5 runs, 5 loops each)

(Note that in the above example, we are also timing the rewriting process itself. If we rewrite once up front and time only the call, the performance is even better: around 900 μs, the same as the fast case.)

%%timeit
rewritten_func(ds)

795 μs ± 299 μs per loop (mean ± std. dev. of 5 runs, 5 loops each)

That's it! Now you can use rewritten_func as you normally would, with the operations already reordered to optimize performance.


This package is just making its way out of the proof-of-concept stage, so expect some issues. It is also unlikely to support the full range of xarray operations for some time. If it doesn't do anything for you, please open an issue!


