Efficient repeated evaluation of splines at fixed points.

Project description

cached_interpolate

Efficient evaluation of interpolants at fixed points.

When performing a Bayesian analysis with a stochastic sampler, one sometimes has to evaluate many different interpolants (representing the parameterized model) at the same set of points (the data) with the same knot points.

Evaluating interpolants typically requires two stages:

  1. finding the closest knot of the interpolant to the new point and the distance from that knot.
  2. evaluating the interpolant at that point.

When the knots and evaluation points are identical but the function being approximated changes, the first of these stages is repeated many times unnecessarily. This can be made more efficient by caching the locations of the evaluation points, leaving only the evaluation of the interpolation coefficients to be done at each iteration.
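
To make the split concrete, here is a minimal NumPy sketch of the idea for linear interpolation. This is purely illustrative and is not the package's internal code: the knot lookup is done once and cached, and only the combination with new y values is repeated.

import numpy as np

x_nodes = np.linspace(0, 1, 10)
points = np.random.uniform(0, 1, 10000)

# Stage 1 (cacheable): find the knot to the left of each evaluation point and
# the fractional distance to it. This depends only on x_nodes and points.
idx = np.clip(np.searchsorted(x_nodes, points) - 1, 0, len(x_nodes) - 2)
t = (points - x_nodes[idx]) / (x_nodes[idx + 1] - x_nodes[idx])

# Stage 2 (repeated): combine the cached quantities with each new set of
# y values; for linear interpolation this is a single weighted sum.
for _ in range(3):
    y_nodes = np.random.uniform(-1, 1, 10)
    values = (1 - t) * y_nodes[idx] + t * y_nodes[idx + 1]

Stage 2 is just element-wise array arithmetic, which is where the caching pays off.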

A further advantage of this approach is that it makes it trivial to parallelize the interpolation using JAX or cupy.

This package implements this caching for nearest neighbour, linear, and cubic interpolation.
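
All three methods are selected through the kind argument. As a short sketch: only kind="cubic" appears in the examples below, so the "nearest" and "linear" strings here are my assumed values for the other two methods.

import numpy as np

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
points = np.random.uniform(0, 1, 100)

# "cubic" is used throughout this README; "nearest" and "linear" are assumed
# to be the matching strings for the other two interpolation methods.
for kind in ["nearest", "linear", "cubic"]:
    interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind=kind)
    values = interpolant(points)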

Installation

To install from source, clone the repository and install locally:

$ git clone git@github.com:ColmTalbot/cached_interpolation.git
$ cd cached_interpolation
$ pip install .

Demonstration

We can compare the interpolation to the scipy.interpolate.CubicSpline implementation.

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import CubicSpline

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
evaluation_points = np.sort(np.random.uniform(0, 1, 10000))

interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic")
sp_interpolant = CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")

figure, axes = plt.subplots(nrows=2, gridspec_kw={'height_ratios': [3, 1]}, sharex=True)
axes[0].plot(evaluation_points, interpolant(evaluation_points))
axes[0].plot(evaluation_points, sp_interpolant(evaluation_points))
axes[0].scatter(x_nodes, y_nodes, color="k")
axes[1].plot(evaluation_points, interpolant(evaluation_points) - sp_interpolant(evaluation_points))
axes[0].set_ylabel("$y$")
axes[1].set_xlabel("$x$")
axes[1].set_ylabel("$\\Delta y$")
axes[1].set_xlim(0, 1)
plt.tight_layout()
plt.show()
plt.close(figure)

[Figure: comparison of the cached_interpolate and scipy cubic spline interpolants (upper panel) and their difference (lower panel)]

I note here that we use the "natural" boundary condition. This means that the second derivative of the spline vanishes at the endpoints. This is different from the default "not-a-knot" boundary condition used in scipy.
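
As a quick check of this, here is a small self-contained snippet that evaluates the second derivative of the scipy spline at the two endpoints; with the natural boundary condition the result should be zero up to floating-point error.

import numpy as np
from scipy.interpolate import CubicSpline

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
sp_interpolant = CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")

# With bc_type="natural" the second derivative vanishes at both endpoints,
# so these two values should be numerically zero.
print(sp_interpolant(x_nodes[[0, -1]], 2))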

We can now evaluate this interpolant in a loop to demonstrate the performance of the caching. At every iteration we change the function being interpolated by randomly setting the y argument. Before the loop, we add an initial call to set up the cache as this will be much slower than the following iterations.

import time

import numpy as np
from scipy.interpolate import CubicSpline

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
evaluation_points = np.random.uniform(0, 1, 10000)

interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic")
sp_interpolant = CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")

# initial call to set up the evaluation-point cache
interpolant(x=evaluation_points, y=y_nodes)

n_iterations = 1000

start = time.time()
for _ in range(n_iterations):
    y_nodes = np.random.uniform(-1, 1, 10)
    interpolant(x=evaluation_points, y=y_nodes)
stop = time.time()
print(f"Cached time = {(stop - start):.3f}s for {n_iterations} iterations.")

start = time.time()
for _ in range(n_iterations):
    y_nodes = np.random.uniform(-1, 1, 10)
    CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")(evaluation_points)
stop = time.time()
print(f"Scipy time = {(stop - start):.3f}s for {n_iterations} iterations.")
Cached time = 0.187s for 1000 iterations.
Scipy time = 0.450s for 1000 iterations.

We've gained a factor of ~2.5 over the scipy version without caching. If this is the dominant cost in a simulation that takes a week to run, this is a big improvement.

If we need to evaluate the interpolant at a new set of points, we have to tell it to reset the cache. There are two ways to do this:

  • create a new interpolant; this will require re-evaluating the interpolation coefficients (an example follows the snippet below).
  • disable the evaluation point caching, as demonstrated in the following snippet.
import numpy as np

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
evaluation_points = np.random.uniform(0, 1, 10000)

interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic")
interpolated_values = interpolant(evaluation_points)

new_evaluation_points = np.random.uniform(0, 1, 1000)
interpolant(x=new_evaluation_points, use_cache=False)

Using the code in this way, with the evaluation-point cache disabled, is much slower than scipy and so is not very useful in practice.
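
The first option, constructing a fresh interpolant for the new points, looks like this (continuing from the snippet above; the interpolation coefficients are recomputed and the position cache is set up on the first call):

# Option 1: build a new interpolant for the new evaluation points.
new_interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic")
interpolated_values = new_interpolant(new_evaluation_points)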

If you have access to an Nvidia GPU and are evaluating the spline at O(10^5) or more points, you may want to switch to the cupy backend. This uses cupy only for the evaluation stage, not for computing the interpolation coefficients.

import cupy as cp
import numpy as np

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
evaluation_points = np.random.uniform(0, 1, 10000)
evaluation_points = cp.asarray(evaluation_points)

interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, backend=cp)
interpolated_values = interpolant(evaluation_points)

We can now repeat our timing test. To make the use of a GPU more realistic we'll increase the number of evaluation points.

import time

import cupy as cp
import numpy as np
from scipy.interpolate import CubicSpline

from cached_interpolate import CachingInterpolant

x_nodes = np.linspace(0, 1, 10)
y_nodes = np.random.uniform(-1, 1, 10)
evaluation_points = np.random.uniform(0, 1, 100000)
cp_evaluation_points = cp.asarray(evaluation_points)

interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic")
cp_interpolant = CachingInterpolant(x=x_nodes, y=y_nodes, kind="cubic", backend=cp)
sp_interpolant = CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")

# initial calls to set up the evaluation-point caches
interpolant(x=evaluation_points)
cp_interpolant(x=cp_evaluation_points)

n_iterations = 1000

start = time.time()
for _ in range(n_iterations):
    y_nodes = np.random.uniform(-1, 1, 10)
    interpolant(x=evaluation_points, y=y_nodes)
stop = time.time()
print(f"CPU cached time = {(stop - start):.3f}s for {n_iterations} iterations.")

start = time.time()
for _ in range(n_iterations):
    y_nodes = np.random.uniform(-1, 1, 10)
    CubicSpline(x=x_nodes, y=y_nodes, bc_type="natural")(evaluation_points)
stop = time.time()
print(f"Scipy time = {(stop - start):.3f}s for {n_iterations} iterations.")

start = time.time()
for _ in range(n_iterations):
    y_nodes = np.random.uniform(-1, 1, 10)
    cp_interpolant(x=cp_evaluation_points, y=y_nodes)
stop = time.time()
print(f"GPU cached time = {(stop - start):.3f}s for {n_iterations} iterations.")
CPU cached time = 3.367s for 1000 iterations.
Scipy time = 4.213s for 1000 iterations.
GPU cached time = 0.212s for 1000 iterations.

While there are likely further optimizations to be made and the implementation could be made more flexible, we can see that the GPU version is well over an order of magnitude faster than either of the CPU versions.

If you have any comments or questions, feel free to contact me through the issue tracker or a pull request.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cached_interpolate-0.3.2.tar.gz (223.1 kB)

Uploaded Source

Built Distributions

cached_interpolate-0.3.2-cp312-cp312-win_amd64.whl (273.1 kB)

Uploaded CPython 3.12 Windows x86-64

cached_interpolate-0.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (780.0 kB)

Uploaded CPython 3.12 manylinux: glibc 2.17+ x86-64

cached_interpolate-0.3.2-cp312-cp312-macosx_10_13_universal2.whl (377.1 kB)

Uploaded CPython 3.12 macOS 10.13+ universal2 (ARM64, x86-64)

cached_interpolate-0.3.2-cp311-cp311-win_amd64.whl (273.1 kB)

Uploaded CPython 3.11 Windows x86-64

cached_interpolate-0.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (782.9 kB)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

cached_interpolate-0.3.2-cp311-cp311-macosx_10_9_universal2.whl (374.4 kB)

Uploaded CPython 3.11 macOS 10.9+ universal2 (ARM64, x86-64)

cached_interpolate-0.3.2-cp310-cp310-win_amd64.whl (273.0 kB)

Uploaded CPython 3.10 Windows x86-64

cached_interpolate-0.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (745.5 kB)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

cached_interpolate-0.3.2-cp310-cp310-macosx_10_9_universal2.whl (374.2 kB)

Uploaded CPython 3.10 macOS 10.9+ universal2 (ARM64, x86-64)

File details

Details for the file cached_interpolate-0.3.2.tar.gz.

File metadata

  • Download URL: cached_interpolate-0.3.2.tar.gz
  • Upload date:
  • Size: 223.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for cached_interpolate-0.3.2.tar.gz
Algorithm Hash digest
SHA256 e03d06621554a9aa2b5d2be2c061856e6eb2eafb1d437ec408c198e43d027db1
MD5 d568ab18106521a09995b190d32374be
BLAKE2b-256 2feb680b36dc0bd8519e4cb1b9b5a391228b80484fba50197ad9d837d878ee0e

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 820b356e38bb6b02199a0d0b8133d2a067351026209589a4fe1a6825565168a5
MD5 38d0a144de8bebc3fb3d452c3f3b2272
BLAKE2b-256 c86d8acc3b48cfb1275e2afdf9193955486b531d1f34ecd97c6b02c43d47c7ca

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7aca455d8e01e2cc2a2e2993e839b0e9ad17b92acaf7f2895a3e8f671da498f4
MD5 62e95994c964a3be7b50ffad89120726
BLAKE2b-256 72101cd0a9f19ea627f96b4f3d1e6b911d0fe2d33e5fcd5e9b099de71675743c

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp312-cp312-macosx_10_13_universal2.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp312-cp312-macosx_10_13_universal2.whl
Algorithm Hash digest
SHA256 a6dbb6e6b0c1d33d45e2927de49abddb9527596a17132c483f213e36a6ea7c6b
MD5 f2cfe9571af964cd79c22b6e343c37bd
BLAKE2b-256 835def3eb1eb9dfe944f3309e38d2cded101010a1613e7dc2c82ba1a94d8f5c3

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp311-cp311-win_amd64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 ddb1bd7e21b50f4274f3244327be1d88dffb4925f4aa9a0fd7e9a203146ba8ee
MD5 2711b189dbee58734345ea003c1c8ae6
BLAKE2b-256 0c4717fd37eaa22336ebccd684d74cfbb38a8095cd776f315adb3de0d340bfd5

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bdc0ce58bef4f594deba1b5c1525ebceeb5786cab141a082491befd0ee94f5e8
MD5 519d10a96a2c21dfe4cd41db54787df9
BLAKE2b-256 fdf7dc00773d569edd2c62677b5dcb992a688785354cdbdd4a8e04f91abc2e79

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp311-cp311-macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp311-cp311-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 5b8ec80dfab81948f8106bec25fb65668b1151fa3aa79b8bc047e5b3d0bbb517
MD5 9485bd77d4fc229afbb39b4c42ca4159
BLAKE2b-256 a727dccb4acb1c1c51c1834d1633d4f30c85699aec1e796303fb9f0411ca5619

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 b377ca100304cec48f36be8bd7f9f115ef7a74c577dfef9df894004074844a76
MD5 3de3b01ac06f11bd52191a5a78e39fbb
BLAKE2b-256 e2638ec77cf3b6dfc5d2ae434b0a2c1ded942276189edc86a2f22e7a6c8756f1

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 557a32c7441d8221d1ddbb5749741e24ab4511b3267aebfda02fef106f24a047
MD5 cfe309e8535cbaebdcdfcc3b877e8654
BLAKE2b-256 71b7666e5e70542f1cdd5b4f0e8df25c53e8f4ed1ecf59d8247af49477dd7380

See more details on using hashes here.

File details

Details for the file cached_interpolate-0.3.2-cp310-cp310-macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for cached_interpolate-0.3.2-cp310-cp310-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 0b6d0cc36879ab1fffa8e6baf60f7f4cc79444e803992b0ae986f612d81b61b5
MD5 d07a837b9c90e87ff7326085c17d747b
BLAKE2b-256 a951b0a102694843ed55b0fe8bb0a07a248c8b9c5b8062c1fa5cd5901199c06b

See more details on using hashes here.
