GPU-accelerated spherical Bessel functions for Apple Silicon using MLX

mlx-bessel

GPU-accelerated spherical Bessel functions j_l(x) and j_l'(x) for Apple Silicon, using MLX.

What it does

Evaluates spherical Bessel functions of the first kind and their derivatives on Apple GPU via piecewise Chebyshev interpolation. The table is built once (using a hybrid forward-recurrence + scipy strategy on CPU), then stored as GPU tensors for fast repeated evaluation.
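
To make the idea of piecewise Chebyshev interpolation concrete, here is a minimal CPU-only sketch in plain Python. It is not the package's implementation: the helper names (cheb_fit, cheb_eval, build_table, eval_table) are hypothetical, it uses only j_0(x) = sin(x)/x in closed form, and it evaluates with a Clenshaw loop instead of a GPU matrix multiply. The segment width (80) and node count (64) are the values quoted in the Method section.

```python
import math

def j0(x):
    """Spherical Bessel function j_0(x) = sin(x)/x, with the x -> 0 limit."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def cheb_fit(f, lo, hi, n):
    """Chebyshev coefficients of the n-node interpolant of f on [lo, hi]."""
    # Chebyshev nodes of the first kind: t_k = cos(theta_k) on [-1, 1]
    theta = [math.pi * (k + 0.5) / n for k in range(n)]
    vals = [f(0.5 * (lo + hi) + 0.5 * (hi - lo) * math.cos(t)) for t in theta]
    # Discrete cosine transform, written as an explicit sum for clarity
    coeffs = [(2.0 / n) * sum(v * math.cos(j * t) for v, t in zip(vals, theta))
              for j in range(n)]
    coeffs[0] *= 0.5
    return coeffs

def cheb_eval(coeffs, lo, hi, x):
    """Evaluate sum_j c_j T_j(t) at x using the Clenshaw recurrence."""
    t = (2.0 * x - lo - hi) / (hi - lo)   # map [lo, hi] to [-1, 1]
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * t * b1 - b2, b1
    return coeffs[0] + t * b1 - b2

def build_table(f, x_max, width=80.0, n_nodes=64):
    """One coefficient list per segment of [0, x_max]."""
    n_seg = math.ceil(x_max / width)
    return [cheb_fit(f, i * width, (i + 1) * width, n_nodes) for i in range(n_seg)]

def eval_table(table, x, width=80.0):
    """Segment lookup followed by Chebyshev evaluation."""
    i = min(int(x // width), len(table) - 1)
    return cheb_eval(table[i], i * width, (i + 1) * width, x)

table = build_table(j0, 400.0)
test_points = [0.5 + 0.37 * k for k in range(1000)]
max_err = max(abs(eval_table(table, x) - j0(x)) for x in test_points)
print(f"segments: {len(table)}, max abs error: {max_err:.2e}")
```

In double precision the interpolation error of this sketch sits well below the float32 rounding level, which suggests why 64 nodes per width-80 segment is a comfortable choice for a float32 GPU table.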

Installation

pip install mlx-bessel

Requires macOS with Apple Silicon (M1/M2/M3/M4).

Quick start

import numpy as np
from mlx_bessel import BesselTable

ells = np.arange(0, 2001)               # multipole values
table = BesselTable(ells, x_max=5500)    # build table once on CPU (one-time cost)
x = np.linspace(1.0, 5000.0, 10000)     # evaluation points

jl = table.eval_jl(x)                   # shape (2001, 10000), on GPU
jl, jlp = table.eval_jl_jlp(x)          # j_l and j_l' together

Results are returned as mlx.core.array. Convert to numpy with np.array(jl).

Performance

Benchmarked on Apple M1 Max. Median of 5 runs, warm-up excluded.

N_ell  N_x    scipy    GPU eval  Speedup (eval)  Speedup (incl. build)
100    1000   0.05 s   0.001 s   34x             0.8x
100    5000   0.22 s   0.004 s   57x             3.8x
200    5000   1.20 s   0.007 s   174x            3.9x
200    10000  2.35 s   0.013 s   181x            7.7x
500    5000   8.95 s   0.016 s   557x            1.7x
500    10000  17.82 s  0.031 s   567x            3.4x
525    10000  19.63 s  0.034 s   579x            3.1x

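The table build is a one-time cost (~0.05 s for 100 ells, ~5 s for 500 ells). Once the table is built, evaluation at any new x-array runs entirely on the GPU, and the eval-only speedup over scipy grows from 34x for the smallest benchmark to over 500x for the largest. The "incl. build" column charges the full build time to a single evaluation, so it understates the benefit when the same table is reused.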
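
The effect of reusing a table can be estimated with a little arithmetic. The sketch below uses the 500 x 10000 row of the benchmark table and assumes a build time of exactly 5 s (the text only says "~5 s"), so its numbers match the table only roughly. The function name amortized_speedup is hypothetical.

```python
def amortized_speedup(t_scipy, t_eval, t_build, n_calls):
    """Speedup over scipy when one table build is shared by n_calls evaluations."""
    return (n_calls * t_scipy) / (t_build + n_calls * t_eval)

# Timings in seconds: scipy and GPU eval from the 500 x 10000 benchmark row;
# the build time is an assumed round figure.
T_SCIPY, T_EVAL, T_BUILD = 17.82, 0.031, 5.0

single = amortized_speedup(T_SCIPY, T_EVAL, T_BUILD, 1)
hundred = amortized_speedup(T_SCIPY, T_EVAL, T_BUILD, 100)
print(f"1 call: {single:.1f}x, 100 calls: {hundred:.1f}x")
```

Under these assumptions a single evaluation gains about 3.5x, while 100 evaluations on the same table gain about 220x, approaching the eval-only figure as the number of calls grows.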

Accuracy

Tested against scipy.special.spherical_jn across l = 0..2000, x = 0.5..5000 (155 sampled ells, 5000 x-points):

Metric                               Value
Max absolute error                   6.3e-07
Median absolute error                1.1e-08
Max relative error (|j_l| > 1e-5)    1.1e-02
Median relative error                5.1e-05
P99 relative error                   2.4e-03

Float32 GPU precision limits relative accuracy near the zero-crossings of j_l, where |j_l| is tiny and even a small absolute error becomes a large relative one. For physically relevant values (|j_l| > 1e-5), the relative error is below 1.2%.
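
The following plain-Python sketch illustrates why a magnitude mask such as |j_l| > 1e-5 is needed when reporting relative error. It is an illustration only, not the package's test code: it emulates float32 by rounding both the argument and the result of j_0(x) = sin(x)/x through the struct module, and the helper to_f32 is a hypothetical name.

```python
import math
import struct

def to_f32(v):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def j0(x):
    return math.sin(x) / x

# Large arguments, where float32 can no longer resolve x finely
xs = [4000.0 + 0.01 * k for k in range(100001)]
exact = [j0(x) for x in xs]
approx = [to_f32(j0(to_f32(x))) for x in xs]

abs_err = [abs(a - e) for a, e in zip(approx, exact)]
max_abs = max(abs_err)
# Relative error over all points, and only where |j_0| exceeds the mask
rel_all = max(d / abs(e) for d, e in zip(abs_err, exact) if e != 0.0)
rel_masked = max(d / abs(e) for d, e in zip(abs_err, exact) if abs(e) > 1e-5)

print(f"max abs: {max_abs:.1e}  max rel (all): {rel_all:.1e}  "
      f"max rel (|j_0| > 1e-5): {rel_masked:.1e}")
```

The absolute error stays small everywhere, but the unmasked relative error is dominated by the few grid points that land next to a zero of j_0, which is why the masked figure is the more meaningful one.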

Method

  1. Piecewise segments: [0, x_max] is divided into segments of width ~80.
  2. Chebyshev nodes: 64 Chebyshev nodes per segment.
  3. Hybrid table build (CPU):
    • Forward recurrence for the stable regime (x > 1.5l)
    • scipy for the transition zone (x ~ l, ~14% of node pairs)
    • Zero for the evanescent regime (x << l)
  4. DCT to coefficients: Discrete cosine transform converts node values to Chebyshev expansion coefficients.
  5. GPU evaluation: Segment lookup + Chebyshev basis matrix multiply, fully vectorized on GPU.
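
As an illustration of the forward-recurrence part of step 3, here is a minimal plain-Python sketch, assuming the standard three-term recurrence j_{l+1}(x) = (2l+1)/x * j_l(x) - j_{l-1}(x) and the derivative identity j_l'(x) = j_{l-1}(x) - (l+1)/x * j_l(x). The function names are hypothetical and this is not the package's code; it is only meant for the stable regime quoted above (x > 1.5l).

```python
import math

def spherical_jn_forward(l_max, x):
    """Return [j_0(x), ..., j_{l_max}(x)] by upward recurrence (use for x > 1.5*l_max)."""
    j0 = math.sin(x) / x
    if l_max == 0:
        return [j0]
    j1 = math.sin(x) / x**2 - math.cos(x) / x
    out = [j0, j1]
    for l in range(1, l_max):
        # j_{l+1} = (2l+1)/x * j_l - j_{l-1}
        out.append((2 * l + 1) / x * out[l] - out[l - 1])
    return out

def spherical_jn_derivatives(jl, x):
    """Derivatives j_l'(x) computed from a list of j_l(x) values at the same x."""
    # j_0' = -j_1; for l >= 1, j_l' = j_{l-1} - (l+1)/x * j_l
    d = [-jl[1]] if len(jl) > 1 else [math.cos(x) / x - math.sin(x) / x**2]
    for l in range(1, len(jl)):
        d.append(jl[l - 1] - (l + 1) / x * jl[l])
    return d

x = 30.0
jl = spherical_jn_forward(10, x)          # 1.5 * 10 = 15 < 30, stable regime
jlp = spherical_jn_derivatives(jl, x)

# Closed form for j_2, used as an independent check
j2_closed = (3 / x**3 - 1 / x) * math.sin(x) - 3 / x**2 * math.cos(x)
print(f"j_2(30) recurrence: {jl[2]:.12f}  closed form: {j2_closed:.12f}")
```

Upward recurrence loses accuracy once l grows well past x, which is consistent with the method handing the transition zone to scipy and treating the evanescent regime as zero.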

Running benchmarks

python -m mlx_bessel.benchmark

Running tests

pip install pytest scipy
pytest tests/ -v

Author

Sheng-Kai Huang (akai@fawstudio.com)

License

MIT
