
mlx-qre

Quantum Relative Entropy + Petz Recovery toolkit on Apple Silicon via MLX.

$$\Sigma = D(\rho \,\|\, \sigma) = \mathrm{Tr}[\rho(\ln\rho - \ln\sigma)]$$

A complete, self-contained library for quantum information quantities (QRE, von Neumann entropy, Rényi divergence, JSD), quantum channels (thermal attenuator, depolarizing, dephasing), and Petz recovery analysis (recovery map, fidelity, retrodiction). Built as the computational companion to the Σ = 2 ln Q entropy-production framework (see petz-recovery-unification and tau-chrono).

Performance note. Eigendecomposition runs on the Metal GPU via MLX. For small matrices (N < 500), NumPy + Accelerate (CPU) is typically faster due to lower dispatch overhead — use NumPy if you only need a handful of small QREs. The GPU path becomes useful for batched evaluation and N ≥ 500, where it pulls ahead of NumPy. See benchmark_results.md for the full breakdown.

Installation

pip install -e .

Requires Python 3.10+ and Apple Silicon (M1/M2/M3/M4).

Quick Start

import mlx.core as mx
from mlx_qre import quantum_relative_entropy, random_density_matrix

# Two 100x100 density matrices on GPU
rho = random_density_matrix(100)
sigma = random_density_matrix(100)

# Compute D(rho || sigma) — eigendecomposition runs on Metal GPU
D = quantum_relative_entropy(rho, sigma)
mx.eval(D)
print(f"D(rho || sigma) = {D.item():.6f}")

# Batched: 500 pairs simultaneously
rho_batch = random_density_matrix(50, batch_size=500)
sigma_batch = random_density_matrix(50, batch_size=500)
D_batch = quantum_relative_entropy(rho_batch, sigma_batch)
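
If you only need a handful of small QREs on CPU (see the performance note above), the exact spectral formula D(rho || sigma) = Tr[rho (ln rho - ln sigma)] is a few lines of plain NumPy. This reference sketch is independent of mlx-qre; the helper names (`quantum_relative_entropy_np`, `random_density_matrix_np`) are illustrative, not the package's API:

```python
import numpy as np

def random_density_matrix_np(n, seed=0):
    """Random density matrix: normalized A A^dagger (Hermitian, PSD, trace 1)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = A @ A.conj().T
    return rho / np.trace(rho).real

def quantum_relative_entropy_np(rho, sigma, eps=1e-12):
    """D(rho || sigma) = Tr[rho (ln rho - ln sigma)] via two eigendecompositions."""
    w_r, V_r = np.linalg.eigh(rho)
    w_s, V_s = np.linalg.eigh(sigma)
    # Clip tiny negative eigenvalues from round-off before taking logs.
    log_rho = (V_r * np.log(np.clip(w_r, eps, None))) @ V_r.conj().T
    log_sigma = (V_s * np.log(np.clip(w_s, eps, None))) @ V_s.conj().T
    return float(np.trace(rho @ (log_rho - log_sigma)).real)

rho_np = random_density_matrix_np(50, seed=1)
sigma_np = random_density_matrix_np(50, seed=2)
D_np = quantum_relative_entropy_np(rho_np, sigma_np)  # >= 0 by Klein's inequality
```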

Stochastic Lanczos backend (large N)

For N >= 1000 the O(N^3) eigendecomposition is the bottleneck, and the Metal eigh kernel can hit GPU command-buffer timeouts around N = 2000. mlx-qre therefore ships a Stochastic Lanczos Quadrature (SLQ) estimator that replaces the full eigendecomposition with O(k * m * N^2) matvecs.

This implementation runs entirely on MLX (no NumPy fallback in the hot path). All m probe vectors are stacked into a single (N, m) matrix, so each Lanczos step is one block matmul A @ V; this amortises GPU dispatch overhead and lets MLX tile the m probes across the GPU. Only the small (k, k) tridiagonal eigh is delegated to MLX's CPU eigh stream (k <= 30, a size where the CPU is the right choice).
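
The SLQ recipe can be sketched from scratch in plain NumPy. This is an illustrative per-probe reference (the library's hot path blocks all m probes into one matmul on MLX), not the package's implementation:

```python
import numpy as np

def lanczos_tridiag(A, v, k):
    """k-step Lanczos for Hermitian A from unit start vector v -> (alpha, beta)."""
    n = A.shape[0]
    V = np.zeros((n, k), dtype=A.dtype)
    V[:, 0] = v
    alpha, beta = [], []
    w = A @ v
    a = np.vdot(v, w).real
    alpha.append(a)
    w = w - a * v
    for j in range(1, k):
        b = np.linalg.norm(w)
        if b < 1e-12:          # invariant subspace found; stop early
            break
        beta.append(b)
        V[:, j] = w / b
        w = A @ V[:, j] - b * V[:, j - 1]
        a = np.vdot(V[:, j], w).real
        alpha.append(a)
        w = w - a * V[:, j]
        # Full reorthogonalization: cheap at small k, keeps the recurrence stable.
        w = w - V[:, : j + 1] @ (V[:, : j + 1].conj().T @ w)
    return np.array(alpha), np.array(beta)

def slq_tr_xlogx(A, k=25, m=20, seed=0, eps=1e-12):
    """Stochastic Lanczos Quadrature estimate of Tr[A ln A] for Hermitian PSD A."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(m):
        v = rng.normal(size=n) + 1j * rng.normal(size=n)
        v = (v / np.linalg.norm(v)).astype(A.dtype)
        alpha, beta = lanczos_tridiag(A, v, k)
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        theta, U = np.linalg.eigh(T)          # tiny (k, k) problem; CPU is fine
        theta = np.clip(theta, eps, None)
        tau2 = np.abs(U[0, :]) ** 2           # Gauss quadrature weights
        total += n * np.sum(tau2 * theta * np.log(theta))
    return total / m

# Sanity check against the exact spectral formula on a small density matrix.
rng = np.random.default_rng(3)
G = rng.normal(size=(60, 60)) + 1j * rng.normal(size=(60, 60))
rho_test = G @ G.conj().T
rho_test = rho_test / np.trace(rho_test).real
w = np.clip(np.linalg.eigvalsh(rho_test), 1e-12, None)
S_exact = -float(np.sum(w * np.log(w)))
S_slq = -slq_tr_xlogx(rho_test)
```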

import mlx.core as mx
from mlx_qre import (
    random_density_matrix,
    quantum_relative_entropy,
    von_neumann_entropy_lanczos,
    quantum_relative_entropy_lanczos,
)

rho = random_density_matrix(2000)
sigma = random_density_matrix(2000)

# Direct API
S = von_neumann_entropy_lanczos(rho, k=25, m=20, seed=0)
D = quantum_relative_entropy_lanczos(rho, sigma, k=25, m=20, seed=0)

# Or via the main API with method="lanczos"
D_mx = quantum_relative_entropy(rho, sigma, method="lanczos", k=25, m=20, seed=0)

Typical accuracy at default k=25, m=20:

  • S(rho): ~1-2% median relative error (clean SLQ on a single matrix)
  • D(rho || sigma): ~3-7% median relative error (cross-term has higher variance because Tr[rho ln sigma] is not a single-matrix spectral function). Increase m for tighter accuracy.
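
Why increasing m tightens the estimate: the probe average is a Monte Carlo estimator whose error shrinks like 1/sqrt(m). A toy Hutchinson trace estimator (a demo sketch, unrelated to the package internals) makes this visible:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
B = rng.normal(size=(n, n))
A = B @ B.T / n                         # symmetric PSD test matrix
exact = float(np.trace(A))

def hutchinson(A, m, rng):
    """Tr[A] ~ mean over m Rademacher probes of z^T A z."""
    Z = rng.choice([-1.0, 1.0], size=(A.shape[0], m))
    return float(np.mean(np.einsum("im,im->m", Z, A @ Z)))

# Empirical error (std over 200 trials) for few vs. many probes:
# 16x more probes should give a roughly 4x smaller spread.
spread = {m: np.std([hutchinson(A, m, rng) - exact for _ in range(200)])
          for m in (4, 64)}
```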

Speed crossover (M1 Max, dense complex Haar-random density matrices, SLQ at k=25, m=20):

| N    | MLX eigh (ms) | NumPy eigh (ms) | SLQ pure-MLX (ms) |
|------|---------------|-----------------|-------------------|
| 100  | 4             | 4               | 19                |
| 500  | 160           | 287             | 20                |
| 1000 | 1077          | 1988            | 24                |
| 2000 | timeout       | 24048           | 37                |

SLQ wins from N >= 500 and the gap blows up at N >= 1000 (~50-80x vs MLX eigh, ~80-660x vs NumPy eigh). The pure-MLX block matvec hot path is also ~30-140x faster than the previous NumPy-Accelerate hot path on the same machine. See benchmark_results.md for the full breakdown.

Features

| Module    | Function                                           | Description |
|-----------|----------------------------------------------------|-------------|
| qre       | quantum_relative_entropy(rho, sigma, method)       | D(rho ‖ sigma) via Metal eigendecomposition (method="exact") or SLQ (method="lanczos") |
| qre       | von_neumann_entropy(rho)                           | S(rho) = -Tr[rho ln rho] |
| qre       | relative_entropy_pure_state(psi, sigma)            | Efficient D for pure states: -⟨psi‖ln sigma‖psi⟩ |
| lanczos   | von_neumann_entropy_lanczos(rho, k, m)             | S(rho) via Stochastic Lanczos Quadrature |
| lanczos   | quantum_relative_entropy_lanczos(rho, sigma, k, m) | D via Hutchinson + Lanczos |
| lanczos   | stochastic_lanczos_logtr(A, k, m)                  | Tr[A ln A] for any Hermitian PSD A |
| classical | kl_divergence(p, q)                                | Classical KL divergence |
| classical | jensen_shannon_divergence(p, q)                    | Symmetric JSD |
| classical | renyi_divergence(p, q, alpha)                      | Rényi divergence of order alpha |
| channels  | thermal_attenuator(eta)                            | Gravitational channel eta = -g_00 |
| channels  | channel_entropy_production(K, rho, sigma)          | Entropy production Sigma through a channel |
| channels  | depolarizing_channel(p)                            | Depolarizing noise |
| channels  | dephasing_channel(gamma)                           | Dephasing noise |
| petz      | petz_recovery_map(K, sigma)                        | Construct the Petz recovery map R |
| petz      | petz_recovery_fidelity(rho, sigma, K)              | F(rho, (R ∘ N)(rho)) |
| petz      | verify_petz_bound(rho, sigma, K)                   | Check F >= exp(-Sigma/2) |
| petz      | retrodiction_quality(rho, sigma, K)                | tau = 1 - F |
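
The classical quantities in the table have compact NumPy analogues. These sketches mirror the listed signatures but are illustrative, not the package's code:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Classical KL divergence sum p ln(p/q), clipping zeros for safety."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    return float(np.sum(p * np.log(p / q)))

def jensen_shannon_divergence(p, q):
    """Symmetric JSD: average KL to the midpoint distribution."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def renyi_divergence(p, q, alpha, eps=1e-12):
    """Renyi divergence of order alpha (alpha != 1; alpha -> 1 recovers KL)."""
    p = np.clip(np.asarray(p, dtype=float), eps, None)
    q = np.clip(np.asarray(q, dtype=float), eps, None)
    return float(np.log(np.sum(p ** alpha * q ** (1.0 - alpha))) / (alpha - 1.0))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
```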

Use Cases

  • Gravitational entropy production: Sigma_grav = D(N_eta(rho) || N_eta(sigma)) with eta = 1/Q^2
  • Quantum channel analysis: entropy production, data processing inequality
  • Petz recovery bounds: F >= exp(-Sigma/2), retrodiction quality
  • Quantum ML: kernel methods using QRE as a distance measure
  • Neural entropy: EEG/neural signal entropy production analysis
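
The Petz pipeline can be exercised end to end with a small NumPy sketch: a depolarizing channel (self-adjoint, so N serves as its own adjoint), the textbook Petz map R(X) = sigma^{1/2} N†(N(sigma)^{-1/2} X N(sigma)^{-1/2}) sigma^{1/2}, and Uhlmann fidelity. All names here are illustrative, not the library's API:

```python
import numpy as np

def _funcm(H, f, eps=1e-12):
    """Apply a scalar function to a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * f(np.clip(w, eps, None))) @ V.conj().T

def rand_rho(d, seed):
    """Random full-rank density matrix."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    R = G @ G.conj().T
    return R / np.trace(R).real

def depolarize(X, p=0.3):
    """Depolarizing channel N(X) = (1-p) X + p Tr[X] I/d; self-adjoint, so N† = N."""
    d = X.shape[0]
    return (1 - p) * X + p * np.trace(X) * np.eye(d) / d

def petz_recover(X, sigma, channel, channel_adj):
    """Petz map R(X) = s^{1/2} N†(N(s)^{-1/2} X N(s)^{-1/2}) s^{1/2} with s = sigma."""
    s_half = _funcm(sigma, np.sqrt)
    ns_ih = _funcm(channel(sigma), lambda w: 1.0 / np.sqrt(w))
    return s_half @ channel_adj(ns_ih @ X @ ns_ih) @ s_half

def fidelity(rho, tau):
    """Uhlmann fidelity F = (Tr sqrt(rho^{1/2} tau rho^{1/2}))^2."""
    r_half = _funcm(rho, np.sqrt)
    w = np.clip(np.linalg.eigvalsh(r_half @ tau @ r_half), 0.0, None)
    return float(np.sum(np.sqrt(w)) ** 2)

def qre(rho, sigma):
    """D(rho || sigma) via exact matrix logarithms."""
    return float(np.trace(rho @ (_funcm(rho, np.log) - _funcm(sigma, np.log))).real)

rho, sigma = rand_rho(4, 1), rand_rho(4, 2)
recovered = petz_recover(depolarize(rho), sigma, depolarize, depolarize)
Sigma = qre(rho, sigma) - qre(depolarize(rho), depolarize(sigma))  # >= 0 (DPI)
F = fidelity(rho, recovered)
print(F, np.exp(-Sigma / 2))   # compare against the claimed bound F >= exp(-Sigma/2)
```

The exact identity R(N(sigma)) = sigma is a useful unit test for any Petz implementation; F and exp(-Sigma/2) can then be compared numerically against the bound.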

Benchmark

python -m mlx_qre.benchmark

Compares MLX (Apple Silicon GPU) vs NumPy (CPU) across matrix sizes N = 10 to 1000. Summary on M1 Max:

| N    | MLX (ms) | NumPy (ms) | MLX vs NumPy       |
|------|----------|------------|--------------------|
| 10   | 0.74     | 0.04       | 0.06× (NumPy wins) |
| 100  | 3.95     | 3.57       | 0.90×              |
| 500  | 158      | 278        | 1.76×              |
| 1000 | 1042     | 2010       | 1.93×              |

For batched QRE at N=100 the GPU sustains ~460 pairs/sec. Use NumPy for one-off small problems; reach for mlx-qre for batched / large-N work and for the integrated Petz / channel utilities above. See benchmark_results.md for the full table.

Tests

pip install -e ".[dev]"
pytest tests/ -v

Theory

The quantum relative entropy D(rho || sigma) is the quantum generalization of KL divergence. In the retrocausality framework:

  • Sigma = 2 ln Q: unified entropy production formula
  • Petz bound: F >= exp(-Sigma/2) quantifies retrodiction cost
  • tau = 1 - F: retrodiction deficit (0 = perfect, 1 = irreversible)
  • Zero-entropy limit: Sigma -> 0 implies perfect retrodiction (no time arrow)
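
Assuming the framework's identification Σ = 2 ln Q holds, the Petz bound caps the retrodiction deficit directly:

$$F \;\ge\; e^{-\Sigma/2} \;=\; e^{-\ln Q} \;=\; \frac{1}{Q} \quad\Longrightarrow\quad \tau = 1 - F \;\le\; 1 - \frac{1}{Q},$$

so Q → 1 (Σ → 0) forces τ → 0, recovering the zero-entropy limit above.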

License

MIT License. Copyright (c) 2026 Sheng-Kai Huang.
