JAX gyrokinetic solver with Hermite-Laguerre velocity space

SPECTRAX-GK

License: MIT

SPECTRAX-GK is a JAX-native gyrokinetic solver designed for differentiability, accelerator-ready execution, and stellarator-optimization research workflows. The code employs a Hermite-Laguerre velocity space, Fourier perpendicular coordinates, and field-aligned flux-tube geometry to simulate linear and nonlinear electrostatic and electromagnetic turbulence in magnetized plasmas. The validated release claim is narrower than the full feature surface; use the claim-scope ledger below before citing benchmark, quasilinear, autodiff, refactor, or manuscript results.

Installation

pip install spectraxgk

or install the development checkout directly:

git clone https://github.com/uwplasma/SPECTRAX-GK
cd SPECTRAX-GK
pip install -e .

Quickstart (Executable)

# Run the built-in default example.
spectraxgk

# The hyphenated entry point works too.
spectrax-gk

# Run directly from a checked-in TOML.
spectraxgk examples/linear/axisymmetric/cyclone.toml

# Compute linear quasilinear transport weights and write JSON/CSV artifacts.
spectraxgk run-runtime-linear \
  --config examples/linear/axisymmetric/runtime_cyclone_quasilinear.toml \
  --out tools_out/cyclone_quasilinear

# Write a restartable nonlinear NetCDF bundle.
spectraxgk run-runtime-nonlinear \
  --config examples/nonlinear/axisymmetric/runtime_cyclone_nonlinear.toml \
  --steps 200 \
  --out tools_out/cyclone_release.out.nc

# Replace the VMEC equilibrium used by a VMEC-backed TOML (model = "vmec").
spectrax-gk run \
  --config examples/nonlinear/non-axisymmetric/runtime_hsx_nonlinear_vmec_geometry.toml \
  --vmec-file /path/to/wout_HSX_QHS_vacuum_ns201.nc \
  --out tools_out/hsx_run

# Replace the imported EIK/NetCDF geometry used by an imported-geometry TOML.
# --geometry-file does not switch model="vmec" into imported-geometry mode;
# the TOML should already use model="vmec-eik", "gx-eik", or "gx-netcdf".
spectrax-gk run \
  --config examples/nonlinear/non-axisymmetric/runtime_w7x_nonlinear_imported_geometry.toml \
  --geometry-file /path/to/w7x_adiabatic_electrons.eik.nc \
  --out tools_out/w7x_run

# Turn any saved runtime bundle into a polished figure.
spectraxgk --plot tools_out/cyclone_release.out.nc
spectraxgk --plot tools_out/spectraxgk_default_linear.summary.json

Running spectraxgk with no TOML starts the default Cyclone linear example (equivalent to the checked-in examples/linear/axisymmetric/cyclone.toml), prints the fitted growth rate and frequency to the terminal, and writes a two-panel figure to tools_out/spectraxgk_default_linear.png. The left panel shows the linear $|\phi|^2$ history on a log scale with the fitted $(\gamma, \omega)$ annotation. The right panel shows the real and imaginary parts of the normalized eigenfunction.
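As a rough illustration of what a fitted growth rate and frequency mean for such a history, the sketch below fits a synthetic single-mode signal with NumPy. It is not the fit used by the executable: the function name fit_growth_and_frequency, the synthetic signal, and the sign convention phi(t) ~ exp((gamma - i*omega) t) are all assumptions made for this example.

```python
import numpy as np


def fit_growth_and_frequency(t, phi):
    """Fit (gamma, omega) assuming phi(t) ~ exp((gamma - 1j * omega) * t)."""
    t = np.asarray(t, dtype=float)
    phi = np.asarray(phi, dtype=complex)
    # log|phi|^2 grows as 2 * gamma * t, so the growth rate is half the slope.
    slope_amp, _ = np.polyfit(t, np.log(np.abs(phi) ** 2), 1)
    # Under the assumed sign convention the unwrapped phase advances as -omega * t.
    slope_phase, _ = np.polyfit(t, np.unwrap(np.angle(phi)), 1)
    return 0.5 * slope_amp, -slope_phase


# Synthetic mode with known gamma = 0.1 and omega = 0.3 (illustrative values only).
t = np.linspace(0.0, 50.0, 501)
phi = 1.0e-3 * np.exp((0.1 - 0.3j) * t)
gamma, omega = fit_growth_and_frequency(t, phi)
print(f"gamma = {gamma:.4f}, omega = {omega:.4f}")
```

On real output the fit window would normally exclude the initial transient, which this sketch does not attempt to detect.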

When progress output is enabled, the executable prints live status lines with step/time progress, wall elapsed time, and an estimated wall-clock time remaining. Adaptive nonlinear runs also emit chunk-level elapsed/ETA updates.

The --plot mode reads saved runtime artifacts directly:

  • linear bundles: *.summary.json + *.timeseries.csv + *.eigenfunction.csv
  • nonlinear bundles: *.summary.json + *.diagnostics.csv or *.out.nc

Linear plots reproduce the two-panel growth/eigenfunction layout. Nonlinear plots produce a three-panel diagnostic view with field amplitude/energy, resolved diagnostics, and heat flux.
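The bundle layouts listed above can be checked before plotting. The following is a minimal sketch that uses only the file suffixes named in that list; classify_bundle is a hypothetical helper for illustration, not part of the spectraxgk API, and the real --plot mode may apply different rules.

```python
from pathlib import Path

# Sidecar suffixes taken from the bundle list above.
LINEAR_SIDECARS = (".timeseries.csv", ".eigenfunction.csv")
NONLINEAR_SIDECARS = (".diagnostics.csv", ".out.nc")


def classify_bundle(summary_path):
    """Guess the bundle kind from the files next to a *.summary.json (hypothetical helper)."""
    summary = Path(summary_path)
    if not summary.name.endswith(".summary.json"):
        raise ValueError(f"expected a *.summary.json path, got {summary}")
    prefix = summary.name[: -len(".summary.json")]
    present = {
        suffix
        for suffix in LINEAR_SIDECARS + NONLINEAR_SIDECARS
        if (summary.parent / (prefix + suffix)).exists()
    }
    # A linear bundle needs both CSV sidecars; a nonlinear one needs either artifact.
    if all(suffix in present for suffix in LINEAR_SIDECARS):
        return "linear"
    if any(suffix in present for suffix in NONLINEAR_SIDECARS):
        return "nonlinear"
    return "incomplete"
```

For example, classify_bundle("tools_out/spectraxgk_default_linear.summary.json") returns "linear" only if both CSV sidecars sit next to the summary file.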

Highlights

  • Differentiable JAX-native kernels for gradient-based optimization and sensitivity analysis.
  • Hermite-Laguerre spectral velocity basis providing efficient kinetic closures and multi-fidelity modeling.
  • Accelerator-ready execution on CPUs and GPUs with JIT compilation.
  • Flexible geometry interface supporting analytic s-alpha, Miller, and direct VMEC equilibrium imports.
  • Electromagnetic field-channel support including $(\phi, A_\parallel, B_\parallel)$ fluctuations, with validation claims limited to tracked release lanes.
  • Multi-species support with kinetic electrons and advanced collision operators.
  • Quasilinear transport diagnostics from linear states, with explicit saturation-rule metadata and electrostatic channel validation gates.
  • Automated benchmark workflows for reproducible validation and regression tracking.
  • Modular runtime/refactor surfaces with focused tests for restart artifacts, diagnostics, validation gates, and public API boundaries.

Current claim scope

The current release surface is deliberately scoped:

  • Linear and nonlinear benchmark claims are tied to tracked gates and figures under docs/_static.
  • The large runtime/diagnostic refactor is an infrastructure claim: extracted runtime startup/chunk/result helpers, validation-gate helpers, and restart artifact schema tests preserve public behavior. It is not a new physics, validation, nonlinear-optimization, or speedup claim.
  • Electrostatic quasilinear weights and spectra are validated diagnostics. The one-constant and simple saturation-rule absolute-flux models are rejected on the current train/holdout portfolio; the spectral_envelope_ridge result is a scoped manuscript model-selection candidate, not a runtime/TOML absolute-flux predictor. Electromagnetic quasilinear calibration remains deferred.
  • The vmec_jax -> booz_xform_jax -> SPECTRAX-GK path is artifact-bound: zero-beta equal-arc geometry parity is claimable for the rows that pass the current mboz=nboz=21 parity matrix, and reduced linear/quasilinear/nonlinear-window-estimator gradients are claimable only on the tracked QH/Li383 gates. A half-mesh Boozer radial-index convention fix restored the fixed-resolution QI row (drift=7.13e-2 < 8e-2) and the evaluated QI robustness variants at ntheta=8,16 now pass. The broader QI seed campaign remains artifact-limited because three input-only QI seeds have no bundled wout references; this is not broad QI transport validation, QI quasilinear calibration, or QI nonlinear optimization. The actual nonlinear finite-difference audits are startup plumbing checks with transport_average_gate = false; they are not production turbulence-gradient or nonlinear heat-flux optimization claims.
  • Production parallelization is currently the independent-work path for k_y scans, sensitivity sweeps, quasilinear studies, and UQ ensembles. Whole-state nonlinear sharding is identity-correct but remains a profiler/diagnostic path, not a nonlinear multi-GPU speedup claim.
  • W7-X zonal long-window recurrence/damping and W7-X TEM / kinetic-electron extensions are deferred from the current manuscript/release scope.

The detailed claim ledger is in docs/release_scope.rst.

SPECTRAX-GK linear benchmark panel

SPECTRAX-GK nonlinear benchmark panel

The figures above represent the validated benchmark suite, covering linear microinstabilities and nonlinear transport across diverse magnetic configurations. The shipped nonlinear atlas emphasizes the longest archived windows currently tracked in the repo: KBM to about t=400, W7-X to about t=200, and Cyclone Miller to about t=122. HSX is currently archived on the closed t=50 window; no HSX nonlinear audit artifact with a longer window is currently tracked for the release panel.

Quasilinear transport diagnostic example:

SPECTRAX-GK quasilinear Cyclone spectrum

This panel is generated from examples/linear/axisymmetric/runtime_cyclone_quasilinear.toml. It shows linear growth/frequency, eigenfunction-weighted k_perp, amplitude-normalized heat/particle flux weights, and an explicitly uncalibrated mixing-length output. The absolute saturated-flux claim remains gated on nonlinear train/holdout calibration. The first Cyclone nonlinear audit is tracked in docs/quasilinear.rst and is kept at training_or_audit_only until a held-out calibration set passes.

The manuscript-facing quasilinear calibration panel now uses the full admitted electrostatic portfolio: two training geometries and six held-out nonlinear windows spanning tokamak, stellarator, and external-VMEC cases.

SPECTRAX-GK quasilinear stellarator train/holdout calibration

The current training set is Cyclone plus the external-VMEC ITERModel case; the holdouts are Cyclone Miller, HSX, W7-X, D-shaped VMEC, up-down asymmetric VMEC, and circular VMEC. This is a stronger transfer test than the earlier Cyclone-only fit: nonlinear input validation now passes, but the fitted one-constant mixing-length model still fails the held-out absolute-flux gate with mean relative error about 2.11. The circular holdout itself is predicted well by the scaled one-constant diagnostic, but the aggregate model remains blocked by the other held-out cases. The best current one-scalar saturation rule remains worse than the training-mean null baseline (2.11 versus 1.20), so SPECTRAX-GK does not promote any simple or user-facing absolute quasilinear flux predictor from that legacy family.

The richer held-out candidate is now the reduced spectral_envelope_ridge model below. It uses only two linear-spectrum envelope features, reaches mean relative error about 0.295, and clears the leave-one-geometry-out interval-coverage gate at 7/8 on the current eight-case electrostatic portfolio. That is the current manuscript model-selection result: the simple rules are rejected, but a small spectrum-aware candidate is accepted as a scoped research candidate. The model-selection status also consumes the selected optimized-equilibrium nonlinear audit as local transport evidence, but it is not a runtime/TOML absolute-flux predictor or universal saturation law.

SPECTRAX-GK quasilinear candidate uncertainty gate

The companion holdout-gap report makes the remaining promotion blocker explicit instead of hiding it in the calibration plot. Six holdouts are admitted and the scoped model-selection gate passes, but the current absolute heat-flux calibration still fails the aggregate holdout gate (2.11 > 0.35). The next useful data product is therefore another independent, converged electrostatic nonlinear holdout, preferably in the external-VMEC family, not another unvalidated fit parameter.
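To make the aggregate comparison concrete, here is a minimal sketch of a fail-closed holdout gate. Only the 0.35 threshold comes from the text above; the error definition (mean of |prediction - reference| / |reference|), the function names, and the flux values are assumptions for illustration and may differ from the tracked gate.

```python
def mean_relative_error(predicted, reference):
    """Average of |prediction - reference| / |reference| over held-out cases (assumed metric)."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need equally sized, non-empty sequences")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)


def holdout_gate(predicted, reference, threshold=0.35):
    """Fail closed: pass only when the error is strictly below the threshold."""
    error = mean_relative_error(predicted, reference)
    return {"mean_relative_error": error, "threshold": threshold, "passed": error < threshold}


# Hypothetical heat fluxes, chosen only to exercise both outcomes.
reference = [10.0, 20.0, 5.0]
good = holdout_gate([11.0, 18.0, 5.5], reference)
bad = holdout_gate([30.0, 65.0, 16.0], reference)
print(good["passed"], bad["passed"])
```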

SPECTRAX-GK quasilinear holdout gap report

The runbook below converts that gap into a fail-closed nonlinear launch plan. It is a planning artifact only: admission still requires the resulting post-transient traces to pass the grid/window convergence gate and enter the calibration metadata as split = holdout.

SPECTRAX-GK external-VMEC next holdout runbook

The latest new-family shaped-tokamak pressure candidate was run to t=450 on the office GPUs at 48x48x32 and 64x64x40. It is finite and late-window stable, but it is not admitted: the two grid levels differ by about 0.306 in both common-window and least-window heat-flux means, above the 0.15 convergence gate. The runbook now demotes unchanged reruns of that failed family. The follow-on ITERModel t=450 same-family audit passed (0.056/0.055 common/least grid differences), so the runbook no longer relaunches that unchanged audit; it records that the next useful data product must be a different independent electrostatic VMEC holdout or a materially changed high-resolution protocol.
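The grid-convergence comparison can be sketched as follows. The 0.15 gate is taken from the text; normalizing by the finer-grid mean, the function names, and the synthetic traces are assumptions for illustration and may not match the tracked gate definition.

```python
import numpy as np


def late_window_mean(t, q, t_start, t_end):
    """Mean heat flux over the samples with t_start <= t <= t_end."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = (t >= t_start) & (t <= t_end)
    if not mask.any():
        raise ValueError("window contains no samples")
    return float(q[mask].mean())


def grid_convergence_gate(mean_coarse, mean_fine, threshold=0.15):
    """Relative difference between grid levels, normalized by the finer grid (assumed)."""
    rel_diff = abs(mean_coarse - mean_fine) / abs(mean_fine)
    return {"relative_difference": rel_diff, "passed": rel_diff < threshold}


# Synthetic constant traces standing in for two grid levels on a common late window.
t = np.linspace(0.0, 450.0, 451)
coarse = late_window_mean(t, np.full_like(t, 10.0), 300.0, 450.0)
fine = late_window_mean(t, np.full_like(t, 10.5), 300.0, 450.0)
converged = grid_convergence_gate(coarse, fine)
unconverged = grid_convergence_gate(14.0, 10.0)
print(converged, unconverged)
```

A least-window variant would select the window independently per grid before comparing; that selection step is not shown here.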

Two of the strongest admitted external-VMEC nonlinear holdouts are shown below. These figures are part of the publication-facing evidence that the nonlinear inputs are converged enough to be used as negative transfer constraints rather than as exploratory pilots.

SPECTRAX-GK ITERModel external-VMEC nonlinear convergence gate

The ITERModel external-VMEC case closes at t=350 on the 48x48x32 to 64x64x40 ladder. Its common-window grid difference is about 0.0165, the least-window difference is about 0.1415, and the trend/CV/sample-count gates all pass.

SPECTRAX-GK up-down asymmetric external-VMEC nonlinear convergence gate

The up-down asymmetric external-VMEC tokamak closes at t=450 on the same ladder. Its common-window and least-window relative differences are about 0.0435 and 0.0242, respectively.

SPECTRAX-GK circular external-VMEC nonlinear convergence gate

The circular external-VMEC tokamak initially failed the shorter t=150 and t=250 admission gates but closes at t=450 on the same high-grid ladder: the common-window and least-window grid differences are about 0.0128 and 0.0468. These admitted windows strengthen the quasilinear calibration dataset without changing the core conclusion: the current absolute-flux model is still a rejected research candidate, not a shipped predictive transport law.

The follow-up seed/timestep replicate gate initially failed at t=450 because one seed still had a drifting terminal window. Extending the same three replicas to t=700 closes the physical readiness gate on t=[350,700]: the ensemble mean heat flux is 18.97, mean relative spread is 0.035, and combined SEM/mean is 0.043.

SPECTRAX-GK circular VMEC nonlinear replicate gate

Autodiff validation (inverse/sensitivity demo):

SPECTRAX-GK autodiff inverse demo

This single-mode figure checks that the JAX derivatives are correct and shows how one measured mode constrains the gradients locally. The expected outcome is small observable and Jacobian errors, not exact parameter recovery; the shipped result is a near-perfect match in (γ, ω) but a visibly non-unique recovered (R/L_Ti, R/L_n) pair.

Autodiff validation (two-mode inverse demo):

SPECTRAX-GK autodiff two-mode demo

This two-mode figure is the actual parameter-recovery validation, where the goal is to recover the planted gradients from two independent mode observables. The shipped result reaches the target to numerical precision and the autodiff Jacobian matches finite differences, which is the behavior expected from an identifiable inverse problem.

Single-point runtime TOMLs can also carry their own artifact prefix:

[output]
path = "tools_out/runtime_case"

The executable --out flag overrides the TOML value when both are present.

The shipped nonlinear W7-X and HSX runtime TOMLs already set this lightweight artifact prefix, so long stellarator parity runs leave tools_out/... diagnostics and summaries behind without extra command-line flags. The direct Python case wrappers now honor that TOML output contract as well, so chunked nonlinear runs persist their evolving diagnostics through the same path.

When the nonlinear target ends in .out.nc or another .nc suffix, SPECTRAX-GK writes a restartable NetCDF bundle, compatible with the comparison tooling, instead of the lightweight JSON/CSV sidecars:

  • case.out.nc: resolved nonlinear diagnostics and metadata
  • case.big.nc: final fields and moments in real and spectral layouts
  • case.restart.nc: restart state for continuation runs

The same runtime input can then resume from the saved restart file by setting restart controls in the TOML:

[time]
nstep_restart = 100

[output]
path = "tools_out/cyclone_release.out.nc"
restart_if_exists = true
save_for_restart = true
append_on_restart = true
restart_with_perturb = false

With that configuration, rerunning the same command resumes from tools_out/cyclone_release.restart.nc when it already exists and appends the new samples to tools_out/cyclone_release.out.nc. Restart appends preserve the persisted NetCDF diagnostic schema; transient in-memory traces that are not written to .out.nc are not reintroduced when an existing artifact is loaded for continuation.
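The naming relationship between the bundle files can be sketched with pathlib. This follows only the case.out.nc / case.big.nc / case.restart.nc pattern described above; bundle_paths and should_resume are hypothetical helpers, and targets with other .nc suffixes are not handled here.

```python
from pathlib import Path


def bundle_paths(out_path):
    """Derive the sibling bundle files from a *.out.nc target (hypothetical helper)."""
    out = Path(out_path)
    if not out.name.endswith(".out.nc"):
        raise ValueError(f"expected a *.out.nc target, got {out}")
    stem = out.name[: -len(".out.nc")]
    return {
        "out": out,
        "big": out.with_name(stem + ".big.nc"),
        "restart": out.with_name(stem + ".restart.nc"),
    }


def should_resume(out_path, restart_if_exists=True):
    """Resume only when restarts are enabled and the restart file is already on disk."""
    return restart_if_exists and bundle_paths(out_path)["restart"].exists()


paths = bundle_paths("tools_out/cyclone_release.out.nc")
print(paths["restart"])
```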

Quickstart (Python)

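import jax.numpy as jnp

from spectraxgk import CycloneBaseCase, LinearParams, integrate_linear_from_config
from spectraxgk.geometry import SAlphaGeometry
from spectraxgk.grids import build_spectral_grid

# Build the Cyclone base case: spectral grid, s-alpha geometry, default linear parameters.
cfg = CycloneBaseCase()
grid = build_spectral_grid(cfg.grid)
geom = SAlphaGeometry.from_config(cfg.geometry)
params = LinearParams()

# Initial state. The two leading axes are assumed to index the velocity-space
# moments; the remaining axes are (ky, kx, z).
G0 = jnp.zeros((2, 2, grid.ky.size, grid.kx.size, grid.z.size), dtype=jnp.complex64)
# Seed a small perturbation in the lowest moment at the first (ky, kx) for every z.
G0 = G0.at[0, 0, 0, 0, :].set(1.0e-3 + 0.0j)

# Integrate the linear system over the configured time window.
G_t, phi_t = integrate_linear_from_config(G0, grid, geom, params, cfg.time)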

Autodiff demo and parallelization notes

The autodiff inverse/sensitivity example lives at examples/theory_and_demos/autodiff_inverse_growth.py and generates the figure shown above. It uses JAX autodiff on a short linear ITG window, reports gradients against a finite-difference check, and writes a summary JSON plus parameter sweeps for both R/L_Ti and R/L_n alongside the plot. The single-mode panel should be read as a local inverse demo, not as a global identifiability claim; in the shipped figure the observable errors are small while the parameter errors remain finite for exactly that reason. The two-mode inverse example in examples/theory_and_demos/autodiff_inverse_twomode.py uses two ky modes to stabilize the inverse problem and provides the release-grade parameter recovery panel, closing the identifiability gap present in the single-mode demo. Both autodiff examples now report finite-difference Jacobian checks, Jacobian rank/conditioning, covariance, standard deviations, correlations, and one-sigma UQ ellipse area in their summary JSON files.
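The UQ quantities named above can be illustrated with a small NumPy sketch of a Gauss-Newton covariance estimate. This is a textbook construction under stated assumptions (independent observation noise of size sigma_obs, a full-rank Jacobian, two parameters); the function name local_uq and the example Jacobian are hypothetical, and the shipped examples may compute these quantities differently.

```python
import numpy as np


def local_uq(jacobian, sigma_obs):
    """Gauss-Newton UQ summary for a two-parameter problem (illustrative sketch)."""
    J = np.asarray(jacobian, dtype=float)
    if J.ndim != 2 or J.shape[1] != 2:
        raise ValueError("this sketch assumes exactly two parameters")
    rank = int(np.linalg.matrix_rank(J))
    cond = float(np.linalg.cond(J))
    # Covariance for independent observation noise: sigma^2 * (J^T J)^{-1}.
    # np.linalg.inv raises LinAlgError when J^T J is singular (non-identifiable case).
    cov = sigma_obs**2 * np.linalg.inv(J.T @ J)
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # Area of the one-sigma ellipse x^T cov^{-1} x = 1 in two dimensions.
    area = float(np.pi * np.sqrt(np.linalg.det(cov)))
    return {"rank": rank, "cond": cond, "cov": cov, "std": std, "corr": corr, "area": area}


# Hypothetical sensitivities of two observables to (R/L_Ti, R/L_n).
summary = local_uq([[1.0, 0.5], [0.2, 1.0]], sigma_obs=0.01)
print(summary["rank"], summary["std"], summary["area"])
```

A nearly singular Jacobian shows up as a large condition number and an elongated ellipse, which is the local signature of the non-uniqueness described for the single-mode demo.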

The differentiable geometry bridge example lives at examples/theory_and_demos/differentiable_geometry_bridge.py and writes the publication artifact below. It validates the in-memory vmec_jax/booz_xform_jax bridge contract used by stellarator optimization workflows: solver-ready field-line arrays remain JAX-traceable, geometry observable sensitivities match central finite differences, a two-parameter inverse design recovers the target observables, and the local UQ covariance is reported.

When vmec_jax is available, the same artifact also checks a real VMEC boundary-aspect derivative through its boundary Fourier API and real VMEC metric-tensor derivatives through vmec_jax.geom.eval_geom. It also samples a real stellarator VMEC field line from vmec_jax metric and magnetic-field tensors to check that state-level geometry sensitivities reach field-line observables before any SPECTRAX-GK closure approximation is introduced. The same path now emits a direct VMEC tensor-derived SPECTRAX-GK flux-tube mapping and checks its geometry-observable sensitivities against finite differences, so the differentiability chain starts at vmec_jax state coefficients rather than only at a Boozer spectral adapter. The validation artifact also records a direct-VMEC-tensor vs imported-VMEC/EIK array-parity audit.

A new vmec_jax -> booz_xform_jax Boozer equal-arc core audit now matches the imported convention for bmag, bgrad, gradpar, q, s_hat, and the solver Jacobian at the percent level on the tracked stellarator fixture; the same audit now reconstructs the zero-beta Boozer metric profiles gds*/grho with worst normalized mismatch 3.45e-2 and the loaded-convention zero-beta drift profiles cvdrift/gbdrift/cvdrift0/gbdrift0 with worst normalized mismatch 3.50e-2. The remaining geometry promotion work is finite-beta and broader production-runtime drift parity beyond the tracked zero-beta equal-arc fixtures.

When booz_xform_jax is available, the artifact also runs a bounded JAX-native Boozer spectral transform, samples the resulting Boozer |B| spectrum onto a field-line flux-tube mapping, and checks both derivative paths against central finite differences. When both optional backends are available, the artifact also starts from a real vmec_jax VMECState, perturbs VMEC Fourier coefficients, converts that state through booz_xform_jax, and differentiates the resulting SPECTRAX-GK field-line geometry observables against central finite differences. The remaining promotion gate is exact production drift parity with the imported VMEC/EIK runtime path and then multi-equilibrium transport-gradient and nonlinear-window gates through the solver.

SPECTRAX-GK differentiable geometry bridge

A separate mode-21 parity matrix checks the same Boozer equal-arc path on the tracked QH, fixed-resolution QI, and shaped-tokamak fixtures. The matrix is generated by tools/build_vmec_boozer_parity_matrix.py and enforces mboz,nboz >= 21. The current regenerated artifact passes all matrix rows. The evaluated QI robustness variants (ntheta=8,16) pass, while three QI input seeds remain explicitly marked missing_bundled_wout_reference rather than being silently promoted. This is a field-line geometry convention gate, not a production stellarator-transport-gradient claim.

SPECTRAX-GK VMEC/Boozer parity matrix

The solver-objective geometry-gradient gate differentiates actual electrostatic linear-RHS eigenpair observables with respect to solver-ready geometry arrays and checks the implicit left/right eigenpair sensitivities against central finite differences. This closes the production solver contract for FluxTubeGeometryData gradients.

The companion full-chain gate starts from a real vmec_jax state coefficient, maps through booz_xform_jax with mboz=nboz=21, builds the SPECTRAX-GK linear RHS, and verifies the linear eigenfrequency gradient against central finite differences. The full-chain quasilinear gate uses a richer Nl=2, Nm=3 moment basis and checks gamma, omega, <k_perp^2>, the electrostatic heat-flux weight, and gamma Q_i/k_perp^2 against central finite differences with maximum relative error 4.3e-3. These are differentiability checks on reduced solver observables and an uncalibrated heat-flux proxy, not calibrated absolute-flux predictions. This closes the reduced linear/quasilinear stellarator objective-gradient path on the tracked all-surface QH fixture.

A second Li383 holdout now passes the same frequency and quasilinear VMEC/Boozer gradient contracts at mboz=nboz=21; the combined holdout matrix has maximum relative AD/finite-difference mismatch 4.9e-3 across the reduced linear/quasilinear objectives. Companion QH and Li383 reduced nonlinear-window estimator gates differentiate a smooth late-window heat-flux envelope through the same vmec_jax -> booz_xform_jax -> SPECTRAX-GK state path; the expanded matrix including those estimator rows has maximum relative mismatch 2.7e-2. That closes a multi-equilibrium bounded differentiability check for nonlinear-window-style reduced objectives, but it is not a converged nonlinear turbulence-gradient or optimized-equilibrium transport claim.
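The AD-versus-central-finite-difference pattern used by these gates can be sketched on a toy objective. Everything below is illustrative: the objective is a stand-in and not a SPECTRAX-GK observable, and the helper names and step size are assumptions. Only jax.grad and standard jax.numpy calls are used.

```python
import jax
import jax.numpy as jnp

# Double precision keeps finite-difference round-off well below the truncation error.
jax.config.update("jax_enable_x64", True)


def objective(x):
    """Toy smooth scalar objective standing in for a reduced solver observable."""
    return jnp.sum(jnp.sin(x) * x**2)


def central_fd_gradient(f, x, step=1e-5):
    """Central finite differences, one coordinate at a time."""
    grads = []
    for i in range(x.size):
        e = jnp.zeros_like(x).at[i].set(step)
        grads.append((f(x + e) - f(x - e)) / (2.0 * step))
    return jnp.stack(grads)


def max_relative_mismatch(ad, fd, floor=1e-12):
    """Largest |AD - FD| / |FD|, with a floor to avoid dividing by zero."""
    return float(jnp.max(jnp.abs(ad - fd) / jnp.maximum(jnp.abs(fd), floor)))


x0 = jnp.array([0.3, 1.1, 2.0])
ad_grad = jax.grad(objective)(x0)
fd_grad = central_fd_gradient(objective, x0)
mismatch = max_relative_mismatch(ad_grad, fd_grad)
print(f"max relative AD/FD mismatch: {mismatch:.2e}")
```

In a real gate the finite-difference step has to be chosen against solver noise, which is one reason the quoted mismatches for full-chain observables are far larger than what a toy function gives.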

A compact nonlinear finite-difference audit now runs actual SPECTRAX-GK nonlinear Cyclone startup windows at R/L_Ti = base +/- step plus a repeated base run. It passes finite-output, repeatability, monotonic drive-response, startup-window CV/trend, and resolved finite-difference-response gates with response/base about 0.111. This is only a startup-response plumbing and conditioning check. It is not a production heat-flux average, VMEC/Boozer nonlinear state-gradient, or optimized-equilibrium transport claim.

A companion VMEC/Boozer-perturbed audit starts from the real mode-21 vmec_jax -> booz_xform_jax QH state bridge, writes perturbed sampled geometries to temporary NetCDF files, and runs compact nonlinear startup windows at Rcos_mid_surface_m1 = base +/- 1e-5. It passes finite-output, repeatability, startup-window conditioning, geometry-response, and resolved central finite-difference response gates with response/base about 0.040. The forward/backward response is asymmetric and not monotone, so this is only a VMEC/Boozer geometry-perturbed startup observable-path audit. It is not promoted as a local nonlinear gradient, optimized-equilibrium audit, or production heat-flux stellarator optimization claim. A memory-bounded Boozer surface stencil exists for diagnostics and large-equilibrium probes, but it is not used for the published linear/quasilinear accuracy claim.

For nonlinear transport claims, heat flux must be measured as a long-time post-transient running average. The gate for future production heat-flux optimization requires discarding the initial transient, retaining enough post-transient samples, checking that the cumulative running mean and independent late blocks are stable, and comparing the same late window against the tracked nonlinear reference cases. The short FD audits above explicitly record transport_average_gate = false to avoid treating startup-scale fluxes as saturated transport.
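The first three requirements of that gate (discard the transient, retain enough samples, check running-mean and late-block stability) can be sketched as below. The thresholds, block count, function name, and synthetic traces are assumptions for illustration; the comparison against tracked reference cases is not shown.

```python
import numpy as np


def post_transient_gate(t, q, t_discard, n_blocks=4, min_samples=100, tol=0.1):
    """Check running-mean and late-block stability of a heat-flux trace (sketch)."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    late = q[t >= t_discard]  # drop the initial transient
    if late.size < min_samples:
        return {"passed": False, "running_drift": None, "block_spread": None}
    mean = late.mean()
    # Cumulative running mean over the retained window.
    running = np.cumsum(late) / np.arange(1, late.size + 1)
    # The second half of the running mean should already sit near the final mean.
    second_half = running[late.size // 2:]
    running_drift = float(np.max(np.abs(second_half - mean)) / abs(mean))
    # Means of independent late blocks should agree with the window mean.
    blocks = np.array([block.mean() for block in np.array_split(late, n_blocks)])
    block_spread = float(np.max(np.abs(blocks - mean)) / abs(mean))
    passed = running_drift < tol and block_spread < tol
    return {"passed": passed, "running_drift": running_drift, "block_spread": block_spread}


t = np.linspace(0.0, 400.0, 4001)
saturated = 10.0 * (1.0 - np.exp(-t / 20.0)) + 0.5 * np.sin(0.7 * t)  # synthetic plateau
drifting = 0.05 * t  # synthetic trace that never saturates
stable_result = post_transient_gate(t, saturated, t_discard=150.0)
drift_result = post_transient_gate(t, drifting, t_discard=150.0)
print(stable_result["passed"], drift_result["passed"])
```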

SPECTRAX-GK solver-objective geometry-gradient gate

SPECTRAX-GK VMEC/Boozer solver-frequency gradient gate

SPECTRAX-GK VMEC/Boozer quasilinear-gradient gate

SPECTRAX-GK VMEC/Boozer reduced nonlinear-window-gradient gate

SPECTRAX-GK VMEC/Boozer gradient holdout matrix

The reduced VMEC/Boozer optimization path also has aggregate guardrails. The multi-point gate below checks a quasilinear objective over two field lines and two k_y samples at mboz=nboz=21; the growth-vs-quasilinear comparison shows that growth-rate and quasilinear objectives can choose different initial VMEC coefficient directions. The current promotion gate is therefore intentionally blocked until an independent production-grade held-out surface or field-line artifact passes. The alpha-heldout split shown below is a positive reduced field-line generalization check, but it is still not a nonlinear transport optimization claim. The surface-heldout split extends this to a true held-out surface_index, and the Li383 panel checks that the same aggregate finite-difference plus line-search machinery works on a second equilibrium.

SPECTRAX-GK VMEC/Boozer multi-alpha aggregate-objective gate

SPECTRAX-GK VMEC/Boozer growth-vs-quasilinear line-search comparison

SPECTRAX-GK VMEC/Boozer aggregate alpha-heldout gate

SPECTRAX-GK VMEC/Boozer aggregate surface-heldout gate

SPECTRAX-GK VMEC/Boozer second-equilibrium aggregate gate

The backend-free portfolio reducer below is the lightweight contract that multi-surface, multi-field-line, multi-k_y stellarator optimization drivers should satisfy before they call expensive VMEC/Boozer row producers. It checks normalized sample/objective weights and AD/JVP/finite-difference consistency for the aggregate scalar objective; it is not a VMEC/Boozer or nonlinear turbulent-transport optimization claim by itself.

SPECTRAX-GK stellarator objective portfolio reducer gate

SPECTRAX-GK nonlinear startup-window finite-difference audit

SPECTRAX-GK VMEC/Boozer nonlinear startup-window finite-difference audit

The nonlinear time-horizon audit below separates long post-transient transport windows from startup plumbing checks and reduced nonlinear-envelope examples. The external nfp4 QH pilot has now been extended to t=150, where its late heat-flux window is meaningful rather than noise-floor-scale; it remains a feasibility result because the 48x48x32 grid check changes the late heat-flux level by about 52%, and the follow-on 64x64x40 check changes it again by about 63%. QH is therefore excluded from quasilinear calibration until a separate grid/window-converged transport gate passes.

A new D-shaped tokamak external-VMEC candidate now passes the longer t=250 high-grid gate: 48x48x32 and 64x64x40 differ by 13.9% on the common late window and 10.8% on independently selected least-trending windows. A follow-up seed/timestep replicate campaign on the 64x64x40, t=250 D-shaped case passes the late-window ensemble gate on t=[170,250]: the three accepted windows have mean heat fluxes 18.8, 20.8, and 18.1, with mean relative spread 0.141 below the 0.15 gate.

A circular external-VMEC replicate campaign required a longer horizon: the t=450 ensemble spread was already small, but seed31 failed terminal-window stationarity, so the accepted artifact is the t=700, t=[350,700] replicate with mean relative spread 0.035 and combined SEM/mean 0.043.

The selected optimized QA equilibrium was then run through the same long-window protocol at n64 with two seed replicates and one timestep replicate. Its accepted t=[350,700] window has ensemble mean ion heat flux 10.19, mean-relative spread 0.038, and combined SEM/mean 0.021. This is a passed post-transient optimized-equilibrium audit; it is not a universal absolute-flux model and should be compared case-by-case against the chosen baseline objective and geometry family.
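As a worked check of the replicate-spread arithmetic, the sketch below applies one plausible definition, (max - min) / mean across replicate window means, to the three D-shaped values quoted above. The definition and the function name are assumptions; the gate's actual formula is not stated here.

```python
def mean_relative_spread(window_means):
    """(max - min) / mean across replicate windows (assumed definition)."""
    if not window_means:
        raise ValueError("need at least one replicate")
    mean = sum(window_means) / len(window_means)
    return (max(window_means) - min(window_means)) / mean


# Window-mean heat fluxes for the D-shaped replicate campaign, as quoted in the text.
d_shaped = [18.8, 20.8, 18.1]
spread = mean_relative_spread(d_shaped)
print(f"spread = {spread:.3f}, below 0.15 gate: {spread < 0.15}")
```

With the rounded means this evaluates to about 0.140, close to the quoted 0.141. The small difference is consistent with rounding of the inputs, but it does not confirm that the gate uses this exact definition.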

SPECTRAX-GK nonlinear transport time-horizon audit

SPECTRAX-GK D-shaped VMEC nonlinear replicate gate

SPECTRAX-GK circular VMEC nonlinear replicate gate

SPECTRAX-GK optimized-equilibrium nonlinear replicate gate

The matched no-ESS reference from the same vmec_jax QA campaign also passes the same t=[350,700] seed/timestep ensemble gate. Against that finite-transform reference, the optimized QA/ESS equilibrium reduces the late-window ion heat flux from 12.50 to 10.19, a relative reduction of 18.4% with 7.82 combined-SEMs separation.

SPECTRAX-GK matched no-ESS to optimized QA/ESS nonlinear audit

The production nonlinear optimization guard below is the enforced claim boundary. It passes as a release-safety check because startup/reduced nonlinear artifacts are scoped correctly and two long post-transient replicated holdout ensembles pass. With the optimized-equilibrium t=[350,700] seed/timestep replicate now attached, the selected optimized-equilibrium transport audit also satisfies this guard. The claim remains bounded: this proves that the selected optimized equilibrium has a converged replicated nonlinear transport-window audit, not that the current quasilinear model is a universal absolute-flux predictor or that nonlinear turbulence gradients are available.

The separate nonlinear turbulence-gradient evidence gate is stricter and remains fail-closed after the completed QA/ESS overdetermined control campaign. All 27 matched t=900, n64:64:64:40:40 runtime outputs completed and all runtime-output gates passed. The best overdetermined candidate is RBC(1,1): it has a resolved response (response_fraction = 0.086) and passes the locality gate (fd_asymmetry_rel = 0.338), but it still misses the propagated uncertainty gate (gradient_uncertainty_rel = 0.559 > 0.5). ZBS(1,1) is statistically cleaner but nonlocal, and ZBS(1,0) is not response-resolved. The current release therefore documents nonlinear turbulence-gradient evidence as a rigorous negative/model-development result, not as a promoted production nonlinear-gradient or full nonlinear turbulent-flux optimization claim.

The next scientifically efficient step is not another blind single-coefficient rerun. The tracked ranking artifact recommends keeping the claim fail-closed, targeting the best local but noisy RBC(1,1) direction with additional independent replicas or variance reduction only if the cost is justified, and replacing or shrinking nonlocal controls before another production campaign. A companion composite-direction manifest defines a smaller descent-oriented QA/ESS boundary direction with the same long-window contract; that audit also remains fail-closed after its plus-state spread and central-FD gates.

SPECTRAX-GK production nonlinear optimization guard

SPECTRAX-GK QA/ESS ZBS(1,0) nonlinear gradient gate

SPECTRAX-GK QA/ESS overdetermined RBC(1,1) nonlinear gradient gate

SPECTRAX-GK QA/ESS composite nonlinear gradient gate

SPECTRAX-GK QA/ESS targeted nonlinear gradient follow-up

Differentiable stellarator ITG optimization examples live in examples/optimization/. They optimize the same QA, max-mode-1 control vector with three turbulence objectives: small linear ITG growth rate, small quasilinear ITG heat-flux proxy, and a small reduced late-window nonlinear heat-flux envelope. Each example reports AD-vs-finite-difference checks, UQ covariance diagnostics, objective histories, and polished figures.

SPECTRAX-GK differentiable stellarator ITG optimization

SPECTRAX-GK stellarator ITG optimization UQ diagnostics

The panel above is the current release-grade differentiability gate: all three objectives keep the optimized QA configuration near aspect ratio 7 and iota = 0.41 while reducing the tracked transport observables. It should be read together with the UQ panel, which verifies AD/FD derivative parity for each active control and estimates local Gauss-Newton covariance from the final weighted objective residual. These are validated optimization-plumbing diagnostics for stellarator-transport objectives, not a final absolute-flux optimization claim.

Full vmec_jax -> booz_xform_jax -> SPECTRAX-GK nonlinear optimization remains scoped to the next promotion gate: production nonlinear turbulence-gradient or robust finite-difference audits with converged post-transient heat-flux windows, continued production curvature/drift parity on additional equilibria, and matched baseline-to-optimized nonlinear audits for broader geometry families. The current full-chain linear/quasilinear and reduced nonlinear-window estimator gradient evidence covers QH and Li383 at mboz=nboz=21; it should not be read as a production nonlinear heat-flux optimization claim.

For production parallelization of independent work, use spectraxgk.batch_map / spectraxgk.ky_scan_batches for k_y scans, sensitivity sweeps, and UQ ensembles. Runtime k_y scans can select the same independent-worker path from TOML:

[parallel]
strategy = "batch"
axis = "ky"
num_devices = 4      # or batch_size = 4
backend = "auto"     # "thread" or "process" are explicit alternatives

This path preserves serial ordering and uses independent solver calls; it does not change the solver layout.

Whole-state fixed-step nonlinear sharding through TimeConfig.state_sharding = "auto" (or "ky" / "kx") remains a correctness/profiler path for partitioning the packed state array across JAX devices. It is intentionally limited to state axes: sharding across the z FFT axis is tracked as a future domain-decomposition lane because it requires a separate communication/layout design.

The current profiler-backed artifacts are docs/_static/nonlinear_sharding_profile.json for the local control-flow gate and docs/_static/nonlinear_sharding_profile_office_gpu.json for the two-GPU office identity gate. Treat both as engineering gates, not as runtime speedup claims. The matched large strong-scaling sweep in docs/performance.rst confirms this conservative stance: whole-state nonlinear sharding is identity-correct, but only modestly useful on logical CPUs and slower on two RTX A4000 GPUs for the current decomposition. Production parallelization should therefore focus on independent k_y scans, quasilinear studies, sensitivity sweeps, and UQ/ensemble batching until a communication-aware nonlinear decomposition has its own identity and throughput evidence.

For UQ and optimization portfolios, spectraxgk.independent_ensemble_provenance_gate is the production identity/provenance check. It runs serial and independent_map ensemble members, verifies result ordering and numerical identity, checks worker clipping for oversubscribed requests, validates deterministic reconstruction through the independent-work decomposition contract, and confirms worker-exception metadata.
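The ordering, identity, and worker-clipping checks described above can be sketched with the standard library. The helper name ordered_independent_map and the toy member function are hypothetical stand-ins; this is not the spectraxgk gate implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def toy_member(sample):
    """Stand-in for one independent ensemble member (not a real solver call)."""
    return sum((sample + k) ** 2 for k in range(100)) % 9973

def ordered_independent_map(fn, items, requested_workers):
    """Map fn over items with a thread pool, returning results in input order."""
    # Clip oversubscribed requests: never use more workers than work items.
    workers = max(1, min(requested_workers, len(items)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fn, items))  # Executor.map preserves input order
    return results, workers

samples = list(range(6))
serial = [toy_member(s) for s in samples]
parallel, used = ordered_independent_map(toy_member, samples, requested_workers=8)

identity_ok = parallel == serial
print(f"workers used: {used}, identity with serial run: {identity_ok}")
```

Requesting eight workers for six members clips the pool to six, and the comparison against the serial list checks both ordering and exact numerical identity.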

SPECTRAX-GK ky-batch parallelization identity gate

The ky-batch gate above is generated by python tools/generate_parallel_ky_scan_gate.py. It runs the real Cyclone linear solver serially and with fixed-shape ky batching, verifies numerical identity for gamma and omega, and reports the observed batch speedup for engineering tracking.
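One way to read "fixed-shape ky batching" is that every batch has the same length, which in JAX generally avoids recompiling for a differently shaped final batch. The sketch below assumes the last batch is padded by repeating its final value and that padded results are discarded; the padding strategy and the toy per-ky function are assumptions, not the tool's actual implementation.

```python
def fixed_shape_batches(values, batch_size):
    """Split values into equal-length batches, padding the last with its final value."""
    batches = []
    for start in range(0, len(values), batch_size):
        chunk = list(values[start:start + batch_size])
        n_valid = len(chunk)
        chunk += [chunk[-1]] * (batch_size - n_valid)  # pad to the fixed shape
        batches.append((chunk, n_valid))
    return batches

def toy_growth_rate(ky):
    """Stand-in for a per-ky solve; not a physical dispersion relation."""
    return ky * (1.0 - ky)

ky_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
batches = fixed_shape_batches(ky_values, batch_size=3)

batched = []
for chunk, n_valid in batches:
    results = [toy_growth_rate(ky) for ky in chunk]  # evaluate the whole padded batch
    batched.extend(results[:n_valid])                # drop results for padded entries

serial = [toy_growth_rate(ky) for ky in ky_values]
print("identity with serial scan:", batched == serial)
```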

SPECTRAX-GK independent ky scan strong scaling

The large independent-k_y strong-scaling panel uses the real Cyclone linear solver on 64 modes with Ny=128, Nz=96, Nl=4, Nm=8, and 240 RK2 steps per mode. It passes exact gamma/omega identity against the one-worker reference. The refreshed release artifact reaches 7.18x on eight local CPU workers and 1.88x on two RTX A4000 GPUs on the office machine. This is the preferred production parallelization path for linear scans, quasilinear studies, sensitivity sweeps, and UQ ensembles.

SPECTRAX-GK parallelization closure status

The closure status above is regenerated by python tools/build_parallelization_completion_status.py. It marks independent k_y scans and quasilinear/UQ ensembles as production-closed, while keeping whole-state nonlinear sharding and FFT-axis decomposition diagnostic until they have runtime communication, conservation, transport-window, and profiler-backed speedup gates. The status JSON also embeds the UQ/optimization ensemble provenance gate so the production independent-work lane is closed on ordering, worker clipping, exception metadata, and deterministic reconstruction, not only speedup and scalar identity.

The decomposition-contract gate below is the lower-level correctness ledger for parallel work partitioning. It confirms deterministic shard assignment and serial reconstruction identity for independent k_y and UQ portfolios, while labeling nonlinear state-domain partitioning as diagnostic metadata only.

SPECTRAX-GK parallel decomposition contract status
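Deterministic shard assignment and serial reconstruction identity can be sketched as follows. The contiguous-range assignment rule and the function names are hypothetical; the tracked decomposition contract may partition work differently.

```python
def assign_shards(n_items, n_shards):
    """Deterministic contiguous assignment: shard i always gets the same index range."""
    base, extra = divmod(n_items, n_shards)
    shards, start = [], 0
    for i in range(n_shards):
        size = base + (1 if i < extra else 0)
        shards.append(list(range(start, start + size)))
        start += size
    return shards

def reconstruct(shard_results):
    """Concatenate per-shard results back into serial order."""
    return [value for shard in shard_results for value in shard]

ky_values = [0.05 * (j + 1) for j in range(10)]
shards = assign_shards(len(ky_values), 4)

# Each shard computes its own slice independently (toy per-item work).
shard_results = [[ky_values[j] ** 2 for j in shard] for shard in shards]
serial_results = [ky ** 2 for ky in ky_values]

deterministic = shards == assign_shards(len(ky_values), 4)
identity = reconstruct(shard_results) == serial_results
print(shards, deterministic, identity)
```

Because the assignment depends only on the item count and shard count, rerunning it reproduces the same partition, and concatenating shard outputs recovers the serial result exactly.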

SPECTRAX-GK quasilinear UQ ensemble strong scaling

The quasilinear/UQ ensemble panel applies the same independent-worker policy to six late-time Cyclone ITG gradient samples and five k_y values per sample at Ny=96, Nz=64, Nl=3, Nm=6, and 2000 RK2 steps. It computes real linear growth/frequency fits and a reduced mixing-length feature observable, then checks exact serial identity. On the office machine, CPU process scaling reaches 5.41x on eight requested workers using six actual ensemble chunks, and the two-RTX-A4000 GPU run reaches 1.71x. This is a parallelization and UQ plumbing result, not a promoted absolute nonlinear heat-flux model.
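The speedups quoted for the two strong-scaling panels can be converted to parallel efficiency (speedup divided by the number of workers doing work). This is plain arithmetic on the numbers in the text; dividing the ensemble CPU speedup by the six actual chunks rather than the eight requested workers is a choice made for this sketch.

```python
# (label, speedup, workers actually doing work), taken from the text above.
runs = [
    ("ky scan, CPU", 7.18, 8),
    ("ky scan, GPU", 1.88, 2),
    ("QL/UQ ensemble, CPU", 5.41, 6),  # six actual chunks, eight requested
    ("QL/UQ ensemble, GPU", 1.71, 2),
]

efficiency = {label: speedup / workers for label, speedup, workers in runs}

for label, eff in efficiency.items():
    print(f"{label:22s} efficiency = {eff:.3f}")
```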

Benchmarks

SPECTRAX-GK is validated against standard gyrokinetic benchmarks within the tracked release scope:

  • Linear growth rates, frequencies, and eigenfunctions: release-atlas cases including Cyclone ITG, ETG, KBM, W7-X, HSX, and shaped tokamak coverage.
  • Nonlinear transport windows: release-gated heat-flux and energy statistics for Cyclone, Cyclone Miller, KBM, W7-X, and HSX.

The benchmark tooling in tools/ ensures reproducibility and performance tracking. For the current release pass, the accepted nonlinear validation set is Cyclone, Cyclone Miller, KBM, W7-X, and HSX. Full-GK ETG nonlinear pilots, TEM/KAW stress lanes, kinetic-electron extensions, and W7-X zonal-flow recurrence/damping stay outside the active release parity claim unless a gate-indexed artifact promotes them explicitly. The window-statistics artifact uses case-specific mean-relative gates: KBM 0.02, HSX 0.05, Cyclone Miller 0.095, and the broader release envelope 0.10 for Cyclone and W7-X while their paper-level tightening lanes remain open.
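A case-specific mean-relative gate can be sketched as below. The tolerances are the ones quoted above; the metric definition (relative difference of window means) is an assumption about what "mean-relative" denotes, and the time series are synthetic numbers for illustration only, not SPECTRAX-GK or reference-code output.

```python
from statistics import fmean

# Case-specific mean-relative tolerances quoted in the text.
GATES = {"KBM": 0.02, "HSX": 0.05, "Cyclone Miller": 0.095,
         "Cyclone": 0.10, "W7-X": 0.10}

def mean_relative_difference(candidate, reference):
    """|mean(candidate) - mean(reference)| / |mean(reference)| (assumed metric)."""
    ref_mean = fmean(reference)
    return abs(fmean(candidate) - ref_mean) / abs(ref_mean)

def window_gate(case, candidate, reference):
    """Compare one late-time window against its reference with the case tolerance."""
    diff = mean_relative_difference(candidate, reference)
    return {"case": case, "diff": diff, "tol": GATES[case],
            "passed": diff <= GATES[case]}

# Synthetic heat-flux windows, for illustration only.
reference = [1.00, 1.10, 0.95, 1.05]
candidate = [1.04, 1.12, 0.99, 1.07]

kbm = window_gate("KBM", candidate, reference)
hsx = window_gate("HSX", candidate, reference)
print(kbm)
print(hsx)
```

The same synthetic window differs from its reference by about 3% in the mean, so it would fail the tighter KBM tolerance and pass the HSX one; this is why the tolerances are tracked per case.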

Runtime and Memory

Runtime and memory comparison

SPECTRAX-GK is optimized for performance across CPU and GPU backends. The runtime panel above compares wall-time and peak memory usage for the shipped benchmark cases. Performance tracking covers:

  • Cyclone ITG (linear/nonlinear)
  • KBM and ETG configurations
  • W7-X and HSX stellarator geometries
  • Miller geometry models

The refreshed shipped panel includes the W7-X and HSX linear and nonlinear rows. Regenerate this public panel from the shipped refresh summary with:

python tools/benchmark_runtime_memory.py \
  --summary-glob tools_out/runtime_memory_summary_ship_refresh.json \
  --csv-out tools_out/runtime_memory_results_ship_refresh_regenerated.csv \
  --summary-out tools_out/runtime_memory_summary_ship_refresh_regenerated.json \
  --plot-out docs/_static/runtime_memory_benchmark.png

Experimental or not-yet-closed lanes such as KAW, TEM, and kinetic-electron Cyclone are tracked separately and do not appear in the shipped runtime panel. For the stellarator rows on office, the shipped panel uses pre-generated *.eik.nc geometry files rather than on-the-fly VMEC regeneration. The GX reference rows also run against a consistent local netcdf-c / hdf5 runtime stack there, because the default office stellarator environment mixed incompatible HDF5 / NetCDF libraries and lacked the VMEC Python helper dependencies needed for live geometry generation.

These shipped runtime rows are cold wall-time measurements, so the SPECTRAX-GK nonlinear GPU entries include JAX startup/compile cost. Targeted office GPU profiles on the same short nonlinear cases measured:

  • Cyclone nonlinear: warmup_time_s = 33.957, run_time_s = 15.054
  • KBM nonlinear: warmup_time_s = 27.485, run_time_s = 9.725

This means the current short-run Cyclone and KBM gaps are dominated much more by cold-start overhead than by steady-state timestep throughput. In steady state, Cyclone GPU is faster than the shipped GX runtime row, and KBM GPU is close to parity. The hollow diamond markers in the runtime subplot show those warm second-run timings on top of the cold wall-time bars.
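The cold-start share can be estimated directly from the two profiles above. The sketch assumes the cold wall time is approximately warmup_time_s plus run_time_s, which is an assumption about how the profiler fields relate to the shipped rows.

```python
# Profile values quoted in the text.
profiles = {
    "Cyclone nonlinear": {"warmup_time_s": 33.957, "run_time_s": 15.054},
    "KBM nonlinear": {"warmup_time_s": 27.485, "run_time_s": 9.725},
}

cold_share = {}
for case, t in profiles.items():
    total = t["warmup_time_s"] + t["run_time_s"]  # assumed cold wall time
    cold_share[case] = t["warmup_time_s"] / total
    print(f"{case}: warm-up is {cold_share[case]:.1%} of {total:.3f} s")
```

Under that assumption, warm-up accounts for roughly 69% of the Cyclone time and roughly 74% of the KBM time.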

Kernel profiling and gated fast modes

Nonlinear RHS kernel profile

The current profiler splits the nonlinear RHS into field solve, linear RHS, nonlinear bracket, and full RHS kernels on CPU and GPU. The latest bounded Cyclone profile shows that the compiled linear RHS, nonlinear bracket, and full RHS are the dominant warm-throughput targets, while GPU execution reduces the runtime of all measured RHS kernels. The companion JSON artifact records dominant kernels and grid-to-spectral speedups so the optimization lane remains traceable and machine-checkable.

The next profiler layer resolves the linear RHS into individual term kernels. The tracked Cyclone CPU artifact (docs/_static/linear_rhs_terms_profile.json) now includes the zero-collision fast path and linked-FFT refactor and reports full_linear_rhs=1.08e-1 s in the bounded CPU harness. The active-state companion (docs/_static/linear_rhs_terms_profile_z_wave_cpu.json) injects resolved parallel variation and reports full_linear_rhs=1.27e-1 s while showing linked-|k_z| hypercollisions becoming active; apart from the accepted zero-collision guard, zero-norm initial-state rows remain enabled until a state-window identity gate proves they remain inactive after nonlinear evolution. The matching office GPU artifact (docs/_static/linear_rhs_terms_profile_gpu.json) reports full_linear_rhs=5.50e-3 s on one RTX A4000, and the active-state GPU companion reports 5.48e-3 s while reproducing the linked-|k_z| hypercollision norm match.

The tracked state-window gate (docs/_static/linear_rhs_zero_norm_state_window_gate.json) now makes that policy executable: it accepts a zero-collision skip for the nu=0 Cyclone window but rejects skipping linked-|k_z| hypercollisions once a resolved parallel perturbation is present.

A larger Cyclone Miller companion profile is documented in docs/performance.rst and tracked as docs/_static/nonlinear_rhs_profile_miller.{png,json}. It uses Nx=192, Ny=64, Nz=24, Nl=4, Nm=8. After the grid-Laguerre einsum refactor, the matched one-GPU profile gives full_rhs=1.28e-2 s in grid mode and 1.48e-2 s in spectral mode. Spectral mode still reduces the GPU nonlinear bracket by about 1.63x, but the full-RHS timing is limited by the combined linear-RHS/bracket graph, so the next optimization target is linear-RHS fusion/cache layout before any broader nonlinear speedup claim. The matched W7-X/HSX runtime-mode stellarator profiler artifact (docs/_static/nonlinear_rhs_profile_stellarator_runtime.json) records W7-X and HSX GPU full-RHS calls near 2.7e-2 s versus CPU calls near 3.1e-1 s; those rows close the release-level performance evidence while keeping broader production nonlinear speedup claims scoped to future profiler-gated work.

The full fused linear-RHS trace artifact (docs/_static/full_linear_rhs_trace_summary.json) now records the Cyclone Miller graph-level profile after electrostatic field specialization: warm_seconds=8.09e-2, first compile+execute 1.40 s, and 2225 HLO lines. The matching pre-specialization local artifact had warm_seconds=1.19e-1 and 2425 HLO lines, so this is a bounded CPU graph-localization improvement, not a broad runtime claim. The active z_wave companion (docs/_static/full_linear_rhs_trace_z_wave_summary.json) uses the same specialized graph and reports warm_seconds=1.29e-1 after resolved parallel variation is injected; that timing is not promoted as a speedup until a matched GPU and nonlinear full-RHS profile is refreshed.
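The size of the graph-localization improvement follows from the quoted before/after values. This is arithmetic on the numbers in the paragraph above; the dictionary keys other than warm_seconds are labels chosen for this sketch.

```python
before = {"warm_seconds": 1.19e-1, "hlo_lines": 2425}  # pre-specialization artifact
after = {"warm_seconds": 8.09e-2, "hlo_lines": 2225}   # after field specialization

warm_speedup = before["warm_seconds"] / after["warm_seconds"]
hlo_reduction = 1.0 - after["hlo_lines"] / before["hlo_lines"]

print(f"warm speedup: {warm_speedup:.2f}x")
print(f"HLO lines removed: {hlo_reduction:.1%}")
```

That is a warm-time ratio of roughly 1.47x with about 8% fewer HLO lines, consistent with describing it as a bounded CPU improvement rather than a broad runtime claim.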

Spectral Laguerre mode gate

The optional spectral Laguerre nonlinear mode is gated, not a default. On the bounded local CPU and office GPU gates it preserves scalar nonlinear diagnostics across Cyclone, KBM, W7-X, and HSX. The refreshed CPU gate has maximum relative differences below 8.9e-4 and grid/spectral runtime ratios of 2.90, 3.31, 1.67, and 0.66 for Cyclone, KBM, W7-X, and HSX, respectively. The tracked GPU gate has maximum relative differences below 2.2e-5 and ratios of 1.66, 2.69, 1.63, and 0.74 for the same cases. A ratio below 1 means spectral mode takes longer than grid mode, so HSX is slower in spectral mode on both backends in these bounded gates. Users should therefore treat spectral Laguerre mode as an opt-in engineering mode and rerun python tools/gate_laguerre_nonlinear_modes.py for their production case before relying on it for performance claims.
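The opt-in decision can be expressed as a small check over the quoted gate numbers. The helper name and the accuracy tolerance of 1e-3 are hypothetical choices for this sketch; the ratios and difference bounds are copied from the paragraph above.

```python
# grid/spectral runtime ratios quoted in the text (ratio > 1: spectral is faster).
RATIOS = {
    "cpu": {"Cyclone": 2.90, "KBM": 3.31, "W7-X": 1.67, "HSX": 0.66},
    "gpu": {"Cyclone": 1.66, "KBM": 2.69, "W7-X": 1.63, "HSX": 0.74},
}
# Upper bounds on the maximum relative differences from the same gates.
MAX_REL_DIFF = {"cpu": 8.9e-4, "gpu": 2.2e-5}

def spectral_helps(backend, case, accuracy_tol=1e-3):
    """True when the quoted gate shows spectral mode both accurate and faster."""
    accurate = MAX_REL_DIFF[backend] <= accuracy_tol
    faster = RATIOS[backend][case] > 1.0
    return accurate and faster

# Cases where spectral mode does not help on any backend.
slower_everywhere = [
    case for case in RATIOS["cpu"]
    if not any(spectral_helps(backend, case) for backend in RATIOS)
]
print(slower_everywhere)
```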

Regenerate the runtime figure from collected per-case summaries with:

python tools/benchmark_runtime_memory.py \
  --summary-glob tools_out/runtime_memory_*linear.json \
  --summary-glob tools_out/runtime_memory_*nonlinear.json

# For a long office sweep, keep going after a failed row and save per-row logs.
python tools/benchmark_runtime_memory.py --continue-on-error --log-dir tools_out/runtime_memory_logs

Parallelization scaling figures are kept in the performance docs rather than the top-level README. The shipped public claim is the independent-work path for k_y scans, quasilinear studies, sensitivity sweeps, and UQ ensembles; whole-state nonlinear sharding remains an identity/profiler artifact until a communication-aware nonlinear decomposition has matched CPU/GPU identity and throughput evidence.

Examples

The examples/ directory is organized by physics and configuration:

  • linear/: Linear microinstability drivers for axisymmetric (tokamak) and non-axisymmetric (stellarator) geometries.
  • nonlinear/: Nonlinear turbulence simulations and transport analysis.
  • benchmarks/: Scripts for replicating published benchmark results and parameter scans.
  • theory_and_demos/: Pedagogical examples and demonstrations of the underlying numerical methods.

Release-gated nonlinear example lanes include:

  • Cyclone ITG
  • Cyclone Miller
  • KBM
  • W7-X
  • HSX

A full-GK ETG nonlinear pilot lane is also available at examples/nonlinear/axisymmetric/runtime_etg_nonlinear.toml, but it remains a pilot until its benchmark operating point, observable contract, and gate-indexed artifact are promoted.

The reduced cETG example remains available as a separate reduced-model workflow; it is not the same thing as the full-GK ETG nonlinear lane.

Documentation

Comprehensive documentation, including theory, algorithms, and API references, is available in docs/.

Testing

Default pytest runs skip integration tests for faster feedback. Use:

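# Default run; integration tests are skipped.
pytest
# Run only the tests marked as integration.
pytest -m integration
# Per-file pytest shards with bounded timeouts.
python tools/run_tests_fast.py
# Wide coverage gate: 48 shards, 300 s timeout, fail below 95% coverage.
python tools/run_wide_coverage_gate.py --shards 48 --timeout 300 --fail-under 95 --pytest-arg=-o --pytest-arg=addopts= --pytest-arg=-m --pytest-arg="not slow"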

tools/run_tests_fast.py runs per-file pytest shards with a 300 s per-file timeout and a 300 s total local budget by default. Use --total-timeout 0 only when you explicitly want the full sequential local pass.

For laptops or shared workstations, run the same wide gate one bounded shard at a time with --only-shard N --keep-existing-coverage --skip-combine, then finish with --combine-only --fail-under 95. CI adds --require-shard-data --shard-manifest coverage-wide-shard-manifest.json so the final coverage badge cannot be refreshed from an incomplete shard upload. This keeps every local pytest process under the release timeout instead of launching one long run.
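The shard-at-a-time workflow can be scripted by generating one command per shard followed by the combine step. This sketch only prints the commands. It assumes shard indices are zero-based and that the per-shard runs reuse the --shards, --timeout, and --pytest-arg flags from the wide-gate command; neither assumption is confirmed by the text, so check the tool's help output before running.

```python
import shlex

SCRIPT = ["python", "tools/run_wide_coverage_gate.py"]
# Flags copied from the wide-gate command; reusing them per shard is an assumption.
COMMON = ["--shards", "48", "--timeout", "300",
          "--pytest-arg=-o", "--pytest-arg=addopts=",
          "--pytest-arg=-m", "--pytest-arg=not slow"]

def shard_commands(n_shards=48, first_shard=0):
    """One bounded command per shard; first_shard=0 assumes zero-based numbering."""
    for n in range(first_shard, first_shard + n_shards):
        yield SCRIPT + COMMON + ["--only-shard", str(n),
                                 "--keep-existing-coverage", "--skip-combine"]

def combine_command():
    """Final step that merges shard coverage and applies the 95% threshold."""
    return SCRIPT + ["--combine-only", "--fail-under", "95"]

commands = list(shard_commands()) + [combine_command()]

# Print a preview; pass each list to subprocess.run(cmd, check=True) to execute.
for cmd in commands[:2] + commands[-1:]:
    print(shlex.join(cmd))
```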

Plotting outputs

To visualize nonlinear diagnostics from a *.out.nc file:

python examples/utilities/plot_runtime_outputs.py tools_out/cyclone_nonlinear.out.nc \
  --out tools_out/cyclone_nonlinear_diagnostics.png

Contributing

SPECTRAX-GK is an open-source project welcoming contributions. Whether it's improving runtimes, reducing memory usage, or expanding the physics models, your help is appreciated.

License

MIT License.
