JAX/Flax/Optax optimizer manager

OptTx

Research Code: Co-developed with Claude Code, Gemini CLI, Codex CLI, and Cursor. No guarantees provided. Use at your own risk.

JAX/Flax/Optax optimizer library for PINNs and second-order methods.

Features

Multi-term objectives: Objective with TermSpec for PINNs (PDE, BC, IC terms)
First-order optimizers: Adam, SGD, AdamW, SOAP, MUON, Shampoo, L-BFGS
Second-order optimizers: CGOptimizer (Fisher/GGN), CROptimizer (Hessian)
Acceleration methods: TGS, NLTGCR, Anderson Acceleration (AA)
Graph neural networks: GCN, GAT layers for node classification
Matrix-free curvature: build_hessian_matvec, build_fisher_matvec
JIT-stable: Works with jax.jit and jax.lax.scan

Install

pip install opttx

For development:

pip install -e .[dev]

Quickstart

First-order optimizer

import jax
import jax.numpy as jnp
from flax import linen as nn

from opttx import Adam, Objective, TermSpec, TrainState

# Define model
class MLP(nn.Module):
    @nn.compact
    def __call__(self, x):
        x = nn.Dense(32)(x)
        x = nn.relu(x)
        x = nn.Dense(1)(x)
        return x

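# Define loss; `batch` is the (inputs, targets) tuple stored under the term's batch_key
def mse_loss(pred, batch):
    _, y = batch  # inputs are unused here; only the targets are needed
    return jnp.mean((pred - y) ** 2)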

# Create objective
term = TermSpec(name="mse", batch_key="data", loss_fn=mse_loss)
objective = Objective(terms=[term])

# Initialize
model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 3)))["params"]

state = TrainState(
    step=jnp.array(0),
    params=params,
    opt_state=None,
    apply_fn=lambda v, b: model.apply({"params": v["params"]}, b[0]),
)

# Create optimizer and train
optimizer = Adam(objective, learning_rate=1e-3)
state = optimizer.init(state)

batch = {"data": (jnp.ones((8, 3)), jnp.zeros((8, 1)))}
state, metrics = optimizer.step(state, batch)
print(f"Loss: {metrics['loss']}")

Second-order optimizer (CR + Hessian)

from opttx import CROptimizer

optimizer = CROptimizer(
    objective,
    learning_rate=1.0,
    damping=1e-3,
    cr_iters=10,
    curvature_type="hessian",  # or "fisher"
)
state = optimizer.init(state)
state, metrics = optimizer.step(state, batch)

Multi-term objective (PINNs)

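# Placeholder terms: a real PINN would compute the PDE residual and the
# boundary mismatch here instead of penalising the raw prediction.
def pde_loss(pred, batch):
    return jnp.mean(pred ** 2)

def bc_loss(pred, batch):
    return jnp.mean(pred ** 2)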

pde_term = TermSpec(name="pde", batch_key="x_pde", loss_fn=pde_loss)
bc_term = TermSpec(name="bc", batch_key="x_bc", loss_fn=bc_loss)

objective = Objective(
    terms=[pde_term, bc_term],
    loss_weights={"pde": 1.0, "bc": 0.1},
)

batch = {
    "x_pde": jnp.ones((100, 2)),
    "x_bc": jnp.ones((20, 2)),
}
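
As a rough illustration of what a weighted multi-term objective computes, the sketch below forms a weighted sum of per-term scalar losses and records them under the loss/<term> and weight/<term> keys this README mentions for metrics. It does not use OptTx: `combine_terms` is a hypothetical helper, and the default weight of 1.0 for unlisted terms is an assumption, not documented behaviour.

```python
import jax.numpy as jnp

def combine_terms(term_losses, loss_weights):
    """Weighted sum of per-term scalar losses (hypothetical helper, not the OptTx API)."""
    total = jnp.zeros(())
    metrics = {}
    for name, value in term_losses.items():
        weight = loss_weights.get(name, 1.0)  # assumption: unlisted terms get weight 1.0
        total = total + weight * value
        metrics[f"loss/{name}"] = value                   # raw, unweighted term
        metrics[f"weight/{name}"] = jnp.asarray(weight)   # effective weight
    metrics["loss"] = total
    return total, metrics

# Two already-evaluated term losses, weighted as in the objective above.
term_losses = {"pde": jnp.asarray(2.0), "bc": jnp.asarray(5.0)}
total, metrics = combine_terms(term_losses, {"pde": 1.0, "bc": 0.1})
print(float(total))  # 1.0 * 2.0 + 0.1 * 5.0
```

Keeping raw terms and weights under separate keys is what makes it possible to plot them independently of the weighted total.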

Dynamic hyperparameters (JIT-friendly)

Learning rate, damping, weight decay and CG/CR tolerance can change during a jax.jit-compiled run without recompilation. Two mechanisms share one resolution rule: override > schedule > plain float.
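
The resolution rule can be pictured with a small stand-alone function. This is only a sketch of the stated priority order (override, then schedule, then plain float); `resolve_hparam` and `linear_decay` are hypothetical names, and OptTx's internal implementation may differ.

```python
import jax.numpy as jnp

def resolve_hparam(name, base, step, overrides=None):
    """Resolve one hyperparameter: override > schedule > plain float.

    Hypothetical helper illustrating the rule; not the OptTx implementation.
    """
    if overrides is not None and name in overrides:
        return jnp.asarray(overrides[name], dtype=jnp.float32)
    if callable(base):
        return jnp.asarray(base(step), dtype=jnp.float32)
    return jnp.asarray(base, dtype=jnp.float32)

step = jnp.array(100)
linear_decay = lambda s: 1e-3 * (1.0 - s / 1000.0)  # toy schedule for illustration

plain = resolve_hparam("learning_rate", 1e-3, step)
scheduled = resolve_hparam("learning_rate", linear_decay, step)
overridden = resolve_hparam("learning_rate", linear_decay, step, {"learning_rate": 1e-2})
```

The membership test on `overrides` happens in Python, so under jax.jit only the presence of a key is static; the value itself can stay a traced input.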

Schedules — pass a Callable(step) -> scalar (any Optax schedule works, or the built-in warmup_schedule):

import optax
from opttx import Adam, warmup_schedule

opt = Adam(objective, learning_rate=optax.cosine_decay_schedule(1e-3, decay_steps=10_000))
opt = Adam(objective, learning_rate=warmup_schedule(1e-3, warmup_steps=500))

Runtime overrides — pass a flat dict as the third argument to step; the values are traced as jit inputs, so a sweep or a plateau controller runs on a single compilation:

jit_step = jax.jit(opt.step)
for lr in [1e-2, 1e-3, 1e-4]:          # no recompilation across values
    state, metrics = jit_step(state, batch, {"learning_rate": lr})
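
The no-recompilation claim rests on general jax.jit behaviour: values inside a dict argument are traced, so changing them does not change the compiled function's signature. The stand-alone sketch below checks this with a toy SGD update rather than an OptTx optimizer; `sgd_step` and `trace_log` are hypothetical names.

```python
import jax
import jax.numpy as jnp

trace_log = []  # grows only while JAX traces the function

def sgd_step(params, grads, hparams):
    trace_log.append("traced")  # Python side effect: runs at trace time, not on cached calls
    return params - hparams["learning_rate"] * grads

jit_step = jax.jit(sgd_step)

params = jnp.ones(3)
grads = jnp.full(3, 0.5)
for lr in [1e-2, 1e-3, 1e-4]:
    params = jit_step(params, grads, {"learning_rate": lr})

print(len(trace_log))  # number of times the function was traced
```

Changing the set of keys in the dict, or the dtype or shape of a value, would alter the input structure and trigger a new trace.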

Second-order optimizers additionally accept damping (and cg_tol / cr_tol):

from opttx import CGOptimizer

opt = CGOptimizer(objective, learning_rate=1.0, damping=1e-3, curvature_type="fisher")
jit_step = jax.jit(opt.step)  # re-jit: jit_step above is bound to the Adam step
state, metrics = jit_step(state, batch, {"damping": 1e-2, "cg_tol": 1e-6})

Each optimizer exposes its runtime-adjustable knobs via DYNAMIC_HPARAMS. Structural knobs (cg_iters, memory_size, ns_steps, max_precond_dim, curvature_type, ...) stay static and are rejected immediately if passed as an override or a schedule. OptaxOptimizer supports overrides when its transform is built with optax.inject_hyperparams; LBFGSOptimizer exposes none, because its step size is controlled by the line search.

Effective-value logging — metrics carries hparams/learning_rate, hparams/damping, etc., and the objective logs raw per-term losses (loss/<term>) alongside effective per-term weights (weight/<term>), so raw terms, their weighting, and the optimizer knobs can be plotted separately.

Step-reset hazards (staged optimization) — a schedule keyed on state.step stays continuous when you hand state from one optimizer to another, because the global step keeps advancing. Two optimizer-internal clocks do not follow state.step, though: calling optimizer.init(state) resets the wrapped optax count for LBFGSOptimizer (its L-BFGS curvature memory restarts) and any OptaxOptimizer transform built with a native optax schedule (that schedule advances on optax's own count, not on state.step). Prefer OptTx's Callable(step) schedules or runtime hparams when you need a knob tied to the global step across a staged hand-off.

See examples/dynamic_lr.py for a full walkthrough including a cosine schedule, a no-recompile LR sweep, and staged optimization.

API Reference

Optimizers

Optimizer	Description
`Adam`	Adam optimizer
`SGD`	SGD with momentum
`AdamW`	Adam with weight decay
`SOAP`	Shampoo with Adam in the preconditioner's eigenbasis
`MUON`	Momentum with orthogonalization
`Shampoo`	Shampoo preconditioner
`LBFGSOptimizer`	L-BFGS quasi-Newton
`CGOptimizer`	Conjugate Gradient (Fisher/GGN)
`CROptimizer`	Conjugate Residual (Hessian)
`TGSOptimizer`	TGS acceleration
`TGSAccelerator`	TGS wrapper for any optimizer
`AAAccelerator`	Anderson Acceleration wrapper
`NLTGCROptimizer`	Nonlinear truncated GCR
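
For readers unfamiliar with Anderson Acceleration, here is a minimal sketch of the classical windowed variant applied to a fixed-point map x = g(x). It is not AAAccelerator and not the truncated Gram-Schmidt variant from the cited paper; `anderson_fixed_point` is a hypothetical helper, written with NumPy so the least-squares step runs in float64.

```python
import numpy as np

def anderson_fixed_point(g, x0, memory=3, max_iters=50, tol=1e-10):
    """Classical Anderson Acceleration for x = g(x); returns (x, iterations used)."""
    x = np.asarray(x0, dtype=np.float64)
    g_hist, f_hist = [], []  # past g(x) values and residuals f = g(x) - x
    for k in range(max_iters):
        gx = g(x)
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist.append(gx)
        f_hist.append(f)
        g_hist, f_hist = g_hist[-(memory + 1):], f_hist[-(memory + 1):]
        if len(f_hist) == 1:
            x = gx  # first step is a plain fixed-point update
        else:
            # Differences of consecutive residuals and map values, one column each.
            dF = np.stack([f_hist[i + 1] - f_hist[i] for i in range(len(f_hist) - 1)], axis=1)
            dG = np.stack([g_hist[i + 1] - g_hist[i] for i in range(len(g_hist) - 1)], axis=1)
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)  # min ||f - dF @ gamma||
            x = gx - dG @ gamma
    return x, max_iters

# Linear contraction g(x) = M x + b, used purely as a test problem.
M = np.array([[0.5, 0.2], [0.1, 0.4]])
b = np.array([1.0, 2.0])
x_star, num_iters = anderson_fixed_point(lambda x: M @ x + b, np.zeros(2))
```

On this problem the plain iteration x <- g(x) contracts by a factor of about 0.6 per step, so the accelerated run needs far fewer evaluations of g to reach the same tolerance.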

Curvature

Function	Description
`build_hessian_matvec`	Matrix-free Hessian-vector product
`build_fisher_matvec`	Matrix-free Fisher/GGN-vector product
`build_damped_matvec`	Add damping: (H + λI)v
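
A matrix-free Hessian-vector product can be written in a few lines of plain JAX. The sketch below shows the standard forward-over-reverse construction and a damped wrapper; `make_hessian_matvec` and `make_damped_matvec` are hypothetical names whose signatures are assumptions, not those of build_hessian_matvec or build_damped_matvec. For simplicity it handles flat arrays only.

```python
import jax
import jax.numpy as jnp

def make_hessian_matvec(loss_fn, params):
    """Return v -> H v without forming H (hypothetical helper, flat arrays only)."""
    def matvec(v):
        # Differentiating the gradient along v yields the Hessian-vector product.
        return jax.jvp(jax.grad(loss_fn), (params,), (v,))[1]
    return matvec

def make_damped_matvec(matvec, damping):
    """Return v -> (H + damping * I) v."""
    return lambda v: matvec(v) + damping * v

# Toy loss whose Hessian at w is A + diag(6 * w).
A = jnp.array([[2.0, 1.0], [1.0, 3.0]])
loss = lambda w: 0.5 * w @ A @ w + jnp.sum(w ** 3)

w = jnp.array([1.0, -1.0])
v = jnp.array([1.0, 0.0])
hessian_matvec = make_hessian_matvec(loss, w)
hv = hessian_matvec(v)
damped_hv = make_damped_matvec(hessian_matvec, 1e-3)(v)
```

Each product costs roughly a small constant multiple of one gradient evaluation, which is why iterative solvers can use curvature without ever storing the full matrix.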

Solvers

Function	Description
`cg_solve`	Conjugate Gradient solver
`cr_solve`	Conjugate Residual solver
`tgs_solve_fori`	TGS solver (JIT-compatible)
`nltgcr_solve_fori`	NLTGCR solver (JIT-compatible)
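
To give a sense of what a JIT-compatible, matrix-free solver looks like, here is a minimal Conjugate Gradient sketch with a fixed iteration count inside jax.lax.fori_loop. It is not cg_solve: `cg_fixed_iters` is a hypothetical helper, it has no tolerance-based stopping, and the small constants in the denominators are an assumed safeguard against division by zero after convergence.

```python
import jax
import jax.numpy as jnp

def cg_fixed_iters(matvec, b, num_iters):
    """Approximately solve A x = b for symmetric positive-definite A, given v -> A v."""
    def body(_, carry):
        x, r, p, rs_old = carry
        Ap = matvec(p)
        alpha = rs_old / (p @ Ap + 1e-30)   # step length along the search direction
        x = x + alpha * p
        r = r - alpha * Ap                  # updated residual b - A x
        rs_new = r @ r
        p = r + (rs_new / (rs_old + 1e-30)) * p  # next conjugate direction
        return x, r, p, rs_new

    x0 = jnp.zeros_like(b)
    carry = (x0, b, b, b @ b)               # residual of x0 = 0 is b itself
    x, _, _, _ = jax.lax.fori_loop(0, num_iters, body, carry)
    return x

A = jnp.array([[4.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 2.0]])
b = jnp.array([1.0, 2.0, 3.0])

solve = jax.jit(lambda rhs: cg_fixed_iters(lambda v: A @ v, rhs, 5))
x = solve(b)
```

A fixed trip count keeps the loop shape static, which is convenient under jit and scan, at the cost of running a few redundant iterations once the residual is already small.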

Models

Model	Description
`GCN`	Graph Convolutional Network
`GCNLayer`	Single GCN layer
`GAT`	Graph Attention Network
`GATLayer`	Single GAT layer
`normalize_adjacency`	Symmetric adjacency normalization
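
Symmetric adjacency normalization usually means scaling the adjacency matrix by the inverse square root of the node degrees on both sides. The sketch below follows the common GCN convention of adding self-loops first; whether normalize_adjacency does the same is an assumption, and `normalize_adjacency_sketch` is a hypothetical stand-in for dense matrices.

```python
import jax.numpy as jnp

def normalize_adjacency_sketch(adj, add_self_loops=True):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    if add_self_loops:
        adj = adj + jnp.eye(adj.shape[0], dtype=adj.dtype)
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scale instead of a division by zero.
    safe_deg = jnp.where(deg > 0, deg, 1.0)
    inv_sqrt_deg = jnp.where(deg > 0, 1.0 / jnp.sqrt(safe_deg), 0.0)
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]

# Path graph on three nodes: 0 - 1 - 2.
adj = jnp.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 1.0],
                 [0.0, 1.0, 0.0]])
norm_adj = normalize_adjacency_sketch(adj)
```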

Design Constraints

state.step must be a scalar jax.Array (never a Python int)
Metrics must have static string keys and scalar values
Metrics must include a "loss" key
Combining a multi-term objective with batch_stats is not supported
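
These constraints match what jax.lax.scan needs from a training step: a carry with fixed structure and dtypes, and per-step outputs with a fixed set of keys. The toy loop below is a plain-JAX sketch, not OptTx code; `train_step` and the 0.1 step size are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def loss_fn(w, batch):
    x, y = batch
    return jnp.mean((x @ w - y) ** 2)

def train_step(carry, batch):
    step, w = carry
    loss, grad = jax.value_and_grad(loss_fn)(w, batch)
    w = w - 0.1 * grad
    metrics = {"loss": loss}  # static string key, scalar value
    return (step + 1, w), metrics

x = jnp.ones((4, 8, 2))   # 4 batches of 8 two-feature examples
y = jnp.zeros((4, 8))
init = (jnp.array(0), jnp.ones(2))  # step is a scalar jax.Array, not a Python int

(final_step, w), metrics = jax.lax.scan(train_step, init, (x, y))
```

scan stacks each metric along a leading axis, so metrics["loss"] holds one scalar per step.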

Citation

If you use OptTx in your research, please cite the following papers:

Anderson Acceleration with Truncated Gram-Schmidt (SIMAX 2024)

@article{tang2024anderson,
  title={Anderson Acceleration with Truncated Gram-Schmidt},
  author={Tang, Ziyuan and Xu, Tianshi and He, Huan and Saad, Yousef and Xi, Yuanzhe},
  journal={SIAM Journal on Matrix Analysis and Applications},
  volume={45},
  number={4},
  pages={1850--1872},
  year={2024},
  doi={10.1137/24M1648600}
}

Designing Preconditioners for SGD (arXiv 2025)

@misc{scott2025designing,
  title={Designing Preconditioners for SGD: Local Conditioning, Noise Floors, and Basin Stability},
  author={Scott, Mitchell and Xu, Tianshi and Tang, Ziyuan and Pichette-Emmons, Alexandra and Ye, Qiang and Saad, Yousef and Xi, Yuanzhe},
  year={2025},
  eprint={2511.19716},
  archivePrefix={arXiv}
}

License

MIT

Release history

0.1.0a2 (pre-release, this version): Jul 4, 2026
0.1.0a1 (pre-release): Dec 29, 2025
