Advanced neural network optimizers: Axiom (73% memory savings) + Verity (bit-perfect cross-hardware reproducibility)

QuarterBit AXIOM

High-Performance Quantized Optimizer for PyTorch

Drop-in replacement for AdamW with 73% memory savings.


Installation

pip install quarterbit

Requirements:

  • Python 3.8+
  • PyTorch 1.8+
  • NVIDIA GPU with CUDA

Supported GPUs:

  • Consumer: GTX 1650+, RTX 20/30/40 series
  • Data Center: T4, V100, A10, A100, L4, L40, H100, H200

Quick Start

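from quarterbit import Axiom

# model and dataloader are your existing PyTorch module and data iterator
optimizer = Axiom(model.parameters())

for batch in dataloader:
    loss = model(batch)        # assumes the forward pass returns the loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()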

Optimizer Parameters

AXIOM comes with tuned defaults. Advanced users can adjust:

from quarterbit import Axiom

optimizer = Axiom(
    params,                    # Model parameters
    lr=5e-4,                   # Learning rate
    betas=(0.9, 0.999),        # Momentum coefficients
    eps=1e-8,                  # Numerical stability
    weight_decay=0.1,          # Decoupled weight decay
    total_steps=10000,         # Training steps (for scheduling)
    warmup_ratio=0.1,          # Warmup fraction
)
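
The README does not say how total_steps and warmup_ratio are combined internally. As a rough illustration of what the two values usually mean, the sketch below assumes a linear warmup over the first total_steps * warmup_ratio steps followed by cosine decay. The function name sketch_lr and the decay shape are assumptions for illustration and are not part of the quarterbit API.

```python
import math


def sketch_lr(step, base_lr=5e-4, total_steps=10000, warmup_ratio=0.1):
    """Hypothetical schedule: linear warmup, then cosine decay to zero.

    Not taken from QuarterBit. It only shows how total_steps and
    warmup_ratio are commonly combined into a warmup length.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase that has elapsed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

With the defaults shown above, this assumption gives 1,000 warmup steps out of 10,000.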

Per-Layer Learning Rates

optimizer = Axiom([
    {'params': model.backbone.parameters(), 'lr': 1e-4},
    {'params': model.head.parameters(), 'lr': 5e-4},
], weight_decay=0.1)

Activation Checkpointing (Pro)

Compresses activations into fixed slots during the forward pass and restores them during the backward pass. Requires the Pro tier.

from quarterbit.torch.utils import ActivationCheckpoint

# Create checkpoint storage
actcp = ActivationCheckpoint(
    max_slots=24,              # Number of layers/checkpoints
    max_elements=2**20,        # Max elements per activation
)

# During forward pass - store activations
actcp.store(activation, slot=layer_idx)

# During backward pass - restore activations
restored = actcp.restore(slot=layer_idx)

# Check memory savings
info = actcp.memory_info()
print(f"Memory saved: {info['savings_pct']:.1f}%")

# Clear when done
actcp.clear()

Parameters:

Parameter     Description
max_slots     Number of checkpoint slots (typically num_layers)
max_elements  Maximum number of tensor elements per slot

Methods:

Method               Description
store(tensor, slot)  Compress and store activation
restore(slot)        Decompress and return activation
clear(slot=None)     Clear one slot or all
memory_info()        Get memory usage statistics
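
The README does not describe the compression scheme behind ActivationCheckpoint. To make the slot-based idea concrete, here is a minimal pure-Python sketch that assumes 8-bit linear quantization per slot. The class SlotStoreSketch is hypothetical and is not QuarterBit's implementation. It works on plain lists of floats rather than tensors.

```python
class SlotStoreSketch:
    """Toy slot-based activation store using 8-bit linear quantization.

    Hypothetical illustration only; the real compression format used by
    QuarterBit is not documented in this README.
    """

    def __init__(self, max_slots, max_elements):
        self.max_slots = max_slots
        self.max_elements = max_elements
        self.slots = [None] * max_slots

    def store(self, values, slot):
        if not 0 <= slot < self.max_slots:
            raise IndexError("slot out of range")
        if len(values) > self.max_elements:
            raise ValueError("activation exceeds max_elements")
        lo, hi = min(values), max(values)
        # Fall back to 1.0 so constant inputs do not divide by zero.
        scale = (hi - lo) / 255 or 1.0
        # One byte per element: the index of the nearest quantization level.
        codes = bytes(round((v - lo) / scale) for v in values)
        self.slots[slot] = (codes, lo, scale)

    def restore(self, slot):
        if self.slots[slot] is None:
            raise KeyError(f"slot {slot} is empty")
        codes, lo, scale = self.slots[slot]
        return [lo + c * scale for c in codes]

    def clear(self, slot=None):
        if slot is None:
            self.slots = [None] * self.max_slots
        else:
            self.slots[slot] = None
```

In this toy, each stored element takes one byte plus a per-slot offset and scale, and the restored values differ from the originals by at most half a quantization step.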

Gradient Compression (Pro)

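Compresses gradients before communication in distributed training. Error feedback carries each step's compression error into the next step, so the error does not accumulate as drift.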

from quarterbit.torch.utils import GradientCompressor

# Create compressor for each parameter
compressor = GradientCompressor(num_elements=param.numel())

# Compress before all-reduce
compressed = compressor.compress(param.grad)

# ... communicate compressed gradients ...

# Decompress after communication
param.grad = compressor.decompress(compressed)

# Reset error feedback (optional, between epochs)
compressor.reset()

Parameters:

Parameter     Description
num_elements  Number of gradient elements

Methods:

Method                  Description
compress(grads)         Compress with error feedback
decompress(compressed)  Decompress to FP32
reset()                 Clear error feedback accumulator
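
Error feedback in general can be sketched in a few lines. The example below assumes a simple sign-plus-scale compressor, which is a stand-in chosen for illustration; the README does not say which compression format GradientCompressor uses. The class ErrorFeedbackSketch is hypothetical and operates on plain lists of floats.

```python
class ErrorFeedbackSketch:
    """Toy sign-plus-scale gradient compressor with error feedback.

    Hypothetical illustration; not QuarterBit's implementation.
    """

    def __init__(self, num_elements):
        # Compression error carried over from the previous step.
        self.residual = [0.0] * num_elements

    def compress(self, grads):
        # Add back what earlier steps failed to transmit.
        corrected = [g + r for g, r in zip(grads, self.residual)]
        scale = sum(abs(c) for c in corrected) / len(corrected)
        signs = [1 if c >= 0 else -1 for c in corrected]
        # Remember what this compressed form fails to represent.
        self.residual = [c - s * scale for c, s in zip(corrected, signs)]
        return signs, scale

    def decompress(self, compressed):
        signs, scale = compressed
        return [s * scale for s in signs]

    def reset(self):
        self.residual = [0.0] * len(self.residual)
```

Because each step's error is folded into the next step's input, the sum of all decompressed gradients plus the current residual equals the sum of the true gradients, up to floating-point rounding. That identity is what keeps the compressed updates from drifting.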

Multi-GPU Training

DataParallel

model = torch.nn.DataParallel(model)
optimizer = Axiom(model.parameters())

DistributedDataParallel

from torch.nn.parallel import DistributedDataParallel as DDP

model = DDP(model, device_ids=[local_rank])
optimizer = Axiom(model.parameters())

DeepSpeed

import deepspeed

model, optimizer, _, _ = deepspeed.initialize(
    model=model,
    optimizer=Axiom(model.parameters()),
    config=ds_config
)

Checkpointing

# Save
torch.save({
    'model': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    'step': step,
}, 'checkpoint.pt')

# Load
ckpt = torch.load('checkpoint.pt')
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])

Licensing

quarterbit activate <LICENSE_KEY>

Tier        GPU Hours    Features
Free        10/month     Optimizer only
Pro         1,000/month  + Checkpointing, Compression
Enterprise  Unlimited    + On-premise, Custom SLA

Get your key: quarterbit.dev/pricing


Copyright (c) 2026 Clouthier Simulation Labs
