Gefen optimizer for memory-efficient PyTorch training

Gefen: Optimized Stochastic Optimizer

Gefen is a drop-in replacement for the AdamW optimizer, designed for memory-efficient training. It keeps the familiar AdamW training recipe while cutting optimizer-state memory by 8x, which saves about 6.5 GiB per billion parameters, and it maintains AdamW-level performance. The smaller footprint lets you train larger models or use larger batch sizes, which in turn can raise training throughput. Switching takes two lines of code: import Gefen and replace the AdamW optimizer constructor.
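
The 6.5 GiB figure can be sanity-checked with a short back-of-envelope calculation. The sketch below assumes the AdamW baseline keeps two fp32 state tensors per parameter (first and second moments). That baseline is an assumption made for illustration, not something stated above, and the helper names are hypothetical.

```python
# Back-of-envelope estimate of optimizer-state memory.
# Assumption (illustrative): AdamW stores two fp32 tensors per parameter.
BYTES_PER_FP32 = 4
ADAMW_STATE_TENSORS = 2  # first moment and second moment
GIB = 2**30


def adamw_state_gib(num_params: int) -> float:
    """Optimizer-state size in GiB under the two-fp32-tensors assumption."""
    return num_params * ADAMW_STATE_TENSORS * BYTES_PER_FP32 / GIB


def saved_gib(num_params: int, reduction_factor: float = 8.0) -> float:
    """Memory saved when the state shrinks by `reduction_factor`."""
    baseline = adamw_state_gib(num_params)
    return baseline - baseline / reduction_factor


if __name__ == "__main__":
    n = 1_000_000_000
    print(f"AdamW state: {adamw_state_gib(n):.2f} GiB")  # 7.45 GiB
    print(f"Saved at 8x: {saved_gib(n):.2f} GiB")        # 6.52 GiB
```

Under this assumption, one billion parameters carry about 7.45 GiB of AdamW state, and an 8x reduction frees roughly 6.52 GiB, which is consistent with the figure quoted above.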

Installation

Install from source:

git clone https://github.com/ndvbd/Gefen
cd Gefen
pip install -e .

On the first CUDA run, Gefen builds its fused CUDA kernels with PyTorch JIT and nvcc, which can take a few minutes. Later runs reuse the cached build as long as the Python, PyTorch, and CUDA versions and the Gefen source checkout are unchanged.

This keeps the source install lightweight, but it requires a CUDA toolkit and host compiler compatible with your PyTorch installation. In the future, we plan to make this smoother with prebuilt wheels for common PyTorch/CUDA combinations.
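
Before the first CUDA run, it may help to confirm that the build tools are visible on your PATH. The following is a minimal sketch and not part of Gefen: the function name `check_build_prereqs` is hypothetical, and the list of host compiler names it probes is an assumption that may not match your platform.

```python
import shutil


def check_build_prereqs() -> dict:
    """Report which JIT build prerequisites are visible from this environment."""
    # Assumed compiler names; adjust for your platform.
    compilers = ("g++", "clang++", "cl")
    report = {
        "nvcc": shutil.which("nvcc"),
        "host_compiler": next(
            (path for path in map(shutil.which, compilers) if path), None
        ),
        "torch_cuda": None,
    }
    try:
        import torch

        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["torch_cuda"] = torch.version.cuda
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    for name, value in check_build_prereqs().items():
        print(f"{name}: {value if value else 'not found'}")
```

If `nvcc` is reported as not found, or its version differs from the CUDA version your PyTorch build reports, the JIT build is likely to fail. This check only covers PATH visibility and does not verify full compatibility.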

Quick Start

import torch
from gefen import Gefen

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(128, 10).to(device)

# optimizer = torch.optim.AdamW(
optimizer = Gefen(  # Replace AdamW with Gefen:
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.0,
)

inputs = torch.randn(32, 128, device=device)
targets = torch.randint(0, 10, (32,), device=device)

logits = model(inputs)
loss = torch.nn.functional.cross_entropy(logits, targets)
loss.backward()

optimizer.step()
optimizer.zero_grad(set_to_none=True)

print('Finished successfully.')
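
To see the optimizer-state footprint on your own model, one option is to sum the tensors held in `optimizer.state` after the first step. The helper below is a sketch with a hypothetical name, shown here on `torch.optim.AdamW` as the baseline. It assumes the optimizer keeps its state as tensors directly in `optimizer.state`. Whether Gefen stores all of its state there is not confirmed by this page, so treat a measurement on Gefen as an estimate.

```python
import torch


def optimizer_state_bytes(optimizer: torch.optim.Optimizer) -> int:
    """Sum the sizes of all tensors stored directly in optimizer.state."""
    total = 0
    for per_param_state in optimizer.state.values():
        for value in per_param_state.values():
            if torch.is_tensor(value):
                total += value.numel() * value.element_size()
    return total


model = torch.nn.Linear(128, 10)
baseline = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Optimizer state is allocated lazily, so run one step before measuring.
model(torch.randn(4, 128)).sum().backward()
baseline.step()

print(f"AdamW state: {optimizer_state_bytes(baseline)} bytes")
```

Running the same helper on a Gefen optimizer built over the same model gives a rough side-by-side comparison, subject to the assumption above.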

Citation

@article{benedek2026gefen,
  title={Gefen: Optimized Stochastic Optimizer},
  author={Benedek, Nadav and Koren, Tomer and Fried, Ohad},
  journal={arXiv preprint arXiv:2606.13894},
  year={2026}
}
