PyTorch training utilities: cosine LR with warmup, VRAM profiling, automatic batch size and gradient accumulation planning

trainkit-vp

PyTorch training utilities: LR scheduling, VRAM profiling, and automatic batch planning.

Part of the MorphFormer project by Voluntas Progressus.

Installation

pip install trainkit-vp

Requires Python >= 3.14 and PyTorch >= 2.0.

Features

  • Cosine LR scheduler with linear warmup (create_cosine_schedule)
  • Memory profiling: detects system RAM and GPU VRAM across the CUDA, ROCm, XPU, and MPS backends
  • Automatic training plan: selects batch size, gradient accumulation, gradient checkpointing, and AMP based on available VRAM
  • VRAM estimation heuristics for Transformer models (parameters + activations + optimizer state)
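
The "parameters + activations + optimizer state" heuristic can be illustrated with a minimal sketch. This is not trainkit's implementation: the function name `estimate_training_bytes`, its parameters, and the `activations_per_layer` constant are all assumptions chosen for illustration, and real activation memory depends heavily on the architecture and attention implementation.

```python
def estimate_training_bytes(
    n_params: int,
    batch_size: int,
    seq_len: int,
    d_model: int,
    n_layers: int,
    bytes_per_value: int = 4,             # fp32; use 2 to model fp16/bf16 storage
    optimizer_states_per_param: int = 2,  # Adam-style optimizers keep two moments per parameter
    activations_per_layer: int = 16,      # ASSUMPTION: rough count of d_model-sized tensors kept per layer
) -> int:
    """Rough training-memory estimate in bytes (hypothetical helper, not the trainkit API)."""
    weights = n_params * bytes_per_value
    gradients = n_params * bytes_per_value
    optimizer = n_params * optimizer_states_per_param * bytes_per_value
    # Activations scale with batch size, sequence length, model width, and depth.
    activations = (
        batch_size * seq_len * d_model * n_layers
        * activations_per_layer * bytes_per_value
    )
    return weights + gradients + optimizer + activations


if __name__ == "__main__":
    total = estimate_training_bytes(
        n_params=1_000_000, batch_size=64, seq_len=96, d_model=512, n_layers=14
    )
    print(f"{total / 1024**3:.2f} GiB")
```

The activation term dominates at large batch sizes, which is why reducing the batch size or enabling gradient checkpointing are the usual levers when the estimate exceeds available VRAM.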

Quick Start

import torch
from trainkit import detect_memory, plan_training, create_cosine_schedule, estimate_model_memory

# Placeholder model and optimizer so the snippet is self-contained; substitute your own
model = torch.nn.Linear(512, 512)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Profile system memory
profile = detect_memory()
profile.print_info()

# Auto-plan training based on available VRAM
model_bytes = estimate_model_memory(model)
plan = plan_training(profile, model_bytes, batch_size=64, max_len=96, d_model=512, total_layers=14)
print(plan.reason)

# Create LR schedule with warmup
scheduler = create_cosine_schedule(optimizer, warmup_steps=4000, total_steps=100000)
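
For readers unfamiliar with the schedule shape, here is a minimal sketch of cosine decay with linear warmup, written as a pure-Python learning-rate multiplier. The function name `cosine_warmup_factor` is hypothetical, and the package's own `create_cosine_schedule` may differ in details such as a minimum learning rate or how the warmup starts.

```python
import math


def cosine_warmup_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier for the base learning rate at `step` (hypothetical helper)."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear warmup: 0.0 at step 0, approaching 1.0 at warmup_steps.
        return step / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to at most 1.0.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half cosine: 1.0 right after warmup, 0.0 at total_steps and beyond.
    return 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 2000, 4000, 52000, 100000):
        factor = cosine_warmup_factor(step, warmup_steps=4000, total_steps=100000)
        print(step, round(factor, 4))
```

In PyTorch, a function of this shape can be wrapped with `torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=...)`, which multiplies each parameter group's base learning rate by the returned factor.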

API

Function / Class                                  Description
create_cosine_schedule(optimizer, warmup, total)  Cosine annealing with linear warmup
detect_memory()                                   Returns MemoryProfile with RAM/VRAM info
plan_training(profile, ...)                       Returns TrainingPlan with recommended settings
estimate_model_memory(model)                      Estimate model parameter memory in bytes
estimate_training_vram(...)                       Estimate total VRAM needed for training
suggest_batch_size(...)                           Suggest optimal batch size for available VRAM
format_bytes(n)                                   Human-readable byte formatting
MemoryProfile                                     Dataclass with system memory info
TrainingPlan                                      Dataclass with recommended training config
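
As an illustration of how batch size and gradient accumulation can be traded off against a memory budget, the sketch below halves the per-step batch until it fits and doubles the accumulation steps to compensate. It is an assumption about one reasonable strategy, not a description of what `plan_training` or `suggest_batch_size` actually do; `plan_micro_batch` and all of its parameters are hypothetical names.

```python
def plan_micro_batch(
    target_batch: int,
    fixed_bytes: int,        # weights + gradients + optimizer state
    bytes_per_sample: int,   # activation memory for one sample
    available_bytes: int,
    safety: float = 0.9,     # ASSUMPTION: leave 10% headroom for fragmentation
) -> tuple[int, int]:
    """Return (micro_batch, accumulation_steps); hypothetical helper."""
    if target_batch < 1:
        raise ValueError("target_batch must be at least 1")
    budget = int(available_bytes * safety) - fixed_bytes
    micro, accum = target_batch, 1
    # Halve the per-step batch and double accumulation until activations fit.
    while micro * bytes_per_sample > budget:
        if micro == 1:
            raise ValueError("does not fit even at batch size 1")
        micro //= 2
        accum *= 2
    return micro, accum


if __name__ == "__main__":
    gib = 1024**3
    micro, accum = plan_micro_batch(
        target_batch=64,
        fixed_bytes=2 * gib,
        bytes_per_sample=100 * 1024**2,
        available_bytes=8 * gib,
    )
    print(f"micro-batch {micro}, accumulation steps {accum}")
```

When the target batch size is a power of two, the effective batch size (micro-batch times accumulation steps) stays equal to the target; for other values the integer halving makes it approximate.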

License

MIT

Download files

Source Distribution

trainkit_vp-1.1.0.tar.gz (5.5 kB)

Uploaded Source

Built Distribution

trainkit_vp-1.1.0-py3-none-any.whl (6.2 kB)

Uploaded Python 3

File details

Details for the file trainkit_vp-1.1.0.tar.gz.

File metadata

  • Download URL: trainkit_vp-1.1.0.tar.gz
  • Upload date:
  • Size: 5.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for trainkit_vp-1.1.0.tar.gz
Algorithm    Hash digest
SHA256       643b0b044ddfd68c6d434c5368dc9b357bb215a315e8a8608e931cdee8570e60
MD5          6dcee844f3c8c6894a5f206bb15ead4d
BLAKE2b-256  93d4bd6d9adca962836bce722c848923124396e9e88d789a5a718846bd414415

File details

Details for the file trainkit_vp-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: trainkit_vp-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for trainkit_vp-1.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       a8c185a4ad774bb25a6303528ddca70e27e8a3b510f3ac684519c9c816bc748c
MD5          bf59663642413415fa54c408380cfc62
BLAKE2b-256  2b92635196c538af3d592ad5e5d1f4f4ad128c5d27e0350478e5f33d10dfe675
