Typed PyTorch training utilities for checkpoints, scheduling, freezing, memory planning, and structured losses.
trainkit-vp
trainkit-vp collects typed training utilities for PyTorch projects: checkpointing, staged freezing, loss aggregation, scheduler setup, and memory planning.
PyPI package name: trainkit-vp
pip install trainkit-vp
Import name: trainkit
import trainkit
The package is intentionally generic. It is not tied to a single model family, and it is useful both in research code and in reusable application packages.
What is included
trainkit-vp currently provides:
- checkpoint save/load helpers
- staged freezing helpers for hierarchical models
- a structured multi-level sequence loss
- cosine warmup scheduling
- accelerator-aware memory estimation and batch-size planning
Installation
Requirements:
- Python >= 3.14
- PyTorch >= 2.0
Install from PyPI:
pip install trainkit-vp
Checkpoint helpers
Public functions: save_checkpoint and load_checkpoint.
These wrap torch.save and torch.load behind a typed, small surface that downstream packages can standardize around.
Typical usage:
from pathlib import Path

import torch
from trainkit import load_checkpoint, save_checkpoint

# Any module/optimizer pair works; a tiny model keeps the example self-contained.
model = torch.nn.Linear(16, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

payload = {
    "epoch": 3,
    "model_state": model.state_dict(),
    "optimizer_state": optimizer.state_dict(),
}

path = Path("artifacts/best.pt")
save_checkpoint(path, payload)
restored = load_checkpoint(path)
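Because load_checkpoint hands back the same payload dictionary, restoring is symmetric; the keys below are the ones defined in the example above.

# Rehydrate the module and optimizer from the round-tripped payload.
model.load_state_dict(restored["model_state"])
optimizer.load_state_dict(restored["optimizer_state"])
start_epoch = restored["epoch"] + 1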
Freezing helpers
Public functions: set_requires_grad and freeze_stages.
set_requires_grad is a simple utility for enabling or disabling training on a module.
freeze_stages is aimed at hierarchical setups: in the current code it knows how to freeze the universal, family, language, or joint stage of a model exposing a CELMoE-style level_modules layout. A short sketch follows the list below.
That makes it useful for:
- staged pretraining
- late unfreezing
- expert specialization phases
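A minimal sketch, assuming a (module, flag) signature for set_requires_grad and a (model, stage_name) call shape for freeze_stages:

import torch

from trainkit import set_requires_grad

# Freeze a submodule for a frozen phase, then re-enable training.
encoder = torch.nn.Linear(32, 32)
set_requires_grad(encoder, False)
assert all(not p.requires_grad for p in encoder.parameters())
set_requires_grad(encoder, True)

# For a CELMoE-style model exposing level_modules, a whole stage could be
# frozen by name (call shape assumed, model is hypothetical here):
# from trainkit import freeze_stages
# freeze_stages(model, "universal")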
Multi-level sequence loss
The main structured loss class is MultiLevelSequenceLoss.
It expects a model output dictionary containing logits, universal_logits, family_logits, and language_logits.
It returns a LossBreakdown dataclass with the fields total, final, universal, family, and language.
Example:
from chartoken import PAD
from trainkit import MultiLevelSequenceLoss

criterion = MultiLevelSequenceLoss(
    pad_id=PAD,
    final_weight=1.0,
    universal_weight=0.25,
    family_weight=0.35,
    language_weight=0.5,
)

# output is the model-output dict described above; target_ids holds the
# target token ids, with PAD positions presumably masked from the loss.
breakdown = criterion(output, target_ids)
print(breakdown.total)
print(breakdown.as_scalars())
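With the weights above, total is presumably the weighted combination 1.0 * final + 0.25 * universal + 0.35 * family + 0.5 * language, and as_scalars() appears to return the per-level values as plain floats for logging.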
Scheduler helper
create_cosine_schedule builds a linear-warmup cosine-decay schedule on top of torch.optim.lr_scheduler.LambdaLR.
Example:
from trainkit import create_cosine_schedule

# optimizer: any torch.optim.Optimizer; as with any LambdaLR schedule,
# call scheduler.step() once per optimizer step.
scheduler = create_cosine_schedule(
    optimizer,
    warmup_steps=500,
    total_steps=12000,
)
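The factor applied to the base learning rate presumably follows the conventional warmup-then-cosine curve. A standalone sketch of that curve, assuming decay all the way to zero with no learning-rate floor:

import math

def lr_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    # Linear ramp from 0 to 1 across the warmup window.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    # Cosine decay from 1 down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))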
Memory planning utilities
Public functions:
- format_bytes
- detect_memory
- estimate_model_memory
- estimate_training_vram
- suggest_batch_size
- plan_training
Main dataclasses: MemoryProfile and TrainingPlan.
These helpers are intended for early experiment planning rather than exact runtime accounting. They give fast estimates for:
- available system RAM
- available accelerator memory
- approximate model memory footprint
- approximate training VRAM requirements
- a suggested batch size
The implementation includes typed wrappers around CUDA, XPU, and MPS memory queries so static analyzers remain happy in strict mode.
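Those wrappers are internal, but the kind of query they cover looks roughly like the sketch below (an assumption-laden sketch, not the package's code; torch.mps.recommended_max_memory needs a recent PyTorch, and XPU follows the same pattern through torch.xpu):

import torch

def accelerator_free_bytes() -> int | None:
    # CUDA reports free/total bytes straight from the driver.
    if torch.cuda.is_available():
        free_bytes, _total_bytes = torch.cuda.mem_get_info()
        return free_bytes
    # MPS exposes a recommended working-set ceiling and current usage.
    if torch.backends.mps.is_available():
        return torch.mps.recommended_max_memory() - torch.mps.current_allocated_memory()
    return None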
Example workflow
from trainkit import detect_memory, plan_training

profile = detect_memory()

# model: any torch.nn.Module; the plan estimates footprint and VRAM for
# the given sequence length, batch size, and optimizer-state factor.
plan = plan_training(
    model,
    sequence_length=96,
    batch_size=64,
    optimizer_states_factor=2.0,
)

print(profile)
print(plan)
Scope boundaries
trainkit-vp does not define:
- your model architecture
- your dataset format
- your optimizer choice
- distributed training logic
Instead it provides small, reliable pieces that application packages can compose however they want.
Why publish it separately
Separating training utilities into their own package keeps:
- model packages smaller
- checkpoint format decisions reusable
- type contracts stable across projects
- experimentation code less duplicated
That matters in this repository because its libraries are published independently and should remain usable outside morphoformer.
File details
Details for the file trainkit_vp-2.3.5.tar.gz.
File metadata
- Download URL: trainkit_vp-2.3.5.tar.gz
- Upload date:
- Size: 11.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b0b1cf9d14e488c2de1b83de503d6657f7c92f6c180c1f7f2dadb790c97b96a |
| MD5 | ded61d5cd05b6546a60922ecd6ce6e4c |
| BLAKE2b-256 | 37044cca9763e0313e42672f3a30402c8a605a4b4a38d1af46ed8b931eba4318 |
File details
Details for the file trainkit_vp-2.3.5-py3-none-any.whl.
File metadata
- Download URL: trainkit_vp-2.3.5-py3-none-any.whl
- Upload date:
- Size: 10.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c6ba4a340d6da3833e9fe8a9251d53f556a2002dd6ed3f432d276bb3a2a34e04 |
| MD5 | 0faaf8e747d57ae762ab2affe0c58432 |
| BLAKE2b-256 | 639805a630981df6313ccc3379ff23d91e7eb033302bf435a0b05f2bad9327e8 |