# sarasa
A minimal LLM training framework built on pure PyTorch, designed for simplicity and extensibility.
## Installation
```sh
uv sync [--extra cpu|cu128|cu130] [--extra flash_attn]
```
or
```sh
uv add "sarasa[cpu|cu128|cu130]"
```
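For example, to sync the environment with CUDA 12.8 wheels plus the FlashAttention extra (one concrete choice among the bracketed options above):

```sh
uv sync --extra cu128 --extra flash_attn
```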
## Features
- Pure PyTorch implementation
- Flexible configuration system with command-line overrides
- Scales from a single GPU to multiple GPUs (simple DDP and FSDP for now)
- Selective activation checkpointing (SAC) for memory efficiency
- Async distributed checkpoint saving
- Checkpoint loading
## Usage
It's (almost) ready to use. First, set up a tokenizer, e.g.:
```sh
mkdir tokenizer
cd tokenizer
uvx hf download --local-dir . --include "tokenizer*" "meta-llama/Llama-3.1-8B"
```
Then, the following command starts training a GPT model on FineWeb-Edu on a single GPU or multiple GPUs:
```sh
uv run torchrun --nproc_per_node="gpu" main.py \
    --config-file configs/example.py \
    [--train.local-batch-size 8 ...]  # override config options as needed
```
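Here `--nproc_per_node="gpu"` asks torchrun to launch one process per visible GPU. As a concrete sketch, a two-GPU run with a couple of overrides might look like this (the `--train.local-batch-size` flag comes from the template above; `--train.global-batch-size` is an assumed name inferred from the config fields shown later):

```sh
uv run torchrun --nproc_per_node=2 main.py \
    --config-file configs/example.py \
    --train.local-batch-size 8 \
    --train.global-batch-size 64
```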
## Extending with Custom Components
Extending Sarasa is as simple as defining your own configuration dataclasses with `create` methods for custom models, optimizers, data loaders, etc.
Here's an example of using a custom optimizer:
```python
from dataclasses import dataclass

import torch

from sarasa import Config, Trainer


class CustomOptimizer(torch.optim.Optimizer):
    ...


# Configuration dataclasses with `create` methods, as described above.
@dataclass
class CustomOptim:
    lr: float = ...

    def create(self, model: torch.nn.Module) -> torch.optim.Optimizer:
        return CustomOptimizer(model.parameters(), lr=self.lr)


@dataclass
class CustomOptim2:
    lr: float = ...

    def create(self, model: torch.nn.Module) -> torch.optim.Optimizer:
        return CustomOptimizer(model.parameters(), lr=self.lr)


if __name__ == "__main__":
    # Let users pick one of the two optimizers from the command line.
    config = Config.from_cli(optim_type=CustomOptim | CustomOptim2)
    trainer = Trainer(config)
    trainer.train()
```
From the command line, you can specify which custom optimizer to use:
```sh
python script.py optim:custom_optim --optim.lr 0.001 ...
```
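The `custom_optim` selector looks like the snake_case form of the class name, so picking the second optimizer from the union above would presumably be (an assumption from the naming pattern, not documented behavior):

```sh
python script.py optim:custom_optim2 --optim.lr 0.0005
```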
## Config File Example
A config file is plain Python, so IDE autocompletion will help you fill in the options.
```python
from sarasa.config import Config, Data, LRScheduler, Model, Train

from custom_optim import CustomOptim

# only one Config instance should be defined in each config file
config = Config.create(
    model=Model(num_layers=12),
    train=Train(
        local_batch_size=16,
        global_batch_size=256,
        dtype="bfloat16",
    ),
    optim=CustomOptim(lr=0.001),
    lr_scheduler=LRScheduler(
        decay_type="linear",
        warmup_steps=1000,
        total_steps=100000,
    ),
    data=Data(tokenizer_path="./tokenizer"),
    seed=12,
)
```
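To train with this config, save it under `configs/` (e.g. a hypothetical `configs/my_config.py`) and point the training command from the Usage section at it:

```sh
uv run torchrun --nproc_per_node="gpu" main.py \
    --config-file configs/my_config.py
```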
## Acknowledgements
This project is heavily inspired by and borrows code from torchtitan.