A fast gradient checkpointing strategy for training with memory-efficient attention (e.g., FlashAttention).
Project description
FastCkpt: accelerate your LLM training in one line!
Fast gradient checkpointing (FastCkpt) is designed to accelerate training with memory-efficient attention such as FlashAttention and LightSeq. FastCkpt provides monkey patches for both rematerialization-aware checkpointing and FlashAttention, so you can apply both in a single line!
Paper: https://arxiv.org/pdf/2310.03294.pdf
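For readers new to gradient checkpointing, here is a minimal pure-Python sketch of the general idea: store only a few activations during the forward pass and recompute (rematerialize) the rest during the backward pass. It is an illustration under stated assumptions, not FastCkpt's implementation. The toy scalar network and the names `layer`, `backward_full`, and `backward_checkpointed` are hypothetical, and the sketch shows plain segment checkpointing rather than the rematerialization-aware strategy described in the paper.

```python
import math


def layer(w, x):
    """Toy layer: y = tanh(w * x)."""
    return math.tanh(w * x)


def layer_grad(w, x):
    """Return (dy/dx, dy/dw) for y = tanh(w * x)."""
    s = 1.0 - math.tanh(w * x) ** 2
    return s * w, s * x


def backward_full(weights, x0):
    """Baseline: keep the input of every layer, then backpropagate."""
    inputs, x = [], x0
    for w in weights:
        inputs.append(x)
        x = layer(w, x)
    grad, w_grads = 1.0, [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        dx, dw = layer_grad(weights[i], inputs[i])
        w_grads[i] = grad * dw
        grad *= dx
    return w_grads, len(inputs)  # gradients, number of stored activations


def backward_checkpointed(weights, x0, segment):
    """Keep only the input of each segment; recompute the rest on demand."""
    n = len(weights)
    ckpts, x = {}, x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            ckpts[i] = x  # checkpoint at the segment boundary
        x = layer(w, x)
    grad, w_grads = 1.0, [0.0] * n
    peak = len(ckpts)
    for start in reversed(range(0, n, segment)):
        end = min(start + segment, n)
        # Rematerialize the layer inputs inside this segment only.
        inputs = [ckpts[start]]
        for i in range(start, end - 1):
            inputs.append(layer(weights[i], inputs[-1]))
        # The segment's first input is the checkpoint itself, so count it once.
        peak = max(peak, len(ckpts) + len(inputs) - 1)
        for i in reversed(range(start, end)):
            dx, dw = layer_grad(weights[i], inputs[i - start])
            w_grads[i] = grad * dw
            grad *= dx
    return w_grads, peak


weights = [0.5, -1.2, 0.8, 1.1, -0.7, 0.9, 1.3, -0.4]
full_grads, full_stored = backward_full(weights, 0.3)
ckpt_grads, ckpt_stored = backward_checkpointed(weights, 0.3, segment=4)
```

Both functions return the same gradients; the checkpointed version holds fewer activations at its peak and pays for that with extra forward computation in the backward pass.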
News
- [2023/10] FastCkpt now supports LlamaModel in Hugging Face!
Install
pip install fastckpt
Usage
FastCkpt now supports the Hugging Face (HF) training pipeline.
Use FastCkpt and FlashAttention
To use fastckpt with flash_attn, import and run replace_hf_ckpt_with_fast_ckpt before importing transformers:
# add monkey patch for fastckpt
from fastckpt.llama_flash_attn_ckpt_monkey_patch import replace_hf_ckpt_with_fast_ckpt
replace_hf_ckpt_with_fast_ckpt()
# import transformers and other packages
import transformers
...
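One general reason the order of a monkey patch and later imports can matter in Python: a module that runs `from lib import name` binds whatever `name` refers to at that moment, so a patch applied afterwards is not seen through that binding. The sketch below demonstrates this with stand-in modules only. `toy_lib`, `toy_consumer`, and both `checkpoint` functions are hypothetical and are not part of fastckpt or transformers; whether fastckpt relies on exactly this mechanism is an assumption.

```python
import sys
import types


def slow_checkpoint(x):
    return "slow:" + x


def fast_checkpoint(x):
    return "fast:" + x


# Hypothetical stand-in for a library module that exposes `checkpoint`.
lib = types.ModuleType("toy_lib")
lib.checkpoint = slow_checkpoint
sys.modules["toy_lib"] = lib


def load_consumer():
    """Simulate a downstream module that runs `from toy_lib import checkpoint`."""
    consumer = types.ModuleType("toy_consumer")
    exec(
        "from toy_lib import checkpoint\n"
        "def run(x):\n"
        "    return checkpoint(x)\n",
        consumer.__dict__,
    )
    return consumer


# Case 1: the consumer is imported first and the patch is applied afterwards.
imported_before_patch = load_consumer()
lib.checkpoint = fast_checkpoint  # the monkey patch
result_patch_too_late = imported_before_patch.run("a")

# Case 2: the consumer is imported after the patch is already in place.
imported_after_patch = load_consumer()
result_patch_first = imported_after_patch.run("a")
```

In case 1 the consumer keeps calling the original function, while in case 2 it picks up the replacement. Applying the patch before importing the library, as the usage above instructs, avoids the first situation.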
Use FlashAttention only
To replace only LlamaAttention with flash_attn without changing the checkpointing strategy, import and run replace_llama_attn_with_flash_attn:
# add monkey patch for flash_attn only
from fastckpt.llama_flash_attn_monkey_patch import replace_llama_attn_with_flash_attn
replace_llama_attn_with_flash_attn()
# import transformers and other packages
import transformers
...
File details
Details for the file fastckpt-0.0.4.tar.gz.
File metadata
- Download URL: fastckpt-0.0.4.tar.gz
- Upload date:
- Size: 14.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 929ab4790bdd4f0969a4ab32f839218db9bc08ad02e341cf36eb1fbe4c56a118 |
| MD5 | 5155645c58f7da21fac12df6cd200a6d |
| BLAKE2b-256 | b690d6395f686879e90245060d2cd5d56ea0af66eda26c1f75be7c5ade604165 |
File details
Details for the file fastckpt-0.0.4-py3-none-any.whl.
File metadata
- Download URL: fastckpt-0.0.4-py3-none-any.whl
- Upload date:
- Size: 13.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d06ed4b6d967c99ee147fa197032bb0de4ec58e3daf46557ffdda41a1a6d28e7 |
| MD5 | bd8b0e70ad4ddb94e946e0b0758bac9a |
| BLAKE2b-256 | b846c77993135699251e407d5f557059d586ee9117e9ac9ae2b3cc16dd1f2edf |