TILEARN for LLM
Tilearn.llm Usage Guide
1. CUDA Kernel (using LLaMA as an example)
Supported GPUs: Ampere, Ada, or Hopper architectures (e.g., A100, A800, H100, H800)
Dependencies: PyTorch >= 2.0.0
The current version is fully compatible with the Hugging Face interface; no additional changes are required.
On 16 A800 GPUs with seq=1024, LLaMA1/LLaMA2 training is roughly 20% faster than DeepSpeed ZeRO-2.
To enable the CUDA kernels, modify your code as follows:
### TILEARN.LLM
from tilearn.llm.transformers import LlamaForCausalLM
### The model interface is identical to standard Hugging Face
model = LlamaForCausalLM.from_pretrained(...)
Alternatively, use the AutoModelForCausalLM interface:
### TILEARN.LLM
from tilearn.llm.transformers import AutoModelForCausalLM
### The model interface is identical to standard Hugging Face
model = AutoModelForCausalLM.from_pretrained(...)
Special notes:
1. Because Baichuan1 13B and Baichuan2 13B conflict with each other, tilearn.llm.transformers.AutoModelForCausalLM defaults to Baichuan1 13B. To use Baichuan2 13B instead, set the following environment variable in your training launch script:
### TILEARN_LLM_BAICHUAN_13B=2 selects the Baichuan2 model
export TILEARN_LLM_BAICHUAN_13B=2
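As a hedged illustration (not tilearn internals), the selection behavior documented above can be sketched like this; `baichuan_13b_version` is a hypothetical helper, not part of tilearn.llm:

```python
import os

# Hypothetical helper mirroring the documented behavior:
# Baichuan1 13B by default, Baichuan2 13B when TILEARN_LLM_BAICHUAN_13B=2.
def baichuan_13b_version() -> int:
    return 2 if os.environ.get("TILEARN_LLM_BAICHUAN_13B") == "2" else 1

os.environ.pop("TILEARN_LLM_BAICHUAN_13B", None)
assert baichuan_13b_version() == 1   # default: Baichuan1 13B

os.environ["TILEARN_LLM_BAICHUAN_13B"] = "2"
assert baichuan_13b_version() == 2   # Baichuan2 13B enabled
```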
2. Models currently supported for acceleration:
# llama
from tilearn.llm.transformers.models.llama.modeling_llama import LlamaForCausalLM
# bloom
from tilearn.llm.transformers.models.bloom.modeling_bloom import BloomForCausalLM
# baichuan1
from tilearn.llm.transformers.models.baichuan.baichuan1_13B.modeling_baichuan import BaichuanForCausalLM
from tilearn.llm.transformers.models.baichuan.baichuan1_7B.modeling_baichuan import BaiChuanForCausalLM
# baichuan2
# TILEARN.LLM is used by default; no configuration is required
# To use xformers alone, install xformers and set the environment variable TIACC_TRAINING_CUDA_KERNEL=2
from tilearn.llm.transformers.models.baichuan.baichuan2_7B.modeling_baichuan import BaichuanForCausalLM
from tilearn.llm.transformers.models.baichuan.baichuan2_13B.modeling_baichuan import BaichuanForCausalLM
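For reference, the import paths above can be collected into a plain dictionary (a hypothetical registry, not a tilearn API); note that the two Baichuan1 classes differ in casing (`BaichuanForCausalLM` vs. `BaiChuanForCausalLM`):

```python
# Hypothetical registry (not a tilearn.llm API): module paths of the
# accelerated classes listed above, kept as strings so nothing is imported.
SUPPORTED_MODELS = {
    "llama": "tilearn.llm.transformers.models.llama.modeling_llama.LlamaForCausalLM",
    "bloom": "tilearn.llm.transformers.models.bloom.modeling_bloom.BloomForCausalLM",
    "baichuan1-13b": "tilearn.llm.transformers.models.baichuan.baichuan1_13B.modeling_baichuan.BaichuanForCausalLM",
    "baichuan1-7b": "tilearn.llm.transformers.models.baichuan.baichuan1_7B.modeling_baichuan.BaiChuanForCausalLM",
    "baichuan2-7b": "tilearn.llm.transformers.models.baichuan.baichuan2_7B.modeling_baichuan.BaichuanForCausalLM",
    "baichuan2-13b": "tilearn.llm.transformers.models.baichuan.baichuan2_13B.modeling_baichuan.BaichuanForCausalLM",
}

print(sorted(SUPPORTED_MODELS))
```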
2. Static Zero
Applicable scenario: switching between DeepSpeed optimization states such as ZeRO-1, ZeRO-2, ZeRO-3, offload, and int8.
Modify your launch script as follows:
### TILEARN STATIC ZERO
### Open: TIACC_TRAINING_STATIC_ZERO='O2'
### Supported values: 'O2' / 'O2.5' / 'O3' / 'O3.5' / 'O3_Q8' (in development)
### Close: TIACC_TRAINING_STATIC_ZERO='None'
export TIACC_TRAINING_STATIC_ZERO='None' # set to 'O2', etc., to enable
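A hedged sketch of how a launcher might read and validate this variable; `static_zero_level` and `VALID_LEVELS` are hypothetical names, not tilearn.llm code:

```python
import os

# The optimization levels listed above; 'None' means Static Zero is disabled.
VALID_LEVELS = {"None", "O2", "O2.5", "O3", "O3.5", "O3_Q8"}

def static_zero_level() -> str:
    # Hypothetical helper: read the env var, defaulting to disabled.
    level = os.environ.get("TIACC_TRAINING_STATIC_ZERO", "None")
    if level not in VALID_LEVELS:
        raise ValueError(f"unknown Static Zero level: {level!r}")
    return level

os.environ["TIACC_TRAINING_STATIC_ZERO"] = "O2"
print(static_zero_level())  # O2
```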
Modify your code as follows:
from transformers import HfArgumentParser
from tilearn.llm.transformers import TrainingArguments
### The interface is identical to standard Hugging Face
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
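The HfArgumentParser pattern above turns each dataclass field into a CLI flag, which is why tilearn's TrainingArguments can be swapped in unchanged. A stdlib-only sketch of that idea (the dataclasses and `parse_dataclasses` helper here are stand-ins for illustration, not tilearn or transformers code):

```python
import argparse
from dataclasses import dataclass, fields

# Stand-in argument groups; real code would use transformers/tilearn classes.
@dataclass
class ModelArguments:
    model_name_or_path: str = "llama"

@dataclass
class TrainingArguments:
    learning_rate: float = 5e-5

def parse_dataclasses(classes, argv):
    # Expose every dataclass field as a --flag, then rebuild the instances.
    parser = argparse.ArgumentParser()
    for cls in classes:
        for f in fields(cls):
            parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    ns = parser.parse_args(argv)
    return tuple(
        cls(**{f.name: getattr(ns, f.name) for f in fields(cls)})
        for cls in classes
    )

model_args, train_args = parse_dataclasses(
    [ModelArguments, TrainingArguments], ["--learning_rate", "1e-4"]
)
print(model_args.model_name_or_path, train_args.learning_rate)  # llama 0.0001
```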