gpt-blazing
This project draws inspiration from gpt-fast and applies the same performance-optimization strategy to more models. Unlike gpt-fast, this project aims to be a reusable “framework” or “library”.
Installation

```shell
pip install --pre torch==2.2.0.dev20231207 --index-url https://download.pytorch.org/whl/nightly/cu118
pip install --pre gpt-blazing
```
Usage
Download a gpt-blazing converted model.
| Original model | gpt-blazing converted model |
|---|---|
| 🤗 baichuan-inc/Baichuan2-13B-Chat | 🤗 gpt-blazing/baichuan2-13b-chat |
| more to be supported | ... |
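One way to fetch a converted model is the standard Hugging Face Hub Git workflow (an illustration, not a command shipped by this project; the repository name comes from the table above):

```shell
# Illustrative only: clone the converted weights from the Hugging Face Hub.
# Requires git and git-lfs to be installed.
git lfs install
git clone https://huggingface.co/gpt-blazing/baichuan2-13b-chat
```

Pass the resulting local directory as `model_folder` in the demo below.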
Run the following demo:

```python
from datetime import datetime

from gpt_blazing.engine import Engine
from gpt_blazing.model.interface import Role
from gpt_blazing.model.baichuan2.inference import (
    Baichuan2ModelInferenceConfig,
    Baichuan2ModelInference,
)

init_dt_begin = datetime.now()
engine = Engine(
    Baichuan2ModelInference(
        Baichuan2ModelInferenceConfig(
            model_folder='the path of the model folder you just downloaded.',
            device='cuda:0',
        )
    )
)
init_dt_end = datetime.now()
print('init:', (init_dt_end - init_dt_begin).total_seconds())

generate_dt_begin = datetime.now()
# Prompt (Chinese): "Write me an essay on an A-share (Chinese stock market) theme, about 800 characters."
response = engine.generate([(Role.USER, "帮我写一篇与A股主题相关的作文,800字左右")])
generate_dt_end = datetime.now()
generate_total_seconds = (generate_dt_end - generate_dt_begin).total_seconds()
print('generate:', generate_total_seconds, response.num_tokens / generate_total_seconds)
print(response.content)
```
Performance

GPU: NVIDIA GeForce RTX 3090

| Model | Technique | Tokens/Second |
|---|---|---|
| Baichuan2 13b | INT8 (this project) | 50.1 |
| Baichuan2 13b | INT8 (huggingface) | 7.9 |
| Llama2 13b | INT8 (gpt-fast) | 55.5 |
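The tokens/second figures follow the same arithmetic as the demo's timing code: tokens generated divided by wall-clock generation time. A minimal sketch with made-up numbers (not a real benchmark run):

```python
from datetime import datetime, timedelta

def throughput(num_tokens: int, begin: datetime, end: datetime) -> float:
    """Tokens generated per wall-clock second, as measured in the demo."""
    return num_tokens / (end - begin).total_seconds()

# Illustrative numbers only: 1002 tokens in 20 seconds gives 50.1 tokens/s,
# the same order as the INT8 Baichuan2 13b row above.
begin = datetime(2023, 12, 7, 12, 0, 0)
end = begin + timedelta(seconds=20)
print(round(throughput(1002, begin, end), 1))  # 50.1
```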