gpt-blazing

This project draws inspiration from gpt-fast and applies the same performance-optimization strategy to more models. Unlike gpt-fast, this project aims to be a reusable “framework” or “library”.

Installation

pip install --pre torch==2.2.0.dev20231207 --index-url https://download.pytorch.org/whl/nightly/cu118
pip install --pre gpt-blazing

Usage

Download a gpt-blazing converted model.

Original model                       gpt-blazing converted model
🤗 baichuan-inc/Baichuan2-13B-Chat   🤗 gpt-blazing/baichuan2-13b-chat
more to be supported ...

Run the following demo.

from datetime import datetime

from gpt_blazing.engine import Engine
from gpt_blazing.model.interface import Role
from gpt_blazing.model.baichuan2.inference import (
    Baichuan2ModelInferenceConfig,
    Baichuan2ModelInference,
)


init_dt_begin = datetime.now()
engine = Engine(
    Baichuan2ModelInference(
        Baichuan2ModelInferenceConfig(
            model_folder='the path to the model folder you just downloaded',
            device='cuda:0',
        )
    )
)
init_dt_end = datetime.now()
print('init:', (init_dt_end - init_dt_begin).total_seconds())

generate_dt_begin = datetime.now()
# The prompt asks (in Chinese) for an essay of about 800 characters on China's A-share stock market.
response = engine.generate([(Role.USER, "帮我写一篇与A股主题相关的作文,800字左右")])
generate_dt_end = datetime.now()
generate_total_seconds = (generate_dt_end - generate_dt_begin).total_seconds()
print('generate:', generate_total_seconds, response.num_tokens / generate_total_seconds)

print(response.content)
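The begin/end timing pattern in the demo can be factored into a small helper. This is a sketch using only the standard library; `timed` is a hypothetical convenience function, not part of the gpt-blazing API:

```python
from datetime import datetime
from typing import Callable, Tuple, TypeVar

T = TypeVar('T')


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed seconds)."""
    begin = datetime.now()
    result = fn()
    end = datetime.now()
    return result, (end - begin).total_seconds()


# The demo's generate call would then read:
#   response, seconds = timed(lambda: engine.generate([(Role.USER, prompt)]))
#   print('generate:', seconds, response.num_tokens / seconds)
```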

Performance

GPU: 3090

Model           Technique             Tokens/second
Baichuan2 13b   INT8 (this project)   50.1
Baichuan2 13b   INT8 (huggingface)    7.9
Llama2 13b      INT8 (gpt-fast)       55.5
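The figures in the table above imply roughly a 6x throughput gain over the stock huggingface path for the same INT8 model. A quick sanity check of that arithmetic:

```python
# Throughput numbers taken from the table above (tokens/second on a 3090).
blazing_int8 = 50.1
huggingface_int8 = 7.9

speedup = blazing_int8 / huggingface_int8
print(f"gpt-blazing is ~{speedup:.1f}x faster than the huggingface baseline")
# → gpt-blazing is ~6.3x faster than the huggingface baseline
```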


