# text-generation-api

📢 `text-generation-api` is a simple yet comprehensive REST API server for text generation with huggingface models.

Couldn't be easier to use 🔥 Comes with batteries included 🔋
```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="gpt2",
    prompt="Here is a list of things I like to do:"
)
```
## Features 🏆
- Serve every 🤗 huggingface model 🔥
- Batteries included 🔋
- Built-in stop-text and stop-token support
- Nice one-line serving and generation 😎
## Installation ⚙️

(Optional) Create a virtualenv. You can use conda or whatever you like:

```shell
virtualenv --python=python3.10 text-generation-api-env
. ./text-generation-api-env/bin/activate
```

Install pytorch (or tensorflow). Again, you can use whatever package manager you like:

```shell
pip install "torch>=2.0.0"
```

Install text-generation-api:

```shell
pip install text-generation-api
```
## Run the server 🌐

Create yaml config files for the models you want to serve.

For example, to serve GPT2 the config file should look like:

```yaml
model:
  name: gpt2
  class: GPT2Model
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
```
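Conceptually, the `class` entries name classes from the `transformers` package, and `name` is the checkpoint to load. A minimal sketch of how such a config *could* be resolved (this illustrates the idea, not necessarily text-generation-api's internals):

```python
# Hedged sketch: resolve the model/tokenizer classes named in the config
# by looking them up on the transformers module.
import transformers

config = {
    "model": {"name": "gpt2", "class": "GPT2Model"},
    "tokenizer": {"name": "gpt2", "class": "GPT2Tokenizer"},
}

# Look the classes up by their string names
model_cls = getattr(transformers, config["model"]["class"])
tokenizer_cls = getattr(transformers, config["tokenizer"]["class"])

# A server could then instantiate them with, e.g.:
#   model_cls.from_pretrained(config["model"]["name"], **load_kwargs)
```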
To specify load arguments for the model or the tokenizer, use the `load` key:

```yaml
model:
  name: my-model
  class: GPT2Model
  load:
    device_map: auto
    trust_remote_code: True
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
```
You can specify which device to use with `device`, and which backend to use with `backend: pytorch` or `backend: tensorflow`.
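For instance, a config using both keys might look like the following (note: the exact placement of `device` and `backend` here is an assumption; check the example configs shipped with the project):

```yaml
backend: pytorch
device: cuda
model:
  name: gpt2
  class: GPT2Model
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
```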
To run the inference server:

```shell
text-generation-api ./path/to/config1.yaml ./path/to/config2.yaml
```

For example:

```shell
text-generation-api ./example/gpt2.yaml ./example/opt-125.yaml
```
## Run the client ✨

To run inference on the remote server using the client, simply:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>"
)
```
You can pass `generate` or `tokenize` arguments with:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    )
)
```
Use the argument `stop` to control the stop behaviour:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="gpt2",
    prompt="Human: How are you?\nAI:",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    ),
    stop=dict(
        words=["\n"]
    )
)
```
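Conceptually, stop words cut the generated text at the first occurrence of any of the given strings. A minimal sketch of that idea (purely illustrative, not the library's actual implementation):

```python
def apply_stop_words(text: str, stop_words: list[str]) -> str:
    """Truncate generated text at the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# With stop=dict(words=["\n"]), generation output would be cut at the first
# newline, so the model stops after the AI's single reply line:
print(apply_stop_words(" I'm fine, thanks!\nHuman: Great!", ["\n"]))  # → " I'm fine, thanks!"
```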
## Contributions and license 🪪

The code is released as Free Software under the GNU/GPLv3 license. Copying, adapting, and republishing it is not only allowed but also encouraged.

For any further questions feel free to reach me at federico.galatolo@unipi.it or on Telegram @galatolo.