text-generation-api 📢

text-generation-api is a simple yet comprehensive REST API server for text generation with 🤗 huggingface models.

Couldn't be easier to use 🔥 Comes with batteries included 🔋
```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="gpt2",
    prompt="Here is a list of things I like to do:"
)
```
Features 🏆
- Serve every 🤗 huggingface model 🔥
- Batteries included 🔋
- Built-in stop text or stop token support
- Nice one-line serving and generation 😎
Installation ⚙️
(optional) Create a virtualenv. You can use conda or whatever you like:

```shell
virtualenv --python=python3.10 text-generation-api-env
. ./text-generation-api-env/bin/activate
```
Install pytorch (or tensorflow). Again, you can use whatever package manager you like:

```shell
pip install "torch>=2.0.0"
```
Install text-generation-api:

```shell
pip install text-generation-api
```
Run the server 🌐
Create yaml config files for the models you want to serve.

For example, to serve GPT2 the config file should look like:

```yaml
model:
  name: gpt2
  class: GPT2Model
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
```
To specify load arguments for the model or the tokenizer, use the load key:

```yaml
model:
  name: my-model
  class: GPT2Model
  load:
    device_map: auto
    trust_remote_code: True
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
```
You can specify which device to use with the device key, and which backend to use with backend: pytorch or backend: tensorflow.
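For instance, a config pinning a model to a GPU with the pytorch backend might look like this (a sketch: the exact placement of the device and backend keys is an assumption based on the description above):

```yaml
model:
  name: gpt2
  class: GPT2Model
tokenizer:
  name: gpt2
  class: GPT2Tokenizer
device: cuda:0
backend: pytorch
```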
To run the inference server:

```shell
text-generation-api ./path/to/config1.yaml ./path/to/config2.yaml
```

For example:

```shell
text-generation-api ./example/gpt2.yaml ./example/opt-125.yaml
```
Run the client ✨
To run inference on the remote server using the client, simply:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>"
)
```
You can pass generate or tokenize arguments with:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    )
)
```
Use the argument stop to control the stop behaviour:

```python
from text_generation_api import Endpoint

tga = Endpoint("http://<host>:<port>")
result = tga.generate(
    model="gpt2",
    prompt="Human: How are you?\nAI:",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    ),
    stop=dict(
        words=["\n"]
    )
)
```
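Conceptually, stopping on a word amounts to truncating the generated text at the first occurrence of any stop word. A minimal sketch of that behaviour (the helper below is illustrative only, not part of the library's API):

```python
def truncate_at_stop_words(text: str, stop_words: list[str]) -> str:
    """Cut the text at the earliest occurrence of any stop word."""
    cut = len(text)
    for word in stop_words:
        idx = text.find(word)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# With stop words ["\n"], generation ends at the first newline:
print(truncate_at_stop_words("I'm fine, thanks!\nHuman: Great.", ["\n"]))
# → I'm fine, thanks!
```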
Contributions and license 🪪
The code is released as Free Software under the GNU/GPLv3 license. Copying, adapting and republishing it is not only allowed but also encouraged.
For any further questions, feel free to reach me at federico.galatolo@unipi.it or on Telegram @galatolo
File details

Details for the file text-generation-api-0.0.1.tar.gz.

File metadata

- Download URL: text-generation-api-0.0.1.tar.gz
- Upload date:
- Size: 20.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Algorithm | Hash digest
---|---
SHA256 | 6068cf5da9f8ca3840934267d184b69a4daa08c5a456d30e40b37288d89b8005
MD5 | e7e92772c51d7fa743d467e379212c83
BLAKE2b-256 | c6271d4e40d131b25b3f84d277432e208b3b68cee2c4c37f325c1f0f8fa13172
File details

Details for the file text_generation_api-0.0.1-py3-none-any.whl.

File metadata

- Download URL: text_generation_api-0.0.1-py3-none-any.whl
- Upload date:
- Size: 21.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Algorithm | Hash digest
---|---
SHA256 | 24257aeccb1428605742107444b44573057c441e381f4c5c1b4e0eeda535b74f
MD5 | af24e6f2c3d400bba3a5bbee3295c727
BLAKE2b-256 | 22a9a35d21523c597217be7dd691337cdb84b607537b0f4803788e9ca0c3aeac