
text-generation-api

📢 text-generation-api is a simple yet comprehensive REST API server for text generation with Hugging Face models

Couldn't be easier to use 🔥

Comes with batteries included 🔋

from text_generation_api import Endpoint
tga = Endpoint("http://<host>:<port>")

result = tga.generate(
    model="gpt2",
    prompt="Here is a list of things I like to do:"
)

Features 🏆

  • Serve any 🤗 Hugging Face model 🔥
  • Batteries included 🔋
  • Built-in stop-text and stop-token support
  • One-line serving and generation 😎

Installation ⚙️

(optional) Create a virtual environment. You can use virtualenv, conda, or whatever you like

virtualenv --python=python3.10 text-generation-api-env
. ./text-generation-api-env/bin/activate

Install PyTorch (or TensorFlow). Again, you can use whatever package manager you like

pip install "torch>=2.0.0"

Install text-generation-api

pip install text-generation-api

Run the server 🌐

Create a YAML config file for each model you want to serve

For example, to serve GPT-2 the config file should look like:

model:
  name: gpt2
  class: GPT2Model

tokenizer:
  name: gpt2
  class: GPT2Tokenizer

To specify load arguments for the model or the tokenizer, use the load key:

model:
  name: my-model
  class: GPT2Model
  load:
    device_map: auto
    trust_remote_code: True

tokenizer:
  name: gpt2
  class: GPT2Tokenizer

You can specify which device to use with the device key and which backend to use with backend: pytorch or backend: tensorflow
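Putting the pieces together, a config that selects a device and the PyTorch backend might look like this. Note that the exact placement of the device and backend keys is an assumption based on the description above; the cuda value is just an example:

```yaml
model:
  name: gpt2
  class: GPT2Model
  load:
    device_map: auto

tokenizer:
  name: gpt2
  class: GPT2Tokenizer

device: cuda
backend: pytorch
```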

To run the inference server, run:

text-generation-api ./path/to/config1.yaml ./path/to/config2.yaml

For example:

text-generation-api ./example/gpt2.yaml  ./example/opt-125.yaml

Run the client ✨

To run inference on the remote server using the client:

from text_generation_api import Endpoint
tga = Endpoint("http://<host>:<port>")

result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>"
)

You can pass generate or tokenize arguments with:

from text_generation_api import Endpoint
tga = Endpoint("http://<host>:<port>")

result = tga.generate(
    model="<MODEL-NAME>",
    prompt="<PROMPT>",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    )
)
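Since both generate and tokenize arguments are forwarded, a request can tweak tokenization as well. The exact tokenize keys the server accepts are an assumption here; common Hugging Face tokenizer kwargs are shown:

```python
# from text_generation_api import Endpoint  # client class from this package

# Build the request kwargs once; the tokenize keys (truncation, max_length)
# are assumed to be forwarded to the Hugging Face tokenizer, and the
# generate keys to model.generate().
request = dict(
    model="gpt2",
    prompt="Once upon a time",
    generate=dict(max_new_tokens=50, do_sample=True),
    tokenize=dict(truncation=True, max_length=512),
)

# tga = Endpoint("http://<host>:<port>")
# result = tga.generate(**request)
```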

Use the stop argument to control the stopping behaviour:

from text_generation_api import Endpoint
tga = Endpoint("http://<host>:<port>")

result = tga.generate(
    model="gpt2",
    prompt="Human: How are you?\nAI:",
    generate=dict(
        temperature=0.2,
        top_p=0.2
    ),
    stop=dict(
        words=["\n"]
    )
)
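The chat pattern above can be wrapped in a small helper. chat_kwargs below is a hypothetical convenience function (not part of the library) that builds the keyword arguments for Endpoint.generate for a single chat turn:

```python
def chat_kwargs(user_message, temperature=0.2, top_p=0.2):
    """Build Endpoint.generate() kwargs for one chat-style turn.

    Stopping at the first newline makes the model answer only as the
    AI speaker. Hypothetical helper, not part of text-generation-api.
    """
    return dict(
        model="gpt2",
        prompt=f"Human: {user_message}\nAI:",
        generate=dict(temperature=temperature, top_p=top_p),
        stop=dict(words=["\n"]),
    )

# tga = Endpoint("http://<host>:<port>")
# result = tga.generate(**chat_kwargs("How are you?"))
```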

Contributions and license 🪪

The code is released as Free Software under the GNU/GPLv3 license. Copying, adapting and republishing it is not only allowed but also encouraged.

For any further questions, feel free to reach me at federico.galatolo@unipi.it or on Telegram @galatolo
