
OpenLLM: REST/gRPC API server for running any open Large-Language Model - StableLM, Llama, Alpaca, Dolly, Flan-T5, Custom


OpenLLM


REST/gRPC API server for running any Open Large-Language Model - StableLM, Llama, Alpaca, Dolly, Flan-T5, and more
Powered by BentoML 🍱 + HuggingFace 🤗

To get started, simply install OpenLLM with pip:

pip install openllm

To start an LLM server, use openllm start, which launches any supported LLM with a single command. For example, to start a dolly-v2 server:


openllm start dolly-v2

# Starting LLM Server for 'dolly_v2'
#
# 2023-05-27T04:55:36-0700 [INFO] [cli] Environ for worker 0: set CPU thread count to 10
# 2023-05-27T04:55:36-0700 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service.py:svc" can be accessed at http://localhost:3000/metrics.
# 2023-05-27T04:55:36-0700 [INFO] [cli] Starting production HTTP BentoServer from "_service.py:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
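The startup log above points at a Prometheus metrics endpoint. As a rough sketch, the metrics can be read from Python with only the standard library. The helper names parse_metrics and fetch_metrics are hypothetical (they are not part of OpenLLM), and the parser assumes the standard Prometheus text exposition format; which metric names the server actually reports is not covered here.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text-format samples into a {series: value} dict.

    Hypothetical helper, not part of OpenLLM. Handles the common
    'name{labels} value [timestamp]' and 'name value [timestamp]' shapes.
    """
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so cut after the closing brace.
            cut = line.rindex("}") + 1
            series, rest = line[:cut], line[cut:].split()
        else:
            series, *rest = line.split()
        if rest:
            # The first field after the series is the value; a timestamp may follow.
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(url="http://localhost:3000/metrics", timeout=5.0):
    """Fetch and parse the metrics page; the default URL is the one in the log above."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Usage, with a server already running via `openllm start dolly-v2`:
#   for series, value in sorted(fetch_metrics().items()):
#       print(series, value)
```

Keeping the parser separate from the network call makes it easy to check against saved output without a running server.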

To see a list of supported LLMs, run openllm start --help.

In a different terminal window, open an IPython session and create a client to start interacting with the model:

>>> import openllm
>>> client = openllm.client.HTTPClient('http://localhost:3000')
>>> client.query('Explain to me the difference between "further" and "farther"')
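For more than one prompt, a small wrapper around client.query keeps a single failed request from aborting the whole batch. This is only a sketch: ask_all is a hypothetical helper, not part of OpenLLM, and it treats whatever client.query returns as an opaque value, since the return type is not described here.

```python
def ask_all(client, prompts):
    """Send each prompt through client.query and collect the outcomes.

    Hypothetical helper. `client` is any object with a query(prompt) method,
    such as the HTTPClient created above. Returns a list of
    (prompt, response, error) tuples, where exactly one of response and
    error is None.
    """
    results = []
    for prompt in prompts:
        try:
            results.append((prompt, client.query(prompt), None))
        except Exception as exc:  # keep going; record what went wrong
            results.append((prompt, None, f"{type(exc).__name__}: {exc}"))
    return results


# Usage, assuming the server from `openllm start dolly-v2` is running:
#   import openllm
#   client = openllm.client.HTTPClient('http://localhost:3000')
#   for prompt, response, error in ask_all(client, ['What is a Bento?']):
#       print(prompt, '->', response if error is None else error)
```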

To package the LLM into a Bento, simply use openllm build:

openllm build dolly-v2

NOTE: To build OpenLLM from git source, set OPENLLM_DEV_BUILD=True to include the generated wheels in the bundle.

To fine-tune your own LLM, use LLM.tuning():

>>> import openllm
>>> flan_t5 = openllm.LLM.from_pretrained("flan-t5")
>>> def fine_tuning():
...     fined_tune = flan_t5.tuning(method=openllm.tune.LORA | openllm.tune.P_TUNING, dataset='wikitext-2', ...)
...     fined_tune.save_pretrained('./fine-tuned-flan-t5', version='wikitext')
...     return fined_tune.path  # get the path of the pretrained
>>> finetune_path = fine_tuning()
>>> fined_tune_flan_t5 = openllm.LLM.from_pretrained('flan-t5', pretrained=finetune_path)
>>> fined_tune_flan_t5.generate('Explain to me the difference between "further" and "farther"')

📚 Features

🚂 SOTA LLMs: One-click stop-and-go support for state-of-the-art LLMs, including StableLM, Llama, Alpaca, Dolly, Flan-T5, ChatGLM, Falcon, and more.

📦 Fine-tuning your own LLM: Easily fine-tune any LLM with LLM.tuning().

🔥 BentoML 🤝 HuggingFace: Built on top of BentoML and HuggingFace's ecosystem (transformers, optimum, peft, accelerate, datasets), providing similar APIs for ease of use.

⛓️ Interoperability: First class support for LangChain and 🤗 Hub allows you to easily chain LLMs together.

🎯 Streamline production deployment: Easily deploy any LLM via openllm build with the following:

  • ☁️ BentoML Cloud: the fastest way to deploy your bento, simple and at scale
  • 🦄️ Yatai: Model Deployment at scale on Kubernetes
  • 🚀 bentoctl: Fast model deployment on AWS SageMaker, Lambda, EC2, GCP, Azure, Heroku, and more!

🍇 Telemetry

OpenLLM collects usage data that helps the team improve the product. Only OpenLLM's internal API calls are reported. We strip out as much potentially sensitive information as possible, and we will never collect user code, model data, or stack traces. Here's the code for usage tracking. You can opt out of usage tracking with the --do-not-track CLI option:

openllm [command] --do-not-track

Or by setting the environment variable OPENLLM_DO_NOT_TRACK=True:

export OPENLLM_DO_NOT_TRACK=True
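When OpenLLM is launched from a Python script rather than a shell, the same opt-out can be applied by passing the variable in the child process environment. A minimal sketch, assuming the openllm executable is on PATH; openllm_env and run_openllm are hypothetical helper names, and only the OPENLLM_DO_NOT_TRACK variable itself comes from the text above.

```python
import os
import subprocess


def openllm_env(base=None):
    """Return a copy of the environment with usage tracking disabled.

    Hypothetical helper. `base` defaults to the current process environment
    and is never modified in place.
    """
    env = dict(os.environ if base is None else base)
    env["OPENLLM_DO_NOT_TRACK"] = "True"
    return env


def run_openllm(args, **kwargs):
    """Run `openllm <args>` as a subprocess with tracking disabled."""
    return subprocess.run(["openllm", *args], env=openllm_env(), **kwargs)


# Usage:
#   run_openllm(["start", "dolly-v2"])
```

Building a fresh environment for the child process leaves the parent script's own environment untouched.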


