LLM unified service

Modelz LLM

Modelz LLM is an inference server that makes it easier to run open source large language models (LLMs), such as FastChat, LLaMA, and ChatGLM, in local or cloud environments behind an OpenAI-compatible API.

Features

OpenAI-compatible API: Modelz LLM exposes an OpenAI-compatible API for LLMs, so you can use the OpenAI Python SDK or LangChain to interact with the model.
Self-hosted: Modelz LLM can be deployed in local or cloud environments.
Open source LLMs: Modelz LLM supports open source LLMs, such as FastChat, LLaMA, and ChatGLM.
Cloud native: Docker images are provided for different LLMs and can be deployed on Kubernetes or other cloud environments (e.g. Modelz).

Quick Start

Install

pip install modelz-llm
# or install from source with the gpu extra
pip install "modelz-llm[gpu] @ git+https://github.com/tensorchord/modelz-llm.git"

Run the self-hosted API server

First start the self-hosted API server. The command below serves the Bloomz 560M model on CPU:

modelz-llm -m bigscience/bloomz-560m --device cpu
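
Loading a model can take a while, so a client script may want to wait until the server accepts connections before sending requests. The following is a minimal sketch, not part of modelz-llm: `wait_for_port` is a hypothetical helper, and port 8000 is assumed because the client examples in this document use `http://localhost:8000`.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 8000, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening;
            # it does not prove the model has finished loading.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing is listening yet: back off briefly and retry.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port()` after launching `modelz-llm` and checking the return value gives a simple guard before the first API call.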

Currently, we support the following models:

Model Name	Hugging Face Model	Docker Image	Recommended Hardware
FastChat T5	`lmsys/fastchat-t5-3b-v1.0`	modelzai/llm-fastchat-t5-3b	NVIDIA L4 (24GB)
Vicuna 7B Delta V1.1	`lmsys/vicuna-7b-delta-v1.1`	modelzai/llm-vicuna-7b	NVIDIA A100 (40GB)
LLaMA 7B	`decapoda-research/llama-7b-hf`	modelzai/llm-llama-7b	NVIDIA A100 (40GB)
ChatGLM 6B INT4	`THUDM/chatglm-6b-int4`	modelzai/llm-chatglm-6b-int4	NVIDIA T4 (16GB)
ChatGLM 6B	`THUDM/chatglm-6b`	modelzai/llm-chatglm-6b	NVIDIA L4 (24GB)
Bloomz 560M	`bigscience/bloomz-560m`	modelzai/llm-bloomz-560m	CPU
Bloomz 1.7B	`bigscience/bloomz-1b7`		CPU
Bloomz 3B	`bigscience/bloomz-3b`		NVIDIA L4 (24GB)
Bloomz 7.1B	`bigscience/bloomz-7b1`		NVIDIA A100 (40GB)
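
To switch between the models in the table without retyping their Hugging Face IDs, the launch command can be generated from a small lookup. This is an illustrative sketch only: `MODELS` and `launch_command` are hypothetical names, and the only flags used are `-m` and `--device`, as shown in the command above. `cpu` is the only `--device` value that appears in this document; other values are an assumption to check against `modelz-llm --help`.

```python
import shlex

# Hugging Face model IDs copied from the table of supported models.
MODELS = {
    "FastChat T5": "lmsys/fastchat-t5-3b-v1.0",
    "Vicuna 7B Delta V1.1": "lmsys/vicuna-7b-delta-v1.1",
    "LLaMA 7B": "decapoda-research/llama-7b-hf",
    "ChatGLM 6B INT4": "THUDM/chatglm-6b-int4",
    "ChatGLM 6B": "THUDM/chatglm-6b",
    "Bloomz 560M": "bigscience/bloomz-560m",
    "Bloomz 1.7B": "bigscience/bloomz-1b7",
    "Bloomz 3B": "bigscience/bloomz-3b",
    "Bloomz 7.1B": "bigscience/bloomz-7b1",
}


def launch_command(model_name: str, device: str = "cpu") -> str:
    """Build the shell command that starts the server for a model in the table."""
    model_id = MODELS[model_name]  # raises KeyError for names not in the table
    return shlex.join(["modelz-llm", "-m", model_id, "--device", device])


print(launch_command("Bloomz 560M"))
```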

Use the OpenAI Python SDK

Once the server is running, you can use the OpenAI Python SDK to interact with the model:

import openai

# Point the SDK (openai<1.0 interface) at the local server; a placeholder key is enough.
openai.api_base = "http://localhost:8000"
openai.api_key = "any"

# Create a chat completion.
chat_completion = openai.ChatCompletion.create(
    model="any",
    messages=[{"role": "user", "content": "Hello world"}],
)
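
To read the model's answer out of the returned object, the usual OpenAI chat completion layout can be followed. The sketch below assumes the server returns that layout (`choices[0].message.content`), which is what "OpenAI compatible" suggests but is not spelled out in this document; `first_reply` is a hypothetical helper, and the sample dictionary is made up for illustration.

```python
def first_reply(chat_completion) -> str:
    """Return the text of the first choice in an OpenAI-style chat completion."""
    choices = chat_completion["choices"]
    if not choices:
        raise ValueError("chat completion contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI chat completion layout (not real server output).
sample = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Hi there!"}}
    ]
}
print(first_reply(sample))
```

With the openai<1.0 SDK, the object returned by `openai.ChatCompletion.create` supports the same dictionary-style access, so `first_reply(chat_completion)` should work on it directly.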

Integrate with LangChain

You can also integrate modelz-llm with LangChain:

import openai
from langchain.llms import OpenAI

openai.api_base = "http://localhost:8000"
openai.api_key = "any"

llm = OpenAI()

llm.generate(prompts=["Could you please recommend some movies?"])

Deploy on Modelz

You can also deploy modelz-llm directly on Modelz.

Supported APIs

Modelz LLM supports the following APIs for interacting with open source large language models:

/completions
/chat/completions
/embeddings
/engines/<any>/embeddings
/v1/completions
/v1/chat/completions
/v1/embeddings
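
These routes can also be called without any SDK. The following is a minimal standard-library sketch, assuming the server listens on `http://localhost:8000` (as in the SDK examples) and accepts OpenAI-style JSON bodies; `build_request` and `post_json` are hypothetical helpers, not part of modelz-llm.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed address, taken from the SDK examples


def build_request(path: str, payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for one of the routes listed above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example payloads in the OpenAI request format.
chat_payload = {"model": "any", "messages": [{"role": "user", "content": "Hello world"}]}
embedding_payload = {"model": "any", "input": "Hello world"}
```

With the server running, `post_json("/v1/chat/completions", chat_payload)` or `post_json("/v1/embeddings", embedding_payload)` would send the request; `build_request` alone can be used to inspect what would be sent.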

Acknowledgements

FastChat for the prompt generation logic.
Mosec for the inference engine.

Release history

Version	Release date
23.7.4 (this version)	Jul 24, 2023
23.7.3	Jul 24, 2023
23.7.2	Jul 24, 2023
23.7.1	Jul 21, 2023
23.6.13	Jun 9, 2023
23.6.12	Jun 6, 2023
23.6.11	Jun 6, 2023
23.6.9	Jun 6, 2023
23.6.8	Jun 6, 2023
23.6.7	Jun 6, 2023
23.6.6	Jun 5, 2023
23.6.5	Jun 2, 2023
23.6.4	Jun 2, 2023
23.6.3	Jun 2, 2023
23.6.2	Jun 2, 2023
23.6.1	Jun 2, 2023
23.5.21	Jun 2, 2023
23.5.20	Jun 2, 2023
23.5.19	Jun 2, 2023
23.5.18	Jun 2, 2023
23.5.12	May 25, 2023

Download files

Source Distribution

modelz-llm-23.7.4.tar.gz (21.2 kB), uploaded Jul 24, 2023

Built Distribution

modelz_llm-23.7.4-py3-none-any.whl (12.7 kB), uploaded Jul 24, 2023, Python 3

Hashes for modelz-llm-23.7.4.tar.gz
Algorithm	Hash digest
SHA256	`3a5a7fcde99f82a9d6c208fcb4bac683030cf7d5c14c9caad3507cf2554226f2`
MD5	`4990fb66128a4c0626969b45cdcc501d`
BLAKE2b-256	`121cb914663cf5932e2439b5af2caae24524c077b9e17aea01d9c23f9a99bc93`

Hashes for modelz_llm-23.7.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9e7ca6182421a49ba7e43309af0991426ab82a5f6a42e28773d10ed0f7369d1d`
MD5	`d3ab0c3ab03c3f0c04bea84e9cdb4351`
BLAKE2b-256	`188b8ed002926c89fc211b65f682d5e56fa22089a2ca06e19a3586ad803bd5e5`