Large Language Models Inference API and Applications
Large Language Model (LLM) Inference API and Chatbot 🦙
Inference API for LLMs like LLaMA and Falcon, powered by Lit-GPT from Lightning AI.
pip install llm-inference
Install from main branch
pip install git+https://github.com/aniketmaurya/llm-inference.git@main
Note: You need to manually install Lit-GPT and set up the model weights to use this project.
pip install lit_gpt@git+https://github.com/aniketmaurya/install-lit-gpt.git@install
For Inference
from llm_inference import LLMInference, prepare_weights
from rich import print
path = prepare_weights("EleutherAI/pythia-70m")
model = LLMInference(checkpoint_dir=path)
print(model("New York is located in"))
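To run several prompts in one go, the call above can be wrapped in a small helper. This is a minimal sketch that assumes only what the example shows: the model object is callable with a prompt string and returns printable text. `generate_all` is a hypothetical helper, not part of llm-inference, and `echo_model` is a stand-in so the snippet runs without model weights.

```python
from typing import Callable, Dict, Iterable


def generate_all(model: Callable[[str], str], prompts: Iterable[str]) -> Dict[str, str]:
    """Call `model` once per prompt and collect the outputs keyed by prompt."""
    return {prompt: model(prompt) for prompt in prompts}


def echo_model(prompt: str) -> str:
    # Stand-in for an LLMInference instance (assumption: it is called the same way).
    return prompt + " ..."


if __name__ == "__main__":
    results = generate_all(echo_model, ["New York is located in", "Paris is located in"])
    for prompt, completion in results.items():
        print(f"{prompt!r} -> {completion!r}")
```

To use it with a real model, pass the `model` object created above in place of `echo_model`.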
How to use the Chatbot
from chatbot import LitGPTConversationChain, LitGPTLLM
from llm_inference import prepare_weights
from rich import print
path = str(prepare_weights("lmsys/longchat-7b-16k"))
llm = LitGPTLLM(checkpoint_dir=path)
bot = LitGPTConversationChain.from_llm(llm=llm, verbose=True)
print(bot.send("hi, what is the capital of France?"))
Launch Chatbot App
1. Download weights
from llm_inference import prepare_weights
path = prepare_weights("lmsys/longchat-7b-16k")
2. Launch Gradio App
python examples/chatbot/gradio_demo.py
For deploying as a REST API
Create a Python file app.py and initialize the ServeLLaMA app.
# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest
import lightning as L
component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
lightning run app app.py
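Once the app is running, a client can send prompts to it over HTTP. The sketch below only builds the request with the Python standard library. The URL, the `/predict` path, and the `prompt` field name are assumptions for illustration, because the route and the `PromptRequest` schema are not documented here; check the running app for the actual values.

```python
import json
import urllib.request

# Hypothetical address; replace it with the URL printed when the app starts.
ASSUMED_URL = "http://localhost:8000/predict"


def build_prompt_request(prompt: str, url: str = ASSUMED_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying a single prompt."""
    # Assumption: the server accepts a JSON body with a "prompt" field.
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_prompt_request("New York is located in")
    print(req.get_method(), req.full_url)
    # To send it (requires the app to be running):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.loads(resp.read()))
```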