# Large Language Model (LLM) Inference API and Chatbot 🦙

Inference API for LLaMA.
## Installation

```shell
pip install llm-inference

# to use the chatbot
pip install "llm-inference[chatbot]"
```
### Install from the main branch

```shell
pip install git+https://github.com/aniketmaurya/llm-inference.git@main
```
> **Note:** You need to install Lit-GPT manually and set up the model weights before using this project.

```shell
pip install lit-gpt@git+https://github.com/Lightning-AI/lit-gpt.git@main
```
## For Inference

```python
import os

from llm_inference import LLMInference

# point checkpoint_dir at the downloaded weights (set via the WEIGHTS env var)
checkpoint_dir = os.environ.get("WEIGHTS", "checkpoints/tiiuae/falcon-7b")

model = LLMInference(checkpoint_dir=checkpoint_dir, precision="bf16-true")
print(model("New York is located in"))
```
## For deploying as a REST API

Create a Python file `app.py` and initialize the `ServeLLaMA` app:

```python
# app.py
import lightning as L

from llm_inference.serve import PromptRequest, Response, ServeLLaMA

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
```

Then run the app:

```shell
lightning run app app.py
```
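Once the app is running, clients send a prompt in a JSON request body. A minimal client sketch — the endpoint URL and the single-field `prompt` schema are assumptions for illustration, not confirmed by this README:

```python
import json
from urllib import request


def build_prompt_request(
    prompt: str,
    url: str = "http://127.0.0.1:8000/predict",  # assumed endpoint
) -> request.Request:
    """Build a POST request carrying the prompt as a JSON body (schema assumed)."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_prompt_request("New York is located in")
print(req.get_full_url())        # http://127.0.0.1:8000/predict
print(req.data.decode("utf-8"))  # {"prompt": "New York is located in"}
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) requires the server from the previous step to be running.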
## How to use the Chatbot

```python
from chatbot import LLaMAChatBot

checkpoint_dir = "weights"

bot = LLaMAChatBot(checkpoint_dir=checkpoint_dir)
print(bot.send("hi, what is the capital of France?"))
```
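A chatbot differs from plain inference in that it keeps conversation state across `send` calls. A toy sketch of that pattern with a stubbed model — `StubChatBot` and its canned replies are hypothetical, not part of the `chatbot` package:

```python
class StubChatBot:
    """Illustrative stand-in for a stateful chatbot: accumulates dialogue turns.

    A real bot would condition text generation on the stored history instead
    of returning canned replies.
    """

    def __init__(self) -> None:
        self.history: list[tuple[str, str]] = []  # (user message, bot reply)

    def send(self, message: str) -> str:
        reply = f"stub reply #{len(self.history) + 1}"  # stand-in for generation
        self.history.append((message, reply))
        return reply


bot = StubChatBot()
print(bot.send("hi"))     # stub reply #1
print(bot.send("again"))  # stub reply #2
```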
## Download files

Source Distribution: `llm_inference-0.0.4.tar.gz` (414.5 kB)

### Built Distribution
Hashes for `llm_inference-0.0.4-py3-none-any.whl`:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5cbaddc39176e2b81ecd550ac16e680eab125f8e216e72a2595b66bc5073819e` |
| MD5 | `3b127708ef662e34eec1a3ebbfe1dde0` |
| BLAKE2b-256 | `82e722774874c64768a90cff3a2a1d5dad6c0f8fcab78c992c8e1dbc5f268aaa` |
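To check a downloaded file against the published SHA256 digest, hash its bytes with the standard library; this sketch uses in-memory bytes, but for a real download you would read the file in binary mode:

```python
import hashlib


def sha256_hexdigest(data: bytes) -> str:
    """Return the hex SHA-256 digest of file contents, for comparison
    against the published digest above."""
    return hashlib.sha256(data).hexdigest()


# for a real check: data = open("llm_inference-0.0.4-py3-none-any.whl", "rb").read()
digest = sha256_hexdigest(b"example contents")
print(len(digest))  # 64 hex characters
```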