Large Language Models Inference API and Applications

Large Language Model (LLM) Inference API and Chatbot 🦙

Inference API for LLMs like LLaMA and Falcon, powered by Lit-GPT from Lightning AI

pip install llm-inference

Install from the main branch

pip install git+https://github.com/aniketmaurya/llm-inference.git@main

Note: You need to install Lit-GPT manually and set up the model weights to use this project.

pip install lit_gpt@git+https://github.com/aniketmaurya/install-lit-gpt.git@install
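Since Lit-GPT is not installed automatically, it can help to check for it before importing llm_inference. The following is a minimal sketch using only the standard library. The module name `lit_gpt` is taken from the install command above; the helper name `is_installed` is hypothetical and not part of this project.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# "lit_gpt" is the module name implied by the pip command above.
if not is_installed("lit_gpt"):
    print("Lit-GPT not found; install it with the pip command above.")
```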

For Inference

from llm_inference import LLMInference, prepare_weights
from rich import print

path = prepare_weights("EleutherAI/pythia-70m")
model = LLMInference(checkpoint_dir=path)

print(model("New York is located in"))
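The example above calls the model once with a single prompt. To run several prompts, a small wrapper is enough. This is a sketch under the assumption that the model object is callable with a prompt string and returns text, as the `print(model(...))` call suggests. The names `complete_all` and `fake_model` are hypothetical, and `fake_model` only stands in for a real `LLMInference` instance so the snippet runs without weights.

```python
from typing import Callable, Dict, Iterable


def complete_all(generate: Callable[[str], str], prompts: Iterable[str]) -> Dict[str, str]:
    """Run each prompt through `generate` and map prompt -> completion."""
    return {prompt: generate(prompt) for prompt in prompts}


# Stand-in for the real model (assumption: the model is called as model(prompt)).
def fake_model(prompt: str) -> str:
    return prompt + " ..."


results = complete_all(fake_model, ["New York is located in", "Paris is located in"])
for prompt, completion in results.items():
    print(f"{prompt!r} -> {completion!r}")
```

With real weights prepared, passing `model` in place of `fake_model` should work the same way, provided the assumption about the call signature holds.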

How to use the Chatbot

from llm_chain import LitGPTConversationChain, LitGPTLLM
from llm_inference import prepare_weights
from rich import print


path = str(prepare_weights("lmsys/longchat-13b-16k"))
llm = LitGPTLLM(checkpoint_dir=path, quantize="bnb.nf4")  # 8.4GB GPU memory
bot = LitGPTConversationChain.from_llm(llm=llm, verbose=True)

print(bot.send("hi, what is the capital of France?"))
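A conversation usually spans more than one message. The sketch below shows one way to drive several turns through any object that exposes a `send` method like the one used above. It assumes `send` takes a string and returns the reply as a string, which the source does not confirm. `Chatbot`, `run_turns`, and `EchoBot` are hypothetical names, and `EchoBot` is a stand-in so the snippet runs without a GPU.

```python
from typing import List, Protocol, Tuple


class Chatbot(Protocol):
    """Anything with a send(message) -> reply method (assumed interface)."""

    def send(self, message: str) -> str: ...


def run_turns(bot: Chatbot, messages: List[str]) -> List[Tuple[str, str]]:
    """Send messages in order and collect (message, reply) pairs."""
    return [(message, bot.send(message)) for message in messages]


class EchoBot:
    """Stand-in bot that keeps a history, mimicking a stateful conversation."""

    def __init__(self) -> None:
        self.history: List[str] = []

    def send(self, message: str) -> str:
        self.history.append(message)
        return f"(turn {len(self.history)}) you said: {message}"


transcript = run_turns(EchoBot(), ["hi", "what is the capital of France?"])
for message, reply in transcript:
    print(f"user: {message}\nbot:  {reply}")
```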

Launch Chatbot App

1. Download weights

from llm_inference import prepare_weights
path = prepare_weights("lmsys/longchat-13b-16k")

2. Launch Gradio App

python examples/chatbot/gradio_demo.py

For deploying as a REST API

Create a Python file named app.py and initialize the ServeLLaMA app.

# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest

import lightning as L

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)

Then start the app from the terminal:

lightning run app app.py
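Once the app is running, a client can send it a prompt over HTTP. The sketch below only builds the request, using the standard library, so it runs without a server. The endpoint path `/predict`, the JSON field `prompt`, and the address `http://127.0.0.1:7501` are all assumptions for illustration; check the running app's generated API docs for the actual route and the fields of `PromptRequest`. The helper name `build_prompt_request` is hypothetical.

```python
import json
from urllib.request import Request


def build_prompt_request(base_url: str, prompt: str) -> Request:
    """Build (but do not send) a JSON POST request carrying a prompt."""
    # ASSUMPTION: the "/predict" path and the "prompt" field are illustrative,
    # not confirmed by this project's documentation.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# ASSUMPTION: host and port are placeholders for wherever the app is served.
request = build_prompt_request("http://127.0.0.1:7501", "New York is located in")
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` and decode the JSON response body.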
