Large Language Models Inference API and Applications

Project description

Large Language Model (LLM) Inference API and Chatbot 🦙

Inference API for LLMs like LLaMA and Falcon powered by Lit-GPT from Lightning AI

pip install llm-inference

Install from main branch

pip install git+https://github.com/aniketmaurya/llm-inference.git@main

Note: You need to manually install Lit-GPT and set up the model weights to use this project.

pip install lit_gpt@git+https://github.com/aniketmaurya/install-lit-gpt.git@install
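Because Lit-GPT is installed separately, it can help to confirm that both packages are importable before loading any weights. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the module name `lit_gpt` is assumed from the pip command above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "llm_inference" comes from the imports in this README;
# "lit_gpt" is assumed to be the importable module name for Lit-GPT.
missing = missing_packages(["llm_inference", "lit_gpt"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

An empty result only means the modules can be located; it does not check versions or model weights.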

For Inference

from llm_inference import LLMInference, prepare_weights
from rich import print

path = prepare_weights("EleutherAI/pythia-70m")
model = LLMInference(checkpoint_dir=path)

print(model("New York is located in"))
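To run several prompts in one go, a small helper can loop over them. This is a sketch that assumes the model object is callable with a prompt string and returns text, as the example above suggests; `complete_all` and `fake_generate` are hypothetical names, and the stand-in generator is there only so the snippet runs without model weights.

```python
from typing import Callable, Dict, Iterable


def complete_all(generate: Callable[[str], str], prompts: Iterable[str]) -> Dict[str, str]:
    """Run each prompt through `generate` and map each prompt to its completion."""
    return {prompt: generate(prompt) for prompt in prompts}


def fake_generate(prompt: str) -> str:
    """Stand-in for a real model call; appends a placeholder to the prompt."""
    return prompt + " ..."


results = complete_all(fake_generate, ["New York is located in", "Paris is located in"])
for prompt, completion in results.items():
    print(f"{prompt!r} -> {completion!r}")
```

With real weights prepared, passing `model` in place of `fake_generate` should behave the same way, provided the call returns a string.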

How to use the Chatbot

from llm_chain import LitGPTConversationChain, LitGPTLLM
from llm_inference import prepare_weights
from rich import print


path = str(prepare_weights("lmsys/longchat-13b-16k"))
llm = LitGPTLLM(checkpoint_dir=path, quantize="bnb.nf4")  # 8.4GB GPU memory
bot = LitGPTConversationChain.from_llm(llm=llm, verbose=True)

print(bot.send("hi, what is the capital of France?"))
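A multi-turn conversation can be driven by a simple loop around the `send` call. The sketch below assumes `bot.send` takes a message string and returns the reply, as in the example above; `run_chat` and `echo_send` are hypothetical names, and the echo function stands in for the real bot so the snippet runs without a GPU.

```python
from typing import Callable, Iterable, List, Tuple


def run_chat(send: Callable[[str], str], messages: Iterable[str]) -> List[Tuple[str, str]]:
    """Send each message in turn and collect (message, reply) pairs.

    Blank messages are skipped; "quit" or "exit" ends the conversation.
    """
    transcript = []
    for message in messages:
        message = message.strip()
        if not message:
            continue
        if message.lower() in {"quit", "exit"}:
            break
        transcript.append((message, send(message)))
    return transcript


def echo_send(message: str) -> str:
    """Stand-in for bot.send; returns the message with a marker."""
    return f"(echo) {message}"


transcript = run_chat(
    echo_send,
    ["hi", "", "what is the capital of France?", "quit", "this is never sent"],
)
for message, reply in transcript:
    print(f"you: {message}\nbot: {reply}")
```

For an interactive session, `messages` could be a generator that yields `input("you: ")` lines, with `bot.send` passed as `send`.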

Launch Chatbot App

1. Download weights

from llm_inference import prepare_weights
path = prepare_weights("lmsys/longchat-13b-16k")

2. Launch Gradio App

python examples/chatbot/gradio_demo.py

For deploying as a REST API

Create a Python file app.py and initialize the ServeLLaMA App.

# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest

import lightning as L

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)

Then launch the app from the terminal:

lightning run app app.py
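Once the app is running, a client can send it a prompt over HTTP. The sketch below only builds the request and does not send it. The host, port, `/predict` route, and `prompt` field name are all assumptions that this README does not confirm, so check the running app's API documentation for the actual values; `build_prompt_request` is a hypothetical helper.

```python
import json
import urllib.request

# Assumed endpoint: not taken from this README. Adjust to match your running app.
ASSUMED_URL = "http://127.0.0.1:7501/predict"


def build_prompt_request(prompt, url=ASSUMED_URL):
    """Build (but do not send) a JSON POST request carrying the prompt."""
    # The "prompt" key is an assumption about the request schema.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prompt_request("New York is located in")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```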

Download files

Source Distribution

llm_inference-0.0.6.tar.gz (810.4 kB view details)

Uploaded Source

Built Distribution

llm_inference-0.0.6-py3-none-any.whl (12.0 kB view details)

Uploaded Python 3

File details

Details for the file llm_inference-0.0.6.tar.gz.

File metadata

  • Download URL: llm_inference-0.0.6.tar.gz
  • Upload date:
  • Size: 810.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for llm_inference-0.0.6.tar.gz
Algorithm Hash digest
SHA256 5bc77680d1df2f9af5b4cc9187fd47c473a2c98a32d0c59eb7427949ad19cbfe
MD5 c813fdea738fb735019ce0ee3237a0ef
BLAKE2b-256 541756ce7b12de3af15ae7acf9358760ad9689e36392da2491888fa7299ad6b5
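A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch; the helper names `sha256_of_file` and `matches_sha256` are hypothetical, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_sha256("llm_inference-0.0.6.tar.gz", "5bc77680d1df2f9af5b4cc9187fd47c473a2c98a32d0c59eb7427949ad19cbfe")` should return True for an intact download of the source distribution.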

File details

Details for the file llm_inference-0.0.6-py3-none-any.whl.

File hashes

Hashes for llm_inference-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 08fe641c4511b1ad465a503c02b58752cf785ee00f4516a3b63e99103d1ea6ac
MD5 5a50437efe21dab18314b9a3f2441bea
BLAKE2b-256 c6cd4bfeab074dd703d4977641711c16e5f4641d550c55d2890b797c71f90af7
