
Large Language Model (LLM) Inference API and Chatbot 🦙


Inference API for LLaMA

pip install llm-inference

# to use the chatbot (quote the extras so it also works in zsh)
pip install "llm-inference[chatbot]"

Install from main branch

pip install git+https://github.com/aniketmaurya/llm-inference.git@main

Note: You need to manually install Lit-GPT and set up the model weights to use this project.

pip install lit-gpt@git+https://github.com/Lightning-AI/lit-gpt.git@main

For Inference

import os

from llm_inference import LLMInference

# Use the WEIGHTS env var if set, otherwise the default Falcon-7B path
checkpoint_dir = os.environ.get("WEIGHTS", "checkpoints/tiiuae/falcon-7b")

model = LLMInference(checkpoint_dir=checkpoint_dir, precision="bf16-true")

print(model("New York is located in"))
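The model object is a plain prompt-to-completion callable, so batching several prompts is straightforward. A minimal sketch: `generate_all` is a hypothetical helper (not part of llm-inference) that works with any such callable, including the `LLMInference` instance above.

```python
# Hypothetical helper (not part of llm-inference): run a batch of prompts
# through any prompt -> completion callable and collect results in order.
def generate_all(model, prompts):
    """Return a {prompt: completion} dict, preserving prompt order."""
    return {prompt: model(prompt) for prompt in prompts}
```

For example, `generate_all(model, ["New York is located in", "Paris is located in"])` returns one completion per prompt.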

For deploying as a REST API

Create a Python file app.py and initialize the ServeLLaMA App.

# app.py
import lightning as L

from llm_inference.serve import ServeLLaMA, Response, PromptRequest

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)

Then start the server with:

lightning run app app.py
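Once the app is running you can call it over HTTP. The sketch below uses only the standard library; the route and payload shape are assumptions, so check the endpoint printed in the Lightning app logs for your deployment.

```python
# Client sketch for the running app. URL, route, and payload shape are
# assumptions -- adjust them to what the Lightning app logs report.
import json
import urllib.request


def build_request(prompt, url="http://127.0.0.1:7501/predict"):
    """Build a JSON POST request carrying the prompt."""
    payload = json.dumps({"prompt": prompt}).encode()
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


if __name__ == "__main__":
    req = build_request("New York is located in")
    with urllib.request.urlopen(req) as resp:  # needs the app to be running
        print(json.load(resp))
```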

How to use the Chatbot

from chatbot import LLaMAChatBot

checkpoint_dir = "weights"

bot = LLaMAChatBot(checkpoint_dir=checkpoint_dir)

print(bot.send("hi, what is the capital of France?"))
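Since `bot.send` takes a message and returns a reply, a multi-turn session is just repeated calls. A minimal sketch: `run_conversation` is a hypothetical helper (not part of the chatbot package) that drives any send function and keeps a transcript.

```python
# Hypothetical helper (not part of the chatbot package): drive a multi-turn
# session through any message -> reply function, e.g. bot.send above, and
# record the (prompt, reply) transcript.
def run_conversation(send_fn, prompts):
    """Send each prompt in order and return the (prompt, reply) pairs."""
    return [(prompt, send_fn(prompt)) for prompt in prompts]
```

For example, `run_conversation(bot.send, ["hi, what is the capital of France?", "and of Germany?"])` returns the full exchange.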

Distributions

llm_inference-0.0.4.tar.gz (source distribution, 414.5 kB)
llm_inference-0.0.4-py3-none-any.whl (Python 3 wheel, 10.4 kB)
