
Large Language Model (LLM) Inference API and Chatbot 🦙


Inference API for LLaMA

pip install llm-inference

# to use the chatbot (quote the extras so it also works in zsh)
pip install "llm-inference[chatbot]"

Install from main branch

pip install git+https://github.com/aniketmaurya/llm-inference.git@main

Note: You need to manually install Lit-GPT and set up the model weights to use this project.

pip install lit-gpt@git+https://github.com/Lightning-AI/lit-gpt.git@main

For Inference

import os

from llm_inference import LLMInference

# Use the WEIGHTS env var if set, otherwise the default Falcon-7B path
checkpoint_dir = os.environ.get("WEIGHTS", "checkpoints/tiiuae/falcon-7b")

model = LLMInference(checkpoint_dir=checkpoint_dir, precision="bf16-true")

print(model("New York is located in"))
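The model object is a plain prompt-to-completion callable, so batching several prompts is straightforward. A minimal sketch: `generate_all` is a hypothetical helper (not part of llm-inference) that works with any such callable, including the `LLMInference` instance above.

```python
# Hypothetical helper (not part of llm-inference): run a batch of prompts
# through any prompt -> completion callable and collect results in order.
def generate_all(model, prompts):
    """Return a {prompt: completion} dict, preserving prompt order."""
    return {prompt: model(prompt) for prompt in prompts}
```

For example, `generate_all(model, ["New York is located in", "Paris is located in"])` returns one completion per prompt.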

For deploying as a REST API

Create a Python file app.py and initialize the ServeLLaMA App.

# app.py
import lightning as L

from llm_inference.serve import ServeLLaMA, Response, PromptRequest

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)

Then start the server with:

lightning run app app.py
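Once the app is running you can call it over HTTP. The sketch below uses only the standard library; the route and payload shape are assumptions, so check the endpoint printed in the Lightning app logs for your deployment.

```python
# Client sketch for the running app. URL, route, and payload shape are
# assumptions -- adjust them to what the Lightning app logs report.
import json
import urllib.request


def build_request(prompt, url="http://127.0.0.1:7501/predict"):
    """Build a JSON POST request carrying the prompt."""
    payload = json.dumps({"prompt": prompt}).encode()
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )


if __name__ == "__main__":
    req = build_request("New York is located in")
    with urllib.request.urlopen(req) as resp:  # needs the app to be running
        print(json.load(resp))
```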

How to use the Chatbot

from chatbot import LLaMAChatBot

checkpoint_dir = "weights"

bot = LLaMAChatBot(checkpoint_dir=checkpoint_dir)

print(bot.send("hi, what is the capital of France?"))
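Since `bot.send` takes a message and returns a reply, a multi-turn session is just repeated calls. A minimal sketch: `run_conversation` is a hypothetical helper (not part of the chatbot package) that drives any send function and keeps a transcript.

```python
# Hypothetical helper (not part of the chatbot package): drive a multi-turn
# session through any message -> reply function, e.g. bot.send above, and
# record the (prompt, reply) transcript.
def run_conversation(send_fn, prompts):
    """Send each prompt in order and return the (prompt, reply) pairs."""
    return [(prompt, send_fn(prompt)) for prompt in prompts]
```

For example, `run_conversation(bot.send, ["hi, what is the capital of France?", "and of Germany?"])` returns the full exchange.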

Distributions

llm_inference-0.0.4.tar.gz (source distribution, 414.5 kB)
llm_inference-0.0.4-py3-none-any.whl (Python 3 wheel, 10.4 kB)
