Langchain LLM
Get Started
Install
```shell
pip install langchain_llm==0.4.15
```
Inference Usage
HuggingFace Inference
Completion Usage
```python
from langchain_llm import HuggingFaceLLM

llm = HuggingFaceLLM(
    model_name="qwen-7b-chat",
    model_path="/data/checkpoints/Qwen-7B-Chat",
    load_model_kwargs={"device_map": "auto"},
)

# invoke method
prompt = "<|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))

# token streaming
for chunk in llm.stream(prompt, stop=["<|im_end|>"]):
    print(chunk, end="", flush=True)

# OpenAI-compatible usage
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))

# OpenAI-compatible streaming
for chunk in llm.call_as_openai(prompt, stop=["<|im_end|>"], stream=True):
    print(chunk.choices[0].text, end="", flush=True)
```
Chat Completion Usage
```python
from langchain_llm import ChatHuggingFace

chat_llm = ChatHuggingFace(llm=llm)

# invoke method
query = "Who are you?"
print(chat_llm.invoke(query))

# token streaming
for chunk in chat_llm.stream(query):
    print(chunk.content, end="", flush=True)

# OpenAI-compatible usage
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))

# OpenAI-compatible streaming
for chunk in chat_llm.call_as_openai(messages, stream=True):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
vLLM Inference
Completion Usage
```python
from langchain_llm import VLLM

llm = VLLM(
    model_name="qwen",
    model="/data/checkpoints/Qwen-7B-Chat",
    trust_remote_code=True,
)

# invoke method
prompt = "<|im_start|>user\nWho are you?<|im_end|>\n<|im_start|>assistant\n"
print(llm.invoke(prompt, stop=["<|im_end|>"]))

# OpenAI-compatible usage
print(llm.call_as_openai(prompt, stop=["<|im_end|>"]))
```
Chat Completion Usage
```python
from langchain_llm import ChatVLLM

chat_llm = ChatVLLM(llm=llm)

# invoke method
query = "Who are you?"
print(chat_llm.invoke(query))

# OpenAI-compatible usage
messages = [
    {"role": "user", "content": query}
]
print(chat_llm.call_as_openai(messages))
```
Custom Chat template
```python
from langchain_llm import BaseTemplate, ChatHuggingFace

class CustomTemplate(BaseTemplate):
    @property
    def template(self) -> str:
        return (
            "{% for message in messages %}"
            "{{ '<|im_start|>' + message['role'] + '\\n' + message['content'] + '<|im_end|>' + '\\n' }}"
            "{% endfor %}"
            "{% if add_generation_prompt %}"
            "{{ '<|im_start|>assistant\\n' }}"
            "{% endif %}"
        )

chat_llm = ChatHuggingFace(
    llm=llm,
    prompt_adapter=CustomTemplate(),
)
```
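The Jinja2 template above renders a message list into the ChatML format that the Qwen examples earlier use as a raw prompt. A pure-Python mirror of that logic (for illustration only, not part of the library) shows the string it produces:

```python
# Pure-Python mirror of the Jinja2 template above (illustration only):
# each message becomes "<|im_start|>{role}\n{content}<|im_end|>\n", and
# add_generation_prompt opens an assistant turn for the model to complete.
def render_chatml(messages, add_generation_prompt=True):
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
print(render_chatml(messages))
```

Rendering the single-turn example from the completion section yields exactly the hand-written prompt used there.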
Load Model Kwargs
- `model_name_or_path`: model name or path.
- `use_fast_tokenizer`: whether to use a fast tokenizer; defaults to `False`.
- `device_map`: device placement, e.g. `"auto"` or `"cuda:0"`.
- `dtype`: model precision, one of `"half"`, `"bfloat16"`, `"float32"`.
- `load_in_8bit`: load the model in 8-bit quantization.
- `load_in_4bit`: load the model in 4-bit quantization.
- `rope_scaling`: scaling strategy for the RoPE embeddings; `Literal["linear", "dynamic"]`.
- `flash_attn`: enable FlashAttention-2.
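These options travel as a plain `load_model_kwargs` dict. A small sketch of a hypothetical helper (not part of `langchain_llm`) that sanity-checks the documented value sets before handing the dict to `HuggingFaceLLM`:

```python
# Hypothetical helper: validate a load_model_kwargs dict against the
# documented option values before passing it to HuggingFaceLLM.
VALID_DTYPES = {"half", "bfloat16", "float32"}
VALID_ROPE = {"linear", "dynamic"}

def check_load_kwargs(kwargs):
    if "dtype" in kwargs and kwargs["dtype"] not in VALID_DTYPES:
        raise ValueError(f"dtype must be one of {sorted(VALID_DTYPES)}")
    if "rope_scaling" in kwargs and kwargs["rope_scaling"] not in VALID_ROPE:
        raise ValueError(f"rope_scaling must be one of {sorted(VALID_ROPE)}")
    if kwargs.get("load_in_8bit") and kwargs.get("load_in_4bit"):
        raise ValueError("load_in_8bit and load_in_4bit are mutually exclusive")
    return kwargs

kwargs = check_load_kwargs({
    "device_map": "auto",
    "dtype": "bfloat16",
    "flash_attn": True,
})
```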
Merge LoRA Model

```python
from langchain_llm import apply_lora

apply_lora("base_model_path", "lora_path", "target_model_path")
```
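Merging folds the adapter weights into the base model so the merged checkpoint can be served without the adapter. For a LoRA pair (A, B) of rank r, each adapted weight becomes W' = W + (alpha / r) * B @ A. A toy sketch of that arithmetic on plain Python lists (for illustration only, not the actual implementation):

```python
# Toy illustration of the LoRA merge formula W' = W + (alpha / r) * B @ A.
# Real merging operates on model tensors; this shows only the arithmetic.
def merge_lora(W, A, B, alpha, r):
    rows, cols, k = len(W), len(W[0]), len(A)  # B is rows x k, A is k x cols
    scale = alpha / r
    return [
        [
            W[i][j] + scale * sum(B[i][t] * A[t][j] for t in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]

W = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weight
B = [[1.0], [0.0]]             # 2x1 adapter factor (rank r = 1)
A = [[0.5, 0.5]]               # 1x2 adapter factor
merged = merge_lora(W, A, B, alpha=2, r=1)  # -> [[2.0, 1.0], [0.0, 1.0]]
```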