Langchain Chat Model + LlamaCPP
A working solution for integrating llama.cpp (via llama-cpp-python) with LangChain.
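Installation
The package name on PyPI appears to be langchain-llamacpp-chat-model (inferred from the distribution file names below), so installation should be:

pip install langchain-llamacpp-chat-model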
Usage
import os

from llama_cpp.server.app import LlamaProxy
from llama_cpp.server.settings import ModelSettings

from langchain_llamacpp_chat_model import LlamaCppChatModel

model_path = os.path.join(
    os.path.expanduser("~/.cache/lm-studio/models"),
    "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
)

settings = ModelSettings(
    model=model_path,
    model_alias="llama3",
    n_gpu_layers=-1,  # Offload all layers to the GPU
    n_ctx=1024,
    n_batch=512,  # Should be between 1 and n_ctx; consider the amount of RAM
    offload_kqv=True,  # Equivalent of f16_kv=True
    chat_format="chatml-function-calling",
    verbose=False,
)

llama_proxy = LlamaProxy(models=[settings])
chat_model = LlamaCppChatModel(llama_proxy=llama_proxy, model_name=settings.model_alias)

print(chat_model.invoke("Tell me a joke").content)

# stream() returns a generator of chunks, so it must be iterated.
for chunk in chat_model.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)
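Because LlamaCppChatModel implements LangChain's standard chat-model interface, it should also compose with LangChain Expression Language (LCEL) primitives. A minimal sketch, assuming the chat_model built above and standard langchain-core imports (this pipeline is an illustration, not part of the package's documented API):

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Prompt -> model -> plain-string pipeline; chat_model is the instance
# created in the Usage snippet above (assumed to behave like any other
# LangChain chat model).
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])
chain = prompt | chat_model | StrOutputParser()

print(chain.invoke({"question": "Tell me a joke"}))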
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution: langchain_llamacpp_chat_model-0.1.0.tar.gz
Built Distribution: langchain_llamacpp_chat_model-0.1.0-py3-none-any.whl
Hashes for langchain_llamacpp_chat_model-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 1837b63a1c144968d889e6b10760c478dde9ad87296feec1cff1f11e2cc363ad
MD5 | 4d98dc91446d7fcddd26066a3955c912
BLAKE2b-256 | 63bf22ac5bbb248afebfe1a8114feadef7716455bdcf24cfeec343c9de6ce079
Hashes for langchain_llamacpp_chat_model-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 770d41be9b61c721f22e8a6ce049a36ee9078eea7665f1a0ee0f117ac6e822d1
MD5 | 6b52fd07abe06bf8f44ae78f6bc884e0
BLAKE2b-256 | 996f3718650f48e571bf2f3942dc0f84c15684f1826f97eb18a549a7db877a9a