Langchain Chat Model + LLamaCPP
A working solution for integrating llama.cpp (via the llama-cpp-python bindings) with LangChain.
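Installation
Install from PyPI (the distribution files for version 0.1.2 are listed under Download files below):

pip install langchain-llamacpp-chat-model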
Usage
import os

from llama_cpp.server.app import LlamaProxy
from llama_cpp.server.settings import ModelSettings

from langchain_llamacpp_chat_model import LlamaCppChatModel

# Path to a local GGUF model, here one downloaded with LM Studio.
model_path = os.path.join(
    os.path.expanduser("~/.cache/lm-studio/models"),
    "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
)

settings = ModelSettings(
    model=model_path,
    model_alias="llama3",
    n_gpu_layers=-1,  # -1 offloads all layers to the GPU
    n_ctx=1024,
    n_batch=512,  # Should be between 1 and n_ctx; consider the amount of RAM
    offload_kqv=True,  # Equivalent of f16_kv=True
    chat_format="chatml-function-calling",
    verbose=False,
)

llama_proxy = LlamaProxy(models=[settings])
chat_model = LlamaCppChatModel(llama_proxy=llama_proxy, model_name=settings.model_alias)

# Single response
print(chat_model.invoke("Tell me a joke").content)

# Streamed response, token by token
for chunk in chat_model.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)
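Because LlamaCppChatModel is a LangChain chat model, it should compose with LangChain's standard building blocks. A minimal sketch of an LCEL chain, assuming the chat_model instance created above (the prompt wording is illustrative):

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Build a simple prompt -> model -> string-output chain.
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | chat_model | StrOutputParser()

print(chain.invoke({"topic": "programmers"}))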
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
langchain_llamacpp_chat_model-0.1.2.tar.gz

Built Distribution
langchain_llamacpp_chat_model-0.1.2-py3-none-any.whl
Hashes for langchain_llamacpp_chat_model-0.1.2.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 992e66fdae75ae101c0ba62faab9c731579ec66640876a003f96d6d65c858c6f
MD5 | 94af953f6ce13eebd3c6d5805baa8282
BLAKE2b-256 | aef1201a84a071f58f8c0a073cc0f9f24bc210ff17dd08e3a0f09defe7e74858
Hashes for langchain_llamacpp_chat_model-0.1.2-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 0a23442517c15e8055c0912d534fd99a0b4d67236eab472f06f2ebc15f7bb7f3
MD5 | 7958e6980405b3719a30f423c5b4b441
BLAKE2b-256 | 34a47f18ef67627033989efc50204dbc6e053b32c25874c7b750d66b850e169a
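To verify a downloaded file against the digests above, a minimal sketch in Python (the filename assumes the source distribution was downloaded to the current directory):

import hashlib

# Expected SHA256 for langchain_llamacpp_chat_model-0.1.2.tar.gz, from the table above.
EXPECTED_SHA256 = "992e66fdae75ae101c0ba62faab9c731579ec66640876a003f96d6d65c858c6f"

with open("langchain_llamacpp_chat_model-0.1.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == EXPECTED_SHA256, "hash mismatch: file may be corrupted or tampered with"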