LangChain Chat Model + LlamaCPP
A working solution for integrating llama.cpp (via llama-cpp-python) with LangChain's chat model interface.
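Installation
The package can be installed from PyPI; the distribution name below is inferred from the release files listed under Download files:

pip install langchain-llamacpp-chat-model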
Usage
import os

from langchain_llamacpp_chat_model import LlamaCppChatModel
from llama_cpp.server.app import LlamaProxy
from llama_cpp.server.settings import ModelSettings

model_path = os.path.join(
    os.path.expanduser("~/.cache/lm-studio/models"),
    "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
)

settings = ModelSettings(
    model=model_path,
    model_alias="llama3",
    n_gpu_layers=-1,  # Offload all layers to the GPU
    n_ctx=1024,  # Context window size
    n_batch=512,  # Should be between 1 and n_ctx; size it to the available RAM
    offload_kqv=True,  # Equivalent of f16_kv=True
    chat_format="chatml-function-calling",
    verbose=False,
)

llama_proxy = LlamaProxy(models=[settings])
chat_model = LlamaCppChatModel(llama_proxy=llama_proxy, model_name=settings.model_alias)

# Single response
print(chat_model.invoke("Tell me a joke").content)

# Token-by-token streaming
for chunk in chat_model.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)
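Because LlamaCppChatModel exposes LangChain's standard chat model interface, it should compose with the rest of the LangChain ecosystem. A minimal LCEL sketch, reusing the chat_model built above (the prompt and parser are illustrative, not part of this package):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Chain: prompt template -> local llama.cpp chat model -> plain string
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])
chain = prompt | chat_model | StrOutputParser()

print(chain.invoke({"question": "What is llama.cpp?"}))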
Download files
Download the file for your platform.

Source Distribution: langchain_llamacpp_chat_model-0.1.1.tar.gz
Built Distribution: langchain_llamacpp_chat_model-0.1.1-py3-none-any.whl
Hashes for langchain_llamacpp_chat_model-0.1.1.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 50067330bcef3822d029c520d31fcd9ea4c11affa59d5bf8eef32fa20ee9ad6d
MD5 | 1cea9fa28bee5e222a1abcc466b9d6e1
BLAKE2b-256 | 86188b267d62c8022e9c283dd94c8571241178e57fd00bbb7681511706a7fac2
Hashes for langchain_llamacpp_chat_model-0.1.1-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | f63105b82f700f7eb3bbec9b5d343c98488036ba7884934576055703006d4b88
MD5 | ea6cfa7026209d1a7804d18cdd81bd47
BLAKE2b-256 | 693417729ccb88156a1bf50fda3f0909e88a999deb612f8329358daf5ac96640
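To verify a download against the digests above, you can hash the file locally. A minimal sketch using Python's standard hashlib (the file path is hypothetical; point it at your actual download):

import hashlib

# Hypothetical path to the downloaded wheel
path = "langchain_llamacpp_chat_model-0.1.1-py3-none-any.whl"

with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

# Prints True if the file matches the published SHA256 digest
print(digest == "f63105b82f700f7eb3bbec9b5d343c98488036ba7884934576055703006d4b88")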