Project description
Langchain Chat Model + LLamaCPP
A working solution for integrating LLamaCPP with LangChain.
Usage
import os

from langchain_llamacpp_chat_model import LlamaCppChatModel
from llama_cpp.server.app import LlamaProxy
from llama_cpp.server.settings import ModelSettings

model_path = os.path.join(
    os.path.expanduser("~/.cache/lm-studio/models"),
    "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
)

settings = ModelSettings(
    model=model_path,
    model_alias="llama3",
    n_gpu_layers=-1,  # Offload all layers to the GPU
    n_ctx=1024,
    n_batch=512,  # Should be between 1 and n_ctx; consider the amount of RAM
    offload_kqv=True,  # Equivalent of f16_kv=True
    chat_format="chatml-function-calling",
    verbose=False,
)

llama_proxy = LlamaProxy(models=[settings])
chat_model = LlamaCppChatModel(llama_proxy=llama_proxy, model_name="llama3")

# invoke() returns a single message; stream() yields message chunks.
response = chat_model.invoke("Tell me a joke")
print(response.content)

for chunk in chat_model.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)
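Because LlamaCppChatModel exposes LangChain's standard chat-model interface, it should also compose with LCEL primitives such as prompt templates and output parsers. A minimal sketch, assuming the chat_model built above (the prompt wording is illustrative, not part of this project):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Build a simple prompt -> model -> parser chain with LCEL's pipe operator.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise assistant."),
    ("human", "{question}"),
])
chain = prompt | chat_model | StrOutputParser()

print(chain.invoke({"question": "Tell me a joke"}))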
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution

Hashes for langchain_llamacpp_chat_model-0.2.0.tar.gz

Algorithm | Hash digest
---|---
SHA256 | 23c0fc7f028e9ab7d014d559a7d9216125b50ef972f86efcc9b8207316e09abe
MD5 | c0e4d4bd52f85a2730ac955f95ed10c7
BLAKE2b-256 | 3c31477cd3f10891153d7629c70d07ac53bf4cb5cfd0119b9cdbe8a7b529055f

Built Distribution

Hashes for langchain_llamacpp_chat_model-0.2.0-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | fcc062a9d230635faa10394d66974bb3bb5974dd28403e995b2ece7a0eb906a1
MD5 | 3dc7d2b9929d99cc462501c6ba952916
BLAKE2b-256 | 9581cdfddacfffea4c5ecae385a8433783adc2309f2b803152420ba2d89f9102
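To check a download's integrity against the digests above, a standard hashlib comparison is enough. A minimal sketch, assuming the wheel was saved to the current directory:

import hashlib

# Expected SHA256 digest from the table above.
expected = "fcc062a9d230635faa10394d66974bb3bb5974dd28403e995b2ece7a0eb906a1"

with open("langchain_llamacpp_chat_model-0.2.0-py3-none-any.whl", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == expected, "hash mismatch"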