# llama-cpp-cffi

Python binding for llama.cpp using cffi and ctypes. Supports CPU and CUDA 12.5 execution.
## Install

Basic library install:

```bash
pip install llama-cpp-cffi
```

For the OpenAI © compatible Chat Completions API, install the `openai` extra:

```bash
pip install llama-cpp-cffi[openai]
```
## Example

### Library Usage

`examples/demo_0.py`:
```python
from llama.llama_cli_cffi_cpu import llama_generate, Model, Options
# from llama.llama_cli_cffi_cuda_12_5 import llama_generate, Model, Options
# from llama.llama_cli_ctypes_cuda import llama_generate, Model, Options
# from llama.llama_cli_ctypes_cuda_12_5 import llama_generate, Model, Options
from llama.formatter import get_config

model = Model(
    creator_hf_repo='TinyLlama/TinyLlama-1.1B-Chat-v1.0',
    hf_repo='TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF',
    hf_file='tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf',
)

config = get_config(model.creator_hf_repo)

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Evaluate 1 + 2 in Python.'},
]

options = Options(
    ctx_size=config.max_position_embeddings,
    predict=-2,
    model=model,
    prompt=messages,
)

for chunk in llama_generate(options):
    print(chunk, flush=True, end='')

# newline
print()
```
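`llama_generate` streams the completion as text chunks. When the full reply is also needed as a single string, the chunks can be collected as they arrive; a minimal sketch, using a stand-in generator so it runs without a model:

```python
def generate_stub(chunks):
    # Stand-in for llama_generate(options); both yield text chunks.
    yield from chunks

pieces = []

for chunk in generate_stub(['1 + 2 ', 'evaluates ', 'to 3']):
    print(chunk, flush=True, end='')  # stream to the terminal as before
    pieces.append(chunk)              # and keep the chunk for later use

full_reply = ''.join(pieces)
print()
print(full_reply)  # → 1 + 2 evaluates to 3
```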
### OpenAI © compatible Chat Completions (TBD)

`examples/demo_1.py`
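Since `demo_1.py` is still marked TBD, here is a hedged sketch of what a client request against an OpenAI-compatible endpoint could look like, using only the standard library. The URL and port are assumptions for illustration, not the package's documented server address:

```python
import json
import urllib.request

# Hypothetical endpoint; adjust to wherever the compatible server actually listens.
API_URL = 'http://localhost:8000/v1/chat/completions'

payload = {
    'model': 'TinyLlama/TinyLlama-1.1B-Chat-v1.0',
    'messages': [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Evaluate 1 + 2 in Python.'},
    ],
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={'Content-Type': 'application/json'},
)

# With a server running, the reply follows the OpenAI response shape:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply['choices'][0]['message']['content'])
print(req.get_method(), req.full_url)  # → POST http://localhost:8000/v1/chat/completions
```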
## Demos

```bash
#
# run demos
#
python -B examples/demo_cffi_cpu.py
python -B examples/demo_cffi_cuda_12_5.py
python -B examples/demo_ctypes_cpu.py
python -B examples/demo_ctypes_cuda_12_5.py
```
## Project details

### Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

#### Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

#### Built Distributions
Hashes for `llama_cpp_cffi-0.1.2-cp312-cp312-musllinux_1_2_x86_64.whl`:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 87ff852be530251aec63dce30bcdfc54b1b7ef2f592b16b32dc5775fc06ddb89 |
| MD5 | 124f27ae75b821a31144219b98cdaeff |
| BLAKE2b-256 | 9113efd2a301efd7e80eae330f548461c81c33a300ed2c2cc2b5effa58167292 |
Hashes for `llama_cpp_cffi-0.1.2-cp312-cp312-manylinux_2_28_x86_64.whl`:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 93e19871bcc2105cebcd3d14d6bfd7ee0e4072006351988ebcf0b42a31fd2225 |
| MD5 | 254cc057049975857a11b6ba3027952b |
| BLAKE2b-256 | 0f50ecf56c6bf9ebf2c618119d9701ac20be854a1cda1aac33839002b3e47e1d |
Hashes for `llama_cpp_cffi-0.1.2-cp310-cp310-manylinux_2_28_aarch64.whl`:

| Algorithm | Hash digest |
|---|---|
| SHA256 | a332cfa2dad5164912f6167201c5186f06ab19ff7a1e80711768875c7fa27b10 |
| MD5 | 4884cbb12dc343430d40bf4b19a26c9b |
| BLAKE2b-256 | 021d32395a5543db39c7ba58db341688a2b1a522c58b597e8c081ebf51d82e28 |
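The published digests can be checked after downloading a wheel; a minimal verification sketch with Python's `hashlib` (the file path below is illustrative):

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large wheels never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk_size), b''):
            digest.update(block)
    return digest.hexdigest()

# Compare against the digest published above, e.g. for the musllinux wheel:
# expected = '87ff852be530251aec63dce30bcdfc54b1b7ef2f592b16b32dc5775fc06ddb89'
# if sha256_of_file('llama_cpp_cffi-0.1.2-cp312-cp312-musllinux_1_2_x86_64.whl') != expected:
#     raise SystemExit('hash mismatch: refuse to install this wheel')
```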