
llama-cpp-cffi


Python binding for llama.cpp using cffi and ctypes. Supports CPU and CUDA 12.5 execution.

Install

pip install llama-cpp-cffi
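
To check that the package imports cleanly, a quick smoke test (using the CPU backend module from the example below) should exit silently:

python -c "from llama.llama_cli_cffi_cpu import llama_generate"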

Example

from llama.llama_cli_cffi_cpu import llama_generate, Model, Options
# from llama.llama_cli_cffi_cuda_12_5 import llama_generate, Model, Options
# from llama.llama_cli_ctypes_cpu import llama_generate, Model, Options
# from llama.llama_cli_ctypes_cuda_12_5 import llama_generate, Model, Options

from llama.formatter import get_config

model = Model(
    'TinyLlama/TinyLlama-1.1B-Chat-v1.0',      # creator_hf_repo: the original model's HF repo
    'TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF',  # HF repo hosting the GGUF conversion
    'tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf',    # quantized GGUF file to use
)

config = get_config(model.creator_hf_repo)  # read the HF config (e.g. context length) from the creator's repo

messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Evaluate 1 + 2 in Python.'},
]

options = Options(
    ctx_size=config.max_position_embeddings,  # context size taken from the model's HF config
    predict=-2,                               # llama.cpp convention: -2 generates until the context is filled
    model=model,
    prompt=messages,                          # chat-style messages passed as the prompt
)

# stream the generated text chunk by chunk
for chunk in llama_generate(options):
    print(chunk, flush=True, end='')

# newline
print()
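
The loop above streams text as it is generated. When the full completion is needed as a single string, the chunks can simply be joined; this sketch reuses the options object from above and assumes llama_generate yields plain str chunks, as the streaming loop suggests:

response = ''.join(llama_generate(options))
print(response)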

Demos

#
# run demos
#
python -B examples/demo_cffi_cpu.py
python -B examples/demo_cffi_cuda_12_5.py

python -B examples/demo_ctypes_cpu.py
python -B examples/demo_ctypes_cuda_12_5.py

# serve the browser (Pyodide) demo on port 5000:
# python -m http.server -d examples/demo_pyonide -b "0.0.0.0" 5000
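
Because the CPU and CUDA variants expose the same interface, a program can prefer the CUDA 12.5 backend and fall back to the CPU build when it is unavailable. A minimal sketch, assuming only the module names used in the example above:

try:
    # prefer the CUDA 12.5 backend when its wheel is installed and usable
    from llama.llama_cli_cffi_cuda_12_5 import llama_generate, Model, Options
except ImportError:
    # fall back to the portable CPU backend
    from llama.llama_cli_cffi_cpu import llama_generate, Model, Options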



Download files


Source Distributions

No source distribution files are available for this release; only built wheels are published.

Built Distributions

llama_cpp_cffi-0.1.1-cp312-cp312-musllinux_1_2_x86_64.whl    (2.6 MB)   CPython 3.12  musllinux: musl 1.2+   x86-64
llama_cpp_cffi-0.1.1-cp312-cp312-musllinux_1_2_aarch64.whl   (2.3 MB)   CPython 3.12  musllinux: musl 1.2+   ARM64
llama_cpp_cffi-0.1.1-cp312-cp312-manylinux_2_17_x86_64.whl   (34.5 MB)  CPython 3.12  manylinux: glibc 2.17+ x86-64
llama_cpp_cffi-0.1.1-cp312-cp312-manylinux_2_17_aarch64.whl  (2.5 MB)   CPython 3.12  manylinux: glibc 2.17+ ARM64
llama_cpp_cffi-0.1.1-cp311-cp311-musllinux_1_2_x86_64.whl    (2.6 MB)   CPython 3.11  musllinux: musl 1.2+   x86-64
llama_cpp_cffi-0.1.1-cp311-cp311-musllinux_1_2_aarch64.whl   (2.3 MB)   CPython 3.11  musllinux: musl 1.2+   ARM64
llama_cpp_cffi-0.1.1-cp311-cp311-manylinux_2_17_x86_64.whl   (34.5 MB)  CPython 3.11  manylinux: glibc 2.17+ x86-64
llama_cpp_cffi-0.1.1-cp311-cp311-manylinux_2_17_aarch64.whl  (2.5 MB)   CPython 3.11  manylinux: glibc 2.17+ ARM64
llama_cpp_cffi-0.1.1-cp310-cp310-musllinux_1_2_x86_64.whl    (1.3 MB)   CPython 3.10  musllinux: musl 1.2+   x86-64
llama_cpp_cffi-0.1.1-cp310-cp310-musllinux_1_2_aarch64.whl   (2.3 MB)   CPython 3.10  musllinux: musl 1.2+   ARM64
llama_cpp_cffi-0.1.1-cp310-cp310-manylinux_2_17_x86_64.whl   (34.5 MB)  CPython 3.10  manylinux: glibc 2.17+ x86-64
llama_cpp_cffi-0.1.1-cp310-cp310-manylinux_2_17_aarch64.whl  (2.5 MB)   CPython 3.10  manylinux: glibc 2.17+ ARM64
