
Python bindings for the llama.cpp library

🦙 Python Bindings for llama.cpp

Simple Python bindings for @ggerganov's llama.cpp library. This package provides:

  • Low-level access to the llama.cpp C API via ctypes
  • A high-level Python API for text and chat completion
  • An OpenAI compatible web server

Documentation is available at https://llama-cpp-python.readthedocs.io/en/latest.

Installation

Requirements:

  • Python 3.8+
  • C compiler
    • Linux: gcc or clang
    • Windows: Visual Studio or MinGW
    • MacOS: Xcode

To install the package, run:

pip install llama-cpp-python

This will also build llama.cpp from source and install it alongside this python package.

If this fails, add --verbose to the pip install command to see the full cmake build log.

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with basic CPU support.

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

Installation Configuration

llama.cpp supports a number of hardware acceleration backends to speed up inference as well as backend specific options. See the llama.cpp README for a full list.

All llama.cpp cmake build options can be set via the CMAKE_ARGS environment variable or via the --config-settings / -C cli flag during installation.

Environment Variables
# Linux and Mac
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" \
  pip install llama-cpp-python
# Windows (PowerShell)
$env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"
pip install llama-cpp-python
CLI / requirements.txt

They can also be set via the pip install -C / --config-settings flag and saved to a requirements.txt file:

pip install --upgrade pip # ensure pip is up to date
pip install llama-cpp-python \
  -C cmake.args="-DLLAMA_BLAS=ON;-DLLAMA_BLAS_VENDOR=OpenBLAS"
# requirements.txt

llama-cpp-python -C cmake.args="-DLLAMA_BLAS=ON;-DLLAMA_BLAS_VENDOR=OpenBLAS"

Supported Backends

Below are some common backends, their build commands and any additional environment variables required.

OpenBLAS (CPU)

To install with OpenBLAS, set the LLAMA_BLAS and LLAMA_BLAS_VENDOR CMake options via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
CUDA

To install with CUDA support, set the LLAMA_CUDA=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with CUDA support, as long as your system meets these requirements:

  • CUDA Version is 12.1, 12.2 or 12.3
  • Python Version is 3.10, 3.11 or 3.12
pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/<cuda-version>

Where <cuda-version> is one of the following:

  • cu121: CUDA 12.1
  • cu122: CUDA 12.2
  • cu123: CUDA 12.3

For example, to install the CUDA 12.1 wheel:

pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu121
Metal

To install with Metal (MPS), set the LLAMA_METAL=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python

Pre-built Wheel (New)

It is also possible to install a pre-built wheel with Metal support, as long as your system meets these requirements:

  • MacOS Version is 11.0 or later
  • Python Version is 3.10, 3.11 or 3.12
pip install llama-cpp-python \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal
CLBlast (OpenCL)

To install with CLBlast, set the LLAMA_CLBLAST=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
hipBLAS (ROCm)

To install with hipBLAS / ROCm support for AMD cards, set the LLAMA_HIPBLAS=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
Vulkan

To install with Vulkan support, set the LLAMA_VULKAN=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_VULKAN=on" pip install llama-cpp-python
Kompute

To install with Kompute support, set the LLAMA_KOMPUTE=on CMake option via CMAKE_ARGS before installing:

CMAKE_ARGS="-DLLAMA_KOMPUTE=on" pip install llama-cpp-python
SYCL

To install with SYCL support, source the oneAPI environment and set the LLAMA_SYCL=on CMake option (along with the Intel icx/icpx compilers) via CMAKE_ARGS before installing:

source /opt/intel/oneapi/setvars.sh   
CMAKE_ARGS="-DLLAMA_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python

Windows Notes

Error: Can't find 'nmake' or 'CMAKE_C_COMPILER'

If you run into issues where it complains it can't find 'nmake' or CMAKE_C_COMPILER, you can extract w64devkit as mentioned in the llama.cpp repo and add the compiler paths manually to CMAKE_ARGS before running pip install:

$env:CMAKE_GENERATOR = "MinGW Makefiles"
$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on -DCMAKE_C_COMPILER=C:/w64devkit/bin/gcc.exe -DCMAKE_CXX_COMPILER=C:/w64devkit/bin/g++.exe"

See the above instructions and set CMAKE_ARGS to the BLAS backend you want to use.

MacOS Notes

Detailed MacOS Metal GPU install documentation is available at docs/install/macos.md

M1 Mac Performance Issue

Note: If you are using an Apple Silicon (M1) Mac, make sure you have installed a version of Python that supports the arm64 architecture. For example:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh

Otherwise, the install will build the x86 version of llama.cpp, which will be 10x slower on an Apple Silicon (M1) Mac.

M Series Mac Error: `(mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64'))`

Try installing with:

CMAKE_ARGS="-DCMAKE_OSX_ARCHITECTURES=arm64 -DCMAKE_APPLE_SILICON_PROCESSOR=arm64 -DLLAMA_METAL=on" pip install --upgrade --verbose --force-reinstall --no-cache-dir llama-cpp-python

Upgrading and Reinstalling

To upgrade and rebuild llama-cpp-python add --upgrade --force-reinstall --no-cache-dir flags to the pip install command to ensure the package is rebuilt from source.
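
For example, to force a full rebuild from source:

pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python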

High-level API

API Reference

The high-level API provides a simple managed interface through the Llama class.

Below is a short example demonstrating how to use the high-level API for basic text completion:

>>> from llama_cpp import Llama
>>> llm = Llama(
      model_path="./models/7B/llama-model.gguf",
      # n_gpu_layers=-1, # Uncomment to use GPU acceleration
      # seed=1337, # Uncomment to set a specific seed
      # n_ctx=2048, # Uncomment to increase the context window
)
>>> output = llm(
      "Q: Name the planets in the solar system? A: ", # Prompt
      max_tokens=32, # Generate up to 32 tokens, set to None to generate up to the end of the context window
      stop=["Q:", "\n"], # Stop generating just before the model would generate a new question
      echo=True # Echo the prompt back in the output
) # Generate a completion, can also call create_completion
>>> print(output)
{
  "id": "cmpl-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "object": "text_completion",
  "created": 1679561337,
  "model": "./models/7B/llama-model.gguf",
  "choices": [
    {
      "text": "Q: Name the planets in the solar system? A: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune and Pluto.",
      "index": 0,
      "logprobs": None,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 14,
    "completion_tokens": 28,
    "total_tokens": 42
  }
}

Text completion is available through the __call__ and create_completion methods of the Llama class.
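
As a minimal sketch (model path reused from the example above), completions can also be streamed as they are generated by passing stream=True, which makes create_completion yield partial chunks instead of returning a single dict:

from llama_cpp import Llama

llm = Llama(model_path="./models/7B/llama-model.gguf")

for chunk in llm.create_completion(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    stream=True, # yield partial chunks instead of a single response
):
    print(chunk["choices"][0]["text"], end="", flush=True)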

Pulling models from Hugging Face Hub

You can download Llama models in gguf format directly from Hugging Face using the from_pretrained method. You'll need to install the huggingface-hub package to use this feature (pip install huggingface-hub).

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen1.5-0.5B-Chat-GGUF",
    filename="*q8_0.gguf",
    verbose=False
)

By default, from_pretrained will download the model to the Hugging Face cache directory; you can then manage installed model files with the huggingface-cli tool.
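
If you have huggingface-hub installed, its CLI can inspect or prune that cache. A minimal sketch (subcommand names assume a recent huggingface-hub release):

huggingface-cli scan-cache   # list cached repos and their sizes
huggingface-cli delete-cache # interactively remove cached files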

Chat Completion

The high-level API also provides a simple interface for chat completion.

Chat completion requires that the model knows how to format the messages into a single prompt. The Llama class does this using pre-registered chat formats (e.g. chatml, llama-2, gemma, etc.) or by providing a custom chat handler object.

The model will format the messages into a single prompt using the following order of precedence:

  • Use the chat_handler if provided
  • Use the chat_format if provided
  • Use the tokenizer.chat_template from the gguf model's metadata (should work for most new models, older models may not have this)
  • Else, fall back to the llama-2 chat format

Set verbose=True to see the selected chat format.

>>> from llama_cpp import Llama
>>> llm = Llama(
      model_path="path/to/llama-2/llama-model.gguf",
      chat_format="llama-2"
)
>>> llm.create_chat_completion(
      messages = [
          {"role": "system", "content": "You are an assistant who perfectly describes images."},
          {
              "role": "user",
              "content": "Describe this image in detail please."
          }
      ]
)

Chat completion is available through the create_chat_completion method of the Llama class.

For OpenAI API v1 compatibility, use the create_chat_completion_openai_v1 method, which will return pydantic models instead of dicts.
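
A hedged sketch of the pydantic-style access, reusing the llm instance from the example above:

response = llm.create_chat_completion_openai_v1(
    messages=[{"role": "user", "content": "Describe this image in detail please."}]
)
# fields are attributes on pydantic models rather than dict keys
print(response.choices[0].message.content)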

JSON and JSON Schema Mode

To constrain chat responses to only valid JSON or a specific JSON Schema use the response_format argument in create_chat_completion.

JSON Mode

The following example will constrain the response to valid JSON strings only.

>>> from llama_cpp import Llama
>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
>>> llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that outputs in JSON.",
        },
        {"role": "user", "content": "Who won the world series in 2020"},
    ],
    response_format={
        "type": "json_object",
    },
    temperature=0.7,
)

JSON Schema Mode

To constrain the response further to a specific JSON Schema add the schema to the schema property of the response_format argument.

>>> from llama_cpp import Llama
>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
>>> llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that outputs in JSON.",
        },
        {"role": "user", "content": "Who won the world series in 2020"},
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {"team_name": {"type": "string"}},
            "required": ["team_name"],
        },
    },
    temperature=0.7,
)

Function Calling

The high-level API supports OpenAI compatible function and tool calling. This is possible through the functionary pre-trained models chat format or through the generic chatml-function-calling chat format.

>>> from llama_cpp import Llama
>>> llm = Llama(model_path="path/to/chatml/llama-model.gguf", chat_format="chatml-function-calling")
>>> llm.create_chat_completion(
      messages = [
        {
          "role": "system",
          "content": "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. The assistant calls functions with appropriate input when necessary"

        },
        {
          "role": "user",
          "content": "Extract Jason is 25 years old"
        }
      ],
      tools=[{
        "type": "function",
        "function": {
          "name": "UserDetail",
          "parameters": {
            "type": "object",
            "title": "UserDetail",
            "properties": {
              "name": {
                "title": "Name",
                "type": "string"
              },
              "age": {
                "title": "Age",
                "type": "integer"
              }
            },
            "required": [ "name", "age" ]
          }
        }
      }],
      tool_choice={
        "type": "function",
        "function": {
          "name": "UserDetail"
        }
      }
)
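
The tool call, if the model produces one, comes back in the OpenAI-compatible response shape. A hedged sketch of reading it, assuming the result of the call above is assigned to a variable named response:

import json

message = response["choices"][0]["message"]
for tool_call in message.get("tool_calls", []):
    # function arguments are returned as a JSON-encoded string
    args = json.loads(tool_call["function"]["arguments"])
    print(tool_call["function"]["name"], args)
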
Functionary v2

The various gguf-converted files for this set of models can be found here. Functionary is able to intelligently call functions and also analyze any provided function outputs to generate coherent responses. All functionary v2 models support parallel function calling. You can provide either functionary-v1 or functionary-v2 for the chat_format when initializing the Llama class.

Due to discrepancies between llama.cpp and HuggingFace's tokenizers, you must provide the HF tokenizer for functionary. The LlamaHFTokenizer class can be initialized and passed into the Llama class; this will override the default llama.cpp tokenizer. The tokenizer files are already included in the respective HF repositories hosting the gguf files.

>>> from llama_cpp import Llama
>>> from llama_cpp.llama_tokenizer import LlamaHFTokenizer
>>> llm = Llama.from_pretrained(
  repo_id="meetkai/functionary-small-v2.2-GGUF",
  filename="functionary-small-v2.2.q4_0.gguf",
  chat_format="functionary-v2",
  tokenizer=LlamaHFTokenizer.from_pretrained("meetkai/functionary-small-v2.2-GGUF")
)

Multi-modal Models

llama-cpp-python supports the llava1.5 family of multi-modal models which allow the language model to read information from both text and images.

You'll first need to download one of the available multi-modal models in GGUF format.

Then you'll need to use a custom chat handler to load the clip model and process the chat messages and images.

>>> from llama_cpp import Llama
>>> from llama_cpp.llama_chat_format import Llava15ChatHandler
>>> chat_handler = Llava15ChatHandler(clip_model_path="path/to/llava/mmproj.bin")
>>> llm = Llama(
  model_path="./path/to/llava/llama-model.gguf",
  chat_handler=chat_handler,
  n_ctx=2048, # n_ctx should be increased to accommodate the image embedding
  logits_all=True, # needed to make llava work
)
>>> llm.create_chat_completion(
    messages = [
        {"role": "system", "content": "You are an assistant who perfectly describes images."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://.../image.png"}},
                {"type" : "text", "text": "Describe this image in detail please."}
            ]
        }
    ]
)
Loading a Local Image

Images can be passed as base64 encoded data URIs. The following example demonstrates how to do this.

import base64

def image_to_base64_data_uri(file_path):
    with open(file_path, "rb") as img_file:
        base64_data = base64.b64encode(img_file.read()).decode('utf-8')
        return f"data:image/png;base64,{base64_data}"

# Replace 'file_path.png' with the actual path to your PNG file
file_path = 'file_path.png'
data_uri = image_to_base64_data_uri(file_path)

messages = [
    {"role": "system", "content": "You are an assistant who perfectly describes images."},
    {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": data_uri }},
            {"type" : "text", "text": "Describe this image in detail please."}
        ]
    }
]
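
The resulting messages list is then passed to create_chat_completion exactly as in the URL-based example above:

# assumes llm was constructed with the Llava15ChatHandler as shown earlier
response = llm.create_chat_completion(messages=messages)
print(response["choices"][0]["message"]["content"])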

Speculative Decoding

llama-cpp-python supports speculative decoding which allows the model to generate completions based on a draft model.

The fastest way to use speculative decoding is through the LlamaPromptLookupDecoding class.

Just pass this as a draft model to the Llama class during initialization.

from llama_cpp import Llama
from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

llama = Llama(
    model_path="path/to/model.gguf",
    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10) # number of tokens to predict: 10 is the default and generally good for GPU, 2 performs better for CPU-only machines
)

Embeddings

To generate text embeddings use create_embedding.

import llama_cpp

llm = llama_cpp.Llama(model_path="path/to/model.gguf", embedding=True)

embeddings = llm.create_embedding("Hello, world!")

# or create multiple embeddings at once

embeddings = llm.create_embedding(["Hello, world!", "Goodbye, world!"])
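
The return value follows the OpenAI embeddings response shape; a minimal sketch of extracting the raw vectors:

# each entry in "data" holds one embedding vector
vectors = [item["embedding"] for item in embeddings["data"]]
print(len(vectors), len(vectors[0]))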

Adjusting the Context Window

The context window of the Llama models determines the maximum number of tokens that can be processed at once. By default, this is set to 512 tokens, but can be adjusted based on your requirements.

For instance, if you want to work with larger contexts, you can expand the context window by setting the n_ctx parameter when initializing the Llama object:

llm = Llama(model_path="./models/7B/llama-model.gguf", n_ctx=2048)

OpenAI Compatible Web Server

llama-cpp-python offers a web server which aims to act as a drop-in replacement for the OpenAI API. This allows you to use llama.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).

To install the server package and get started:

pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf
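
Once the server is running, any OpenAI-compatible client can talk to it. A hedged sketch using the official openai Python package (assumes openai >= 1.0 is installed; the API key is a placeholder, since the local server does not require one by default):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-no-key-required")
response = client.chat.completions.create(
    model="llama", # placeholder; the server answers for the model it was started with
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)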

Similar to the hardware acceleration section above, you can also install with GPU (CUDA) support like this:

CMAKE_ARGS="-DLLAMA_CUDA=on" FORCE_CMAKE=1 pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35

Navigate to http://localhost:8000/docs to see the OpenAPI documentation.

To bind to 0.0.0.0 to enable remote connections, use python3 -m llama_cpp.server --host 0.0.0.0. Similarly, to change the port (default is 8000), use --port.
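
For example, to listen on all interfaces on port 8080:

python3 -m llama_cpp.server --model models/7B/llama-model.gguf --host 0.0.0.0 --port 8080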

You probably also want to set the prompt format. For chatml, use

python3 -m llama_cpp.server --model models/7B/llama-model.gguf --chat_format chatml

That will format the prompt according to how the model expects it. You can find the prompt format in the model card. For possible options, see llama_cpp/llama_chat_format.py and look for lines starting with "@register_chat_format".

If you have huggingface-hub installed, you can also use the --hf_model_repo_id flag to load a model from the Hugging Face Hub.

python3 -m llama_cpp.server --hf_model_repo_id Qwen/Qwen1.5-0.5B-Chat-GGUF --model '*q8_0.gguf'

Web Server Features

Docker image

A Docker image is available on GHCR. To run the server:

docker run --rm -it -p 8000:8000 -v /path/to/models:/models -e MODEL=/models/llama-model.gguf ghcr.io/abetlen/llama-cpp-python:latest

Docker on termux (requires root) is currently the only known way to run this on phones; see the termux support issue.

Low-level API

API Reference

The low-level API is a direct ctypes binding to the C API provided by llama.cpp. The entire low-level API can be found in llama_cpp/llama_cpp.py and directly mirrors the C API in llama.h.

Below is a short example demonstrating how to use the low-level API to tokenize a prompt:

>>> import llama_cpp
>>> import ctypes
>>> llama_cpp.llama_backend_init(False) # Must be called once at the start of each program
>>> params = llama_cpp.llama_context_default_params()
# use bytes for char * params
>>> model = llama_cpp.llama_load_model_from_file(b"./models/7b/llama-model.gguf", params)
>>> ctx = llama_cpp.llama_new_context_with_model(model, params)
>>> max_tokens = params.n_ctx
# use ctypes arrays for array params
>>> tokens = (llama_cpp.llama_token * int(max_tokens))()
>>> n_tokens = llama_cpp.llama_tokenize(ctx, b"Q: Name the planets in the solar system? A: ", tokens, max_tokens, llama_cpp.c_bool(True))
>>> llama_cpp.llama_free(ctx)

Check out the examples folder for more examples of using the low-level API.

Documentation

Documentation is available via https://llama-cpp-python.readthedocs.io/. If you find any issues with the documentation, please open an issue or submit a PR.

Development

This package is under active development and I welcome any contributions.

To get started, clone the repository and install the package in editable / development mode:

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python

# Upgrade pip (required for editable mode)
pip install --upgrade pip

# Install with pip
pip install -e .

# if you want to use the fastapi / openapi server
pip install -e .[server]

# to install all optional dependencies
pip install -e .[all]

# to clear the local build cache
make clean

You can also test out specific commits of llama.cpp by checking out the desired commit in the vendor/llama.cpp submodule and then running make clean and pip install -e . again. Any changes in the llama.h API will require changes to the llama_cpp/llama_cpp.py file to match the new API (additional changes may be required elsewhere).
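
A sketch of that workflow (the commit hash is a placeholder):

cd vendor/llama.cpp
git checkout <commit-hash>
cd ../..
make clean
pip install -e .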

FAQ

Are there pre-built binaries / binary wheels available?

The recommended installation method is to install from source as described above. The reason for this is that llama.cpp is built with compiler optimizations that are specific to your system. Using pre-built binaries would require disabling these optimizations or supporting a large number of pre-built binaries for each platform.

That being said, there are some pre-built binaries available through the Releases page, as well as some community-provided wheels.

In the future, I would like to provide pre-built binaries and wheels for common platforms, and I'm happy to accept any useful contributions in this area. This is currently being tracked in #741.

How does this compare to other Python bindings of llama.cpp?

I originally wrote this package for my own use with two goals in mind:

  • Provide a simple process to install llama.cpp and access the full C API in llama.h from Python
  • Provide a high-level Python API that can be used as a drop-in replacement for the OpenAI API so existing apps can be easily ported to use llama.cpp

Any contributions and changes to this package will be made with these goals in mind.

License

This project is licensed under the terms of the MIT license.
