llama-cpp-bin
Pre-built llama.cpp server binaries packaged as a Python wheel, with a fallback that builds llama.cpp from source. Install the wheel for your platform and run the bundled server.
Install
Pre-built wheels (recommended)
# CPU
pip install --index-url https://vladlearns.github.io/llama-cpp-bin/whl/cpu llama-cpp-bin
# CUDA 12.4
pip install --index-url https://vladlearns.github.io/llama-cpp-bin/whl/cu124 llama-cpp-bin
# CUDA 13.1
pip install --index-url https://vladlearns.github.io/llama-cpp-bin/whl/cu131 llama-cpp-bin
# ROCm
pip install --index-url https://vladlearns.github.io/llama-cpp-bin/whl/rocm llama-cpp-bin
# Vulkan
pip install --index-url https://vladlearns.github.io/llama-cpp-bin/whl/vulkan llama-cpp-bin
PyPI (builds from source)
If no pre-built wheel matches your platform, pip falls back to building from the sdist on PyPI:
pip install llama-cpp-bin
You will need CMake, a C++ compiler, and the llama.cpp source submodule.
Dev
git clone --recurse-submodules https://github.com/vladlearns/llama-cpp-bin
cd llama-cpp-bin
# Build with CUDA enabled; other GGML backend flags can be passed the same way.
CMAKE_ARGS="-DGGML_CUDA=ON" pip install -v .
Run
CLI:
llama-cpp-server -m your-model.gguf --port 8080
Python:
from llama_cpp_bin import run_server
proc = run_server("your-model.gguf", port=8080)
proc.wait()
Or get the binary path and run it yourself:
import llama_cpp_bin
import subprocess
binary = llama_cpp_bin.get_binary_path()
subprocess.Popen([binary, "--model", "your-model.gguf"])
Project details
Download files
Download the file for your platform.
Source Distribution
File details
Details for the file llama_cpp_bin-9095.0.0.tar.gz.
File metadata
- Download URL: llama_cpp_bin-9095.0.0.tar.gz
- Upload date:
- Size: 4.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 39a66280eae69cb66b9eca5bc5da5f51a6756f6a41f6547cde9a52c501ff65bc |
| MD5 | e5efc6fd6432115fb23e1ec9ab0444c4 |
| BLAKE2b-256 | 85308a5adaa478e58c9aa9a02eaa8c5a7065a9b2d4ea8784b575bb50e65276a7 |
Provenance
The following attestation bundles were made for llama_cpp_bin-9095.0.0.tar.gz:

Publisher: build-everything.yml on vladlearns/llama-cpp-bin

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llama_cpp_bin-9095.0.0.tar.gz
- Subject digest: 39a66280eae69cb66b9eca5bc5da5f51a6756f6a41f6547cde9a52c501ff65bc
- Sigstore transparency entry: 1496008734

Source repository:
- Permalink: vladlearns/llama-cpp-bin@fcb1273b18387d0097954490f604bda420bea974
- Branch / Tag: refs/tags/v9095.0.0
- Owner: https://github.com/vladlearns
- Access: public

Publication:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build-everything.yml@fcb1273b18387d0097954490f604bda420bea974
- Trigger Event: push