LLM plugin providing access to llama.cpp server models

llm-llamacpp

A plugin for LLM providing access to models running on a llama.cpp server.

Installation

Install this plugin in the same environment as LLM:

llm install llm-llamacpp-plugin

Setup

First, you need to have a llama.cpp server running. You can start one using the llama.cpp server binary:

# Download and build llama.cpp if you haven't already
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

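# Build the server (llama.cpp builds with CMake; binaries are written to build/bin)
cmake -B build
cmake --build build --config Release

# Start the server with your model
./build/bin/llama-server -m models/your-model.gguf -c 4096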

The server will start on http://localhost:8080 by default.
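
Before sending prompts, it can help to confirm that the server is actually reachable. The sketch below polls the /health endpoint that the llama.cpp server exposes, using only the Python standard library. The function name server_is_up and the timeout value are illustrative choices, not part of the plugin.

```python
import urllib.error
import urllib.request


def server_is_up(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an error status
        # (for example 503 while the model is still loading).
        return False
```

Calling server_is_up() with no arguments checks the default address; pass a different base URL if the server runs elsewhere.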

Usage

Once the plugin is installed and your llama.cpp server is running, you can use it like any other LLM model:

llm -m llamacpp "Your prompt here"

Using a different server URL

If your llama.cpp server is running on a different host or port, you can set the LLM_LLAMACPP_SERVER environment variable:

export LLM_LLAMACPP_SERVER=http://your-server:port
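
As an illustration of the lookup order described here (environment variable first, then the default), the following sketch resolves a server URL. It is an assumption about how such a lookup could be written, not the plugin's actual code; resolve_server_url and DEFAULT_SERVER are hypothetical names.

```python
import os
from typing import Mapping

DEFAULT_SERVER = "http://localhost:8080"  # default stated in the Setup section


def resolve_server_url(env: Mapping[str, str] = os.environ) -> str:
    """Pick the server URL from LLM_LLAMACPP_SERVER, else fall back to the default."""
    value = env.get("LLM_LLAMACPP_SERVER", "").strip()
    # Drop a trailing slash so endpoint paths can be appended safely.
    return (value or DEFAULT_SERVER).rstrip("/")
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dict.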

Conversations

You can use conversations just like with other models:

llm -m llamacpp "First message"
llm -c "Follow-up question"

Options

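The plugin supports several generation options. LLM passes model options as -o NAME VALUE pairs rather than as dedicated flags. The option names below are assumed to use underscores; run llm models --options to see the exact names the plugin registers:

# Set temperature
llm -m llamacpp "Your prompt" -o temperature 0.9

# Limit max tokens
llm -m llamacpp "Your prompt" -o max_tokens 500

# Set top-p sampling
llm -m llamacpp "Your prompt" -o top_p 0.9

# Use a specific seed for reproducible results
llm -m llamacpp "Your prompt" -o seed 42

# Adjust repeat penalty
llm -m llamacpp "Your prompt" -o repeat_penalty 1.2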
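
For readers curious what these options correspond to on the llama.cpp side, the server's native /completion endpoint accepts similarly named sampling fields. The sketch below builds such a request body. Whether the plugin calls this endpoint or the OpenAI-compatible one is not stated in this description, so the mapping is an assumption, and build_completion_payload is a hypothetical helper.

```python
import json
from typing import Any, Optional


def build_completion_payload(
    prompt: str,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    seed: Optional[int] = None,
    repeat_penalty: Optional[float] = None,
) -> dict[str, Any]:
    """Build a JSON body for llama.cpp's native /completion endpoint."""
    candidates = {
        "temperature": temperature,
        "n_predict": max_tokens,  # llama.cpp's name for the token limit
        "top_p": top_p,
        "seed": seed,
        "repeat_penalty": repeat_penalty,
    }
    payload: dict[str, Any] = {"prompt": prompt}
    # Leave unset options out so the server applies its own defaults.
    payload.update({k: v for k, v in candidates.items() if v is not None})
    return payload


print(json.dumps(build_completion_payload("Your prompt", temperature=0.9, max_tokens=500)))
```

Omitting unset values, rather than sending nulls, lets the server's own defaults apply.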

JSON Schema

You can request structured JSON output using LLM's schema feature. The --schema option accepts a JSON Schema or LLM's concise schema syntax, which is used here:

llm -m llamacpp "Generate a person" --schema 'name, age int'
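
For reference, a full JSON Schema describing an object with a string name and an integer age can be written out and serialized as follows. This is a minimal sketch; the variable names are illustrative, and marking both fields as required is a choice made for the example.

```python
import json

person_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Compact single-line JSON, convenient to paste after --schema or save to a file.
schema_json = json.dumps(person_schema, separators=(",", ":"))
print(schema_json)
```

The printed string can be passed as the --schema value in place of a shorter form.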

Vision Models

If you're running a vision-capable llama.cpp model with multimodal support, the plugin can handle image attachments passed with LLM's -a option.

Embedding Models

The plugin also supports embedding models running on a llama.cpp server. To use embeddings:

# Start the server with embedding support
./build/bin/llama-server -m models/embedding-model.gguf --embedding

Then use it with LLM:

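# Get embeddings for text (llm embed takes the text via -c)
llm embed -m llamacpp-embed -c "Hello world"

# Get embeddings for multiple items: llm embed takes one -c value per call,
# so run it once per text
llm embed -m llamacpp-embed -c "First text"
llm embed -m llamacpp-embed -c "Second text"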
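
A common next step with embeddings is comparing two texts. llm embed prints each vector as a JSON array of floats, so two saved outputs can be compared with cosine similarity. The sketch below uses short stand-in vectors rather than real model output, and cosine_similarity is a helper written for this example.

```python
import json
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-ins for two saved outputs of `llm embed`; real vectors are much longer.
first = json.loads("[0.1, 0.3, 0.5]")
second = json.loads("[0.2, 0.1, 0.4]")
print(round(cosine_similarity(first, second), 4))
```

Values close to 1 indicate similar direction; values near 0 indicate unrelated vectors.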

Development

To install the plugin for development:

cd llm-llamacpp-plugin
pip install -e .

Run the tests:

pytest

License

Apache 2.0

Release history

0.1.2 (May 10, 2026)
0.1.1 (May 9, 2026)
0.1 (Apr 5, 2026), this version

Download files

Source Distribution

llm_llamacpp_plugin-0.1.tar.gz (9.1 kB)

Uploaded Apr 5, 2026 Source

Built Distribution

llm_llamacpp_plugin-0.1-py3-none-any.whl (7.0 kB)

Uploaded Apr 5, 2026 Python 3

File details

Details for the file llm_llamacpp_plugin-0.1.tar.gz.

File metadata

Download URL: llm_llamacpp_plugin-0.1.tar.gz
Upload date: Apr 5, 2026
Size: 9.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_llamacpp_plugin-0.1.tar.gz
Algorithm	Hash digest
SHA256	`70e5ff23a921ecf6c3a9dd06a1ab277f1ff36a54f11f03cb6923a6796ebb1e2b`
MD5	`994e8a913087635513bf67ce3aaf0623`
BLAKE2b-256	`5cd78d4a75132be5801d25c98880f32dca624cba6c5b554ebeed31438119e22b`

Provenance

The following attestation bundles were made for llm_llamacpp_plugin-0.1.tar.gz:

Publisher: publish.yml on sukhbinder/llm-llamacpp-plugin

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_llamacpp_plugin-0.1.tar.gz
- Subject digest: 70e5ff23a921ecf6c3a9dd06a1ab277f1ff36a54f11f03cb6923a6796ebb1e2b
- Sigstore transparency entry: 1239357724
- Sigstore integration time: Apr 5, 2026
Source repository:
- Permalink: sukhbinder/llm-llamacpp-plugin@777100bfb200f953fce9d28bfca6abd1dd781e85
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sukhbinder
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@777100bfb200f953fce9d28bfca6abd1dd781e85
- Trigger Event: release

File details

Details for the file llm_llamacpp_plugin-0.1-py3-none-any.whl.

File metadata

Download URL: llm_llamacpp_plugin-0.1-py3-none-any.whl
Upload date: Apr 5, 2026
Size: 7.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_llamacpp_plugin-0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`21cc7c0630d73dfaac304da3ba8ae03aec5cf74abb157d0a7fe625b25262685a`
MD5	`ef930669ca402ee1948f284c9353a7d8`
BLAKE2b-256	`a993dd56546637c70eef0332077f641520e12fd97af13d89b6a244804bf03da4`

Provenance

The following attestation bundles were made for llm_llamacpp_plugin-0.1-py3-none-any.whl:

Publisher: publish.yml on sukhbinder/llm-llamacpp-plugin

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_llamacpp_plugin-0.1-py3-none-any.whl
- Subject digest: 21cc7c0630d73dfaac304da3ba8ae03aec5cf74abb157d0a7fe625b25262685a
- Sigstore transparency entry: 1239357729
- Sigstore integration time: Apr 5, 2026
Source repository:
- Permalink: sukhbinder/llm-llamacpp-plugin@777100bfb200f953fce9d28bfca6abd1dd781e85
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sukhbinder
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@777100bfb200f953fce9d28bfca6abd1dd781e85
- Trigger Event: release
