Skip to main content

Client for LlamaCpp HTTP Server

Project description

llamacpp_client

llamacpp_client is a Python client library designed to simplify communication with the LlamaCpp HTTP server (as provided by the official Docker image). It provides both synchronous and asynchronous interfaces for interacting with LlamaCpp's completion and chat endpoints.

Features

  • Synchronous and Asynchronous API:
    • LlamaCppClient.completion: Synchronous API for blocking calls to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.v1_chat_completions: Synchronous API for synchronous, blocking calls, to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
    • LlamaCppClient.async_completion: Asynchronous API for non-blocking, streaming responses to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.async_v1_chat_completions: Asynchronous API for non-blocking, streaming responses to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
  • Multiple Endpoints Support: Load-balance requests across multiple LlamaCpp server endpoints.
  • Easy Configuration: Use LlamaCppServerAddress to specify server host and port.

Installation

pip install .

Or add to your pyproject.toml dependencies:

llamacpp_client = { path = "path/to/llamacpp_client" }

Usage

Synchronous example:

from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

endpoints = [
    LlamaCppServerAddress(host="localhost", port=8080)
]
client = LlamaCppClient(endpoints)
response = client.completion(prompt="Hello, world!")
print(response)

Asynchronous example:

import asyncio
from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

async def main():
    endpoints = [
        LlamaCppServerAddress(host="localhost", port=8080)
    ]
    client = LlamaCppClient(endpoints)
    async for chunk in client.async_completion(prompt="Hello, world!"):
        print(chunk.decode(), end="")

asyncio.run(main())

Requirements

License

MIT

-- Developed by Luís Gomes luismsgomes@gmail.com.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llamacpp_client-0.2.2.tar.gz (5.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llamacpp_client-0.2.2-py3-none-any.whl (5.7 kB view details)

Uploaded Python 3

File details

Details for the file llamacpp_client-0.2.2.tar.gz.

File metadata

  • Download URL: llamacpp_client-0.2.2.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for llamacpp_client-0.2.2.tar.gz
Algorithm Hash digest
SHA256 bf72a1aa53a4286c64c4b4a7dd83831d157fdb88c8c2ab4a4ed01722fffa8428
MD5 a1c659ae5e343fec2502ad4f167d2db6
BLAKE2b-256 5fcbf24606866b71086672caab7c8a263a5cb400b8bd0a89e2611e18e501fd91

See more details on using hashes here.

File details

Details for the file llamacpp_client-0.2.2-py3-none-any.whl.

File metadata

File hashes

Hashes for llamacpp_client-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 a52a88c9828cd16024e2f691adbc4765b01dbec9dbeb1fe5c9b50a5570cabec3
MD5 8fedfc8a884f083bb1d730eb7b34f03d
BLAKE2b-256 108005eabcd5d25784315637e6494c98bf8fd7128fc2eaf2416a7670cd419269

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page