Client for LlamaCpp HTTP Server

llamacpp_client

llamacpp_client is a Python client library designed to simplify communication with the LlamaCpp HTTP server (as provided by the official Docker image). It provides both synchronous and asynchronous interfaces for interacting with LlamaCpp's completion and chat endpoints.

Features

  • Synchronous and Asynchronous API:
    • LlamaCppClient.completion: Synchronous, blocking call to the /completion endpoint of a LlamaCpp server.
    • LlamaCppClient.v1_chat_completions: Synchronous, blocking call to the OpenAI-compatible /v1/chat/completions endpoint of a LlamaCpp server (or any other OpenAI-compatible server).
    • LlamaCppClient.async_completion: Asynchronous, non-blocking call that streams the response from the /completion endpoint of a LlamaCpp server.
    • LlamaCppClient.async_v1_chat_completions: Asynchronous, non-blocking call that streams the response from the OpenAI-compatible /v1/chat/completions endpoint of a LlamaCpp server (or any other OpenAI-compatible server).
  • Multiple Endpoints Support: Load-balance requests across multiple LlamaCpp server endpoints.
  • Easy Configuration: Use LlamaCppServerAddress to specify server host and port.
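
The README does not say how requests are distributed across endpoints. As an illustration only, the sketch below assumes a simple round-robin strategy. ServerAddress and RoundRobinPool are hypothetical stand-ins written for this example, not classes from llamacpp_client, and the library's actual strategy may differ.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class ServerAddress:
    """Hypothetical stand-in for LlamaCppServerAddress (host and port only)."""
    host: str
    port: int

    @property
    def base_url(self) -> str:
        return f"http://{self.host}:{self.port}"


class RoundRobinPool:
    """Hands out endpoints in turn. Round-robin is an assumed strategy,
    not necessarily what llamacpp_client does internally."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = cycle(endpoints)

    def next_endpoint(self) -> ServerAddress:
        return next(self._cycle)


pool = RoundRobinPool([
    ServerAddress(host="localhost", port=8080),
    ServerAddress(host="localhost", port=8081),
])

# Three consecutive requests would go to 8080, 8081, then 8080 again.
picked = [pool.next_endpoint().base_url for _ in range(3)]
print(picked)
```

Round-robin is the simplest policy to reason about. A real deployment might instead prefer the least-busy server or skip endpoints that fail health checks.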

Installation

pip install .

Or add it as a path dependency in your pyproject.toml (Poetry-style syntax):

llamacpp_client = { path = "path/to/llamacpp_client" }

Usage

Synchronous example:

from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

endpoints = [
    LlamaCppServerAddress(host="localhost", port=8080)
]
client = LlamaCppClient(endpoints)
response = client.completion(prompt="Hello, world!")
print(response)
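
For orientation, a call like the one above presumably wraps an HTTP POST to the server's /completion endpoint. The sketch below builds such a request with the standard library only. The helper name build_completion_request is hypothetical, and the payload fields (prompt, n_predict) follow the llama.cpp server's documented request format rather than anything stated in this README. The request is constructed but not sent, so the snippet runs without a server.

```python
import json
import urllib.request


def build_completion_request(host: str, port: int, prompt: str,
                             n_predict: int = 64) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /completion endpoint."""
    # n_predict caps the number of generated tokens; 64 is an arbitrary default.
    payload = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        url=f"http://{host}:{port}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request("localhost", 8080, "Hello, world!")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it against a running server (not executed here):
# with urllib.request.urlopen(request) as resp:
#     print(json.loads(resp.read())["content"])
```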

Asynchronous example:

import asyncio
from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

async def main():
    endpoints = [
        LlamaCppServerAddress(host="localhost", port=8080)
    ]
    client = LlamaCppClient(endpoints)
    async for chunk in client.async_completion(prompt="Hello, world!"):
        print(chunk.decode(), end="")

asyncio.run(main())
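
The example above prints each chunk after decoding it, which implies that chunks arrive as bytes. The README does not say whether those bytes are plain text or raw server-sent events. Assuming the latter (the llama.cpp server streams lines of the form `data: {json}`), a minimal parser could look like the sketch below. iter_sse_events is a hypothetical helper, and the byte chunks are simulated so the snippet runs without a server.

```python
import json


def iter_sse_events(chunks):
    """Yield decoded JSON payloads from an iterable of raw byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Process every complete line; an incomplete tail stays in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if not line.startswith(b"data:"):
                continue  # skip blank separator lines and non-data fields
            data = line[len(b"data:"):].strip()
            if data == b"[DONE]":
                return  # OpenAI-style end-of-stream marker
            yield json.loads(data)


# Simulated stream: the second event is deliberately split across two chunks.
chunks = [
    b'data: {"content": "Hello", "stop": false}\n\n',
    b'data: {"content": ", wor',
    b'ld!", "stop": true}\n\n',
]
text = "".join(event["content"] for event in iter_sse_events(chunks))
print(text)
```

Buffering until a newline arrives matters because network chunk boundaries do not have to line up with event boundaries.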

Requirements

License

MIT

-- Developed by Luís Gomes luismsgomes@gmail.com.
