Skip to main content

Client for LlamaCpp HTTP Server

Project description

llamacpp_client

llamacpp_client is a Python client library designed to simplify communication with the LlamaCpp HTTP server (as provided by the official Docker image). It provides both synchronous and asynchronous interfaces for interacting with LlamaCpp's completion and chat endpoints.

Features

  • Synchronous and Asynchronous API:
    • LlamaCppClient.completion: Synchronous API for blocking calls to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.v1_chat_completions: Synchronous API for synchronous, blocking calls, to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
    • LlamaCppClient.async_completion: Asynchronous API for non-blocking, streaming responses to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.async_v1_chat_completions: Asynchronous API for non-blocking, streaming responses to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
  • Multiple Endpoints Support: Load-balance requests across multiple LlamaCpp server endpoints.
  • Easy Configuration: Use LlamaCppServerAddress to specify server host and port.

Installation

pip install .

Or add to your pyproject.toml dependencies:

llamacpp_client = { path = "path/to/llamacpp_client" }

Usage

Synchronous example:

from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

endpoints = [
    LlamaCppServerAddress(host="localhost", port=8080)
]
client = LlamaCppClient(endpoints)
response = client.completion(prompt="Hello, world!")
print(response)

Asynchronous example:

import asyncio
from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

async def main():
    endpoints = [
        LlamaCppServerAddress(host="localhost", port=8080)
    ]
    client = LlamaCppClient(endpoints)
    async for chunk in client.async_completion(prompt="Hello, world!"):
        print(chunk.decode(), end="")

asyncio.run(main())

Requirements

License

MIT

-- Developed by Luís Gomes luismsgomes@gmail.com.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llamacpp_client-0.2.1.tar.gz (5.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llamacpp_client-0.2.1-py3-none-any.whl (5.7 kB view details)

Uploaded Python 3

File details

Details for the file llamacpp_client-0.2.1.tar.gz.

File metadata

  • Download URL: llamacpp_client-0.2.1.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for llamacpp_client-0.2.1.tar.gz
Algorithm Hash digest
SHA256 ea94c156520fe703cdef58a11914d3f93a9802edd767081fc9d551fee809d4c1
MD5 d55a8ed4d66b11de8927a64240525ab5
BLAKE2b-256 16516629fe42d25edb84b871fac3729b262c86489f4f05c80b090d7ede9f4aa6

See more details on using hashes here.

File details

Details for the file llamacpp_client-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for llamacpp_client-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 3039e5c2c66f07022a6a5857086c6bdd2b643ce8c28b2f1e474be2ff0df9af05
MD5 224c25094f4b02e58ee41eff976d5ba8
BLAKE2b-256 3e41e65e9ad6ca38e59045cb2ab44de01e10356e253b080afc14909a45638350

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page