Skip to main content

Client for LlamaCpp HTTP Server

Project description

llamacpp_client

llamacpp_client is a Python client library designed to simplify communication with the LlamaCpp HTTP server (as provided by the official Docker image). It provides both synchronous and asynchronous interfaces for interacting with LlamaCpp's completion and chat endpoints.

Features

  • Synchronous and Asynchronous API:
    • LlamaCppClient.completion: Synchronous API for blocking calls to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.v1_chat_completions: Synchronous API for synchronous, blocking calls, to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
    • LlamaCppClient.async_completion: Asynchronous API for non-blocking, streaming responses to /completion endpoint on LlamaCpp server.
    • LlamaCppClient.async_v1_chat_completions: Asynchronous API for non-blocking, streaming responses to OpenAI /v1/chat/completions endpoint on LlamaCpp server (or any other OpenAI compatible server).
  • Multiple Endpoints Support: Load-balance requests across multiple LlamaCpp server endpoints.
  • Easy Configuration: Use LlamaCppServerAddress to specify server host and port.

Installation

pip install .

Or add to your pyproject.toml dependencies:

llamacpp_client = { path = "path/to/llamacpp_client" }

Usage

Synchronous example:

from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

endpoints = [
    LlamaCppServerAddress(host="localhost", port=8080)
]
client = LlamaCppClient(endpoints)
response = client.completion(prompt="Hello, world!")
print(response)

Asynchronous example:

import asyncio
from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

async def main():
    endpoints = [
        LlamaCppServerAddress(host="localhost", port=8080)
    ]
    client = LlamaCppClient(endpoints)
    async for chunk in client.async_completion(prompt="Hello, world!"):
        print(chunk.decode(), end="")

asyncio.run(main())

Requirements

License

MIT

-- Developed by Luís Gomes luismsgomes@gmail.com.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llamacpp_client-0.1.2.tar.gz (2.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llamacpp_client-0.1.2-py3-none-any.whl (3.1 kB view details)

Uploaded Python 3

File details

Details for the file llamacpp_client-0.1.2.tar.gz.

File metadata

  • Download URL: llamacpp_client-0.1.2.tar.gz
  • Upload date:
  • Size: 2.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for llamacpp_client-0.1.2.tar.gz
Algorithm Hash digest
SHA256 a788336d087b56b4d0fab027d5fe31cfdf9b441bc82b85ee930fb54734c480fc
MD5 b653a1633e8c732050d4e4818629d18d
BLAKE2b-256 f2614012d8491d0ab52e9c029daa586ed6c02e6867cb77de67abeb981b35fe43

See more details on using hashes here.

File details

Details for the file llamacpp_client-0.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for llamacpp_client-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 dc5c1677d9fdb269cdfd963e2a4caa87e042c05707bccc992cbdd85fc2e6724e
MD5 7ca912ad1e1d7379a2fea8a148471a0a
BLAKE2b-256 75334949b90dfcf5e87ba22bcca01dc595a67ba51a126e88bbbcad34752dc18b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page