Client for LlamaCpp HTTP Server

llamacpp_client

llamacpp_client is a Python client library designed to simplify communication with the LlamaCpp HTTP server (as provided by the official Docker image). It provides both synchronous and asynchronous interfaces for interacting with LlamaCpp's completion and chat endpoints.

Features

  • Synchronous and Asynchronous API:
    • LlamaCppClient.completion: Synchronous, blocking calls to the /completion endpoint of a LlamaCpp server.
    • LlamaCppClient.v1_chat_completions: Synchronous, blocking calls to the OpenAI-compatible /v1/chat/completions endpoint of a LlamaCpp server (or any other OpenAI-compatible server).
    • LlamaCppClient.async_completion: Asynchronous, non-blocking calls to the /completion endpoint of a LlamaCpp server, with streamed responses.
    • LlamaCppClient.async_v1_chat_completions: Asynchronous, non-blocking calls to the OpenAI-compatible /v1/chat/completions endpoint of a LlamaCpp server (or any other OpenAI-compatible server), with streamed responses.
  • Multiple Endpoints Support: Load-balance requests across multiple LlamaCpp server endpoints.
  • Easy Configuration: Use LlamaCppServerAddress to specify server host and port.
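
The README does not say which load-balancing strategy the client uses. As an illustration only, the sketch below shows one common approach, round-robin selection, under the assumption that each endpoint is just a host and a port. ServerAddress and RoundRobin are hypothetical names invented for this example; they are not part of llamacpp_client.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class ServerAddress:
    """Hypothetical stand-in for LlamaCppServerAddress (host and port only)."""
    host: str
    port: int

    def url(self, path: str) -> str:
        # Build the full URL for an endpoint path such as "/completion".
        return f"http://{self.host}:{self.port}{path}"


class RoundRobin:
    """Hands out endpoints in turn, wrapping around after the last one."""

    def __init__(self, endpoints):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = cycle(endpoints)

    def next_url(self, path: str) -> str:
        return next(self._cycle).url(path)


balancer = RoundRobin([
    ServerAddress("localhost", 8080),
    ServerAddress("localhost", 8081),
])
# Three consecutive requests alternate between the two servers.
urls = [balancer.next_url("/completion") for _ in range(3)]
print(urls)
```

Round-robin needs no knowledge of server load, which keeps it simple; the library itself may well use a different policy.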

Installation

pip install .

Or add it as a local path dependency in your pyproject.toml:

llamacpp_client = { path = "path/to/llamacpp_client" }

Usage

Synchronous example:

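from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

# One address per LlamaCpp server; list several to spread requests across them.
endpoints = [
    LlamaCppServerAddress(host="localhost", port=8080)
]
client = LlamaCppClient(endpoints)
# Blocking call to the server's /completion endpoint.
response = client.completion(prompt="Hello, world!")
print(response)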
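
For orientation, the sketch below builds the kind of HTTP request that a llama.cpp server's /completion endpoint accepts: a JSON body with a "prompt" field and an optional "n_predict" token limit. It uses only the standard library and does not send anything. Whether llamacpp_client builds its requests exactly this way is an assumption; build_completion_request is a hypothetical helper written for this example.

```python
import json
import urllib.request


def build_completion_request(host, port, prompt, n_predict=64):
    """Hypothetical helper: prepare (but do not send) a POST to /completion."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request("localhost", 8080, "Hello, world!")
print(request.get_method(), request.full_url)

# Sending it requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["content"])
```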

Asynchronous example:

import asyncio
from llamacpp_client import LlamaCppClient, LlamaCppServerAddress

async def main():
    endpoints = [
        LlamaCppServerAddress(host="localhost", port=8080)
    ]
    client = LlamaCppClient(endpoints)
    # Chunks are streamed as bytes; print each one as soon as it arrives.
    async for chunk in client.async_completion(prompt="Hello, world!"):
        print(chunk.decode(), end="")

# Run the coroutine to completion on a fresh event loop.
asyncio.run(main())
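
The asynchronous example decodes each chunk on its own. If a chunk boundary were ever to fall inside a multi-byte UTF-8 character, that per-chunk decode would raise an error. Whether the client can split characters this way is not stated, so the sketch below is a defensive pattern rather than a required one. It uses an incremental decoder from the standard library, and fake_stream is a hypothetical stand-in for client.async_completion(...) so that the example runs without a server.

```python
import asyncio
import codecs


async def fake_stream():
    """Hypothetical stand-in for client.async_completion(...): yields bytes."""
    data = "Olá, mundo!".encode("utf-8")
    # Split inside the two-byte "á" to simulate an unlucky chunk boundary.
    for chunk in (data[:3], data[3:]):
        yield chunk


async def collect_text(stream):
    """Decode a stream of byte chunks without breaking multi-byte characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = []
    async for chunk in stream:
        # The decoder buffers an incomplete trailing byte sequence until
        # the rest of the character arrives in a later chunk.
        parts.append(decoder.decode(chunk))
    # Flush the decoder; this raises if the stream ended mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


text = asyncio.run(collect_text(fake_stream()))
print(text)
```

To use this with the real client, pass client.async_completion(prompt=...) to collect_text in place of fake_stream().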

Requirements

License

MIT

Developed by Luís Gomes (luismsgomes@gmail.com).
