A client for interacting with LLM APIs.

llm-api-client 🤖⚡

A Python helper library for efficiently managing concurrent, rate-limited API requests, especially for Large Language Models (LLMs) via LiteLLM.

It provides an APIClient that handles:

  • Concurrency: Making multiple API calls simultaneously using threads.
  • Rate Limiting: Respecting API limits for requests per minute (RPM) and tokens per minute (TPM).
  • Retries: Automatically retrying failed requests.
  • Request Sanitization: Cleaning up request parameters to ensure compatibility with different models/providers.
  • Context Management: Truncating message history to fit within model context windows.
  • Usage Tracking: Monitoring API costs, token counts, and response times via an integrated APIUsageTracker.
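To make the concurrency and rate-limiting bullets concrete, here is a minimal, self-contained sketch of how RPM and TPM caps can be enforced across threads with a sliding window. It is not the library's implementation: `SlidingWindowLimiter`, `fake_call`, and the word-count token estimate are hypothetical names and simplifications used only for illustration.

```python
import threading
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class SlidingWindowLimiter:
    """Hypothetical limiter: caps requests and tokens within a rolling window."""

    def __init__(self, max_requests, max_tokens, window_seconds=60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._clock = clock
        self._events = deque()  # (timestamp, tokens) per admitted request
        self._tokens_in_window = 0
        self._lock = threading.Lock()

    def _evict_expired(self, now):
        # Drop admitted requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Admit the request if both budgets allow it; return True on success."""
        if tokens > self.max_tokens:
            raise ValueError("request can never fit within the token budget")
        with self._lock:
            now = self._clock()
            self._evict_expired(now)
            if len(self._events) + 1 > self.max_requests:
                return False
            if self._tokens_in_window + tokens > self.max_tokens:
                return False
            self._events.append((now, tokens))
            self._tokens_in_window += tokens
            return True

    def acquire(self, tokens, poll_interval=0.05):
        """Block the calling thread until the request fits both budgets."""
        while not self.try_acquire(tokens):
            time.sleep(poll_interval)


def fake_call(limiter, prompt):
    # Stand-in for a real API call; the token estimate is a crude word count.
    limiter.acquire(len(prompt.split()))
    return prompt.upper()


limiter = SlidingWindowLimiter(max_requests=1000, max_tokens=100000)
prompts = ["Write a short poem about a cat.", "What is the capital of France?"]

# Worker threads share one limiter, so the caps apply to the whole batch.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda p: fake_call(limiter, p), prompts))
```

The limiter is shared by every worker thread, so the request and token budgets hold for the batch as a whole rather than per thread. A production limiter would also need a real tokenizer and would account for completion tokens, which this sketch ignores.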

Installation

Install the package directly from PyPI:

pip install llm-api-client

Usage

Here's a basic example of using APIClient to make multiple completion requests concurrently:

import os
from llm_api_client import APIClient

# Ensure your API key is set (e.g., OPENAI_API_KEY environment variable)
# os.environ["OPENAI_API_KEY"] = "your-api-key"

# Create a client with specific rate limits (adjust as needed)
# Defaults use OpenAI Tier 4 limits if not specified.
client = APIClient(
    max_requests_per_minute=1000,
    max_tokens_per_minute=100000
)

# Prepare your API requests
prompts = [
    "Explain the theory of relativity in simple terms.",
    "Write a short poem about a cat.",
    "What is the capital of France?",
]

requests_data = [
    {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        # Add other parameters like temperature, max_tokens etc. if needed
        # "temperature": 0.7,
        # "max_tokens": 150,
    }
    for prompt in prompts
]

# Make the requests concurrently.
# To get built-in retry logic, call make_requests_with_retries instead.
responses = client.make_requests(requests_data)

# Process the responses
for i, response in enumerate(responses):
    if response:
        # Access response content (structure depends on the API/model)
        # For OpenAI/LiteLLM completion:
        try:
            message_content = response.choices[0].message.content
            print(f"Response {i+1}: {message_content[:100]}...")  # Print first 100 chars
        except (AttributeError, IndexError, TypeError) as e:
            print(f"Response {i+1}: Could not parse response content. Error: {e}")
            print(f"Raw response: {response}")
    else:
        print(f"Response {i+1}: Request failed.")

# Access usage statistics
print("\n--- Usage Statistics ---")
print(client.tracker)  # Prints detailed stats

# Or access specific stats
print(f"Total cost: ${client.tracker.total_cost:.4f}")
print(f"Total prompt tokens: {client.tracker.total_prompt_tokens}")
print(f"Total completion tokens: {client.tracker.total_completion_tokens}")
print(f"Number of successful API calls: {client.tracker.num_api_calls}")
print(f"Mean response time: {client.tracker.mean_response_time:.2f}s")

# View request/response history
# print("\n--- History ---")
# for entry in client.history:
#     print(entry)
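
The example above mentions make_requests_with_retries for built-in retry logic. As a rough illustration of what automatic retrying involves, the following self-contained sketch retries a callable with exponential backoff and a little jitter. It is an assumption-laden sketch, not the library's code: `call_with_retries` and its parameters are hypothetical, and returning `None` after the final failure is chosen only to mirror the falsy "Request failed" check in the example.

```python
import random
import time


def call_with_retries(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Hypothetical helper: call func(), retrying on any exception.

    Waits base_delay * 2**attempt (plus up to 10% jitter) between attempts.
    Returns func()'s result, or None if every attempt failed.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                return None  # out of retries; caller sees a falsy result
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))


# Usage: a stand-in request that fails twice before succeeding.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient error")
    return "ok"


result = call_with_retries(flaky_request, base_delay=0.01)
```

Catching every exception keeps the sketch short; real retry logic would usually retry only transient failures such as timeouts or rate-limit errors and let other errors surface immediately.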
