Benchmark the performance (output speed, latency) of OpenAI-compatible endpoints

Project description

LLM Performance Benchmark

Benchmark the performance of OpenAI-compatible APIs in terms of Time to First Token (commonly referred to as latency) and Output Tokens per Second.
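
Conceptually, both metrics come from timing a streaming request: the latency clock stops when the first token arrives, and output speed is tokens generated per unit time. A minimal sketch of that idea, assuming the official openai Python SDK (this illustrates the definitions, not the package's exact implementation):

import time

from openai import OpenAI

# Illustrative sketch only (not the package's internals): measuring the two
# metrics with a raw streaming request via the official openai SDK.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "What is the purpose of life?"}],
    stream=True,
)

time_to_first_token = None
chunks = []
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if time_to_first_token is None:
            # Latency: seconds from sending the request to the first token
            time_to_first_token = time.perf_counter() - start
        chunks.append(chunk.choices[0].delta.content)
total_time = time.perf_counter() - start

response_text = "".join(chunks)
# Output speed: output tokens divided by generation time; the token count
# comes from a tokenizer such as tiktoken (see "Definitions" below).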

Background

This is the benchmarking script used by Artificial Analysis for our performance benchmarks.

For performance benchmarks across more than 150 endpoints, including benchmarks across varied prompt lengths and parallel queries, visit the Artificial Analysis API Provider Performance Leaderboard.

Install

# Install from PyPI
pip install llm_performance_benchmark

Usage

import os

from dotenv import load_dotenv  # requires the python-dotenv package
from llm_performance_benchmark import llm_performance_benchmark

# Configure endpoint (reads OPENAI_API_KEY from a .env file or the environment)
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
base_url = "https://api.openai.com/v1"
model_id = "gpt-4-turbo"

# Configure prompt
user_prompt = "What is the purpose of life?"
system_prompt = "You are a helpful assistant."

# Run benchmark
result = llm_performance_benchmark(model_id=model_id,
                                   user_prompt=user_prompt,
                                   system_prompt=system_prompt,
                                   api_key=api_key,
                                   base_url=base_url,
                                   print_response=True)
print(result)
""" Example response:
{
    'total_time': 8.9, 
    'time_to_first_token': 0.6, 
    'output_tokens_per_second': 39.4, 
    'tokens_per_second_across_total_request': 36.4,
    'response_text': "..."
}
"""

Definitions of output metrics

For definitions of the reported metrics, see the Artificial Analysis methodology page: https://artificialanalysis.ai/methodology. To standardize token counting across different models, text is tokenized as cl100k_base tokens (the tokenizer used by GPT-4) with tiktoken.
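
For reference, counting cl100k_base tokens with tiktoken (installed separately via pip install tiktoken) looks like this:

import tiktoken

# cl100k_base is the tokenizer used by GPT-4
enc = tiktoken.get_encoding("cl100k_base")
print(len(enc.encode("What is the purpose of life?")))  # token count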

License

MIT License

Copyright (c) 2024 Artificial Analysis, Inc.

See the LICENSE file for further details.

Download files

Download the file for your platform.

Source Distribution

llm_performance_benchmark-1.0.1.tar.gz (4.2 kB)

Built Distribution

llm_performance_benchmark-1.0.1-py3-none-any.whl

File details

Details for the file llm_performance_benchmark-1.0.1.tar.gz.

File hashes

Hashes for llm_performance_benchmark-1.0.1.tar.gz:
SHA256: ff114ef0786ccc030d6ec0f07f8c399cd3ada7e3dc749e201e54fcee65bc970b
MD5: 076e11d14df1bac6d6390cc6b1cbe810
BLAKE2b-256: 9489f0c6ee1a2b24bf120e690086d2ae369365ab260a0e8ba2a1b09b716ca9db
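
As a sketch, a downloaded file can be checked against the SHA256 digest above using Python's standard-library hashlib:

import hashlib

# Compare the local file's digest against the published SHA256
with open("llm_performance_benchmark-1.0.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

expected = "ff114ef0786ccc030d6ec0f07f8c399cd3ada7e3dc749e201e54fcee65bc970b"
print("OK" if digest == expected else "MISMATCH")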

File details

Details for the file llm_performance_benchmark-1.0.1-py3-none-any.whl.

File hashes

Hashes for llm_performance_benchmark-1.0.1-py3-none-any.whl:
SHA256: 04be536cf7d7feeac13bccb34cfda421862ba3541ad83fdba6626809de9ae8e1
MD5: 6de90a5acebe156b13cd25ed4248c97b
BLAKE2b-256: d1c913eafe337f7882be675add814f5fd3301950f97e6541c0a9fdfa7f0710e3
