Benchmark the performance (output speed, latency) of OpenAI compatible endpoints

Project description

LLM Performance Benchmark

Benchmark the performance of OpenAI compatible APIs in terms of Time to First Token (commonly referred to as latency) and Output Tokens per Second.
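
For illustration, the sketch below shows one way these two metrics can be measured against any OpenAI compatible endpoint via a streaming request. It assumes the official openai Python client (v1+) and tiktoken, with placeholder credentials and model id; it is a minimal sketch of the measurement approach, not this package's internal implementation.

import time
import tiktoken
from openai import OpenAI

# Placeholder credentials and model id - substitute your own.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.openai.com/v1")

start = time.monotonic()
first_token_at = None
pieces = []

stream = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "What is the purpose of life?"}],
    stream=True,
)
for chunk in stream:
    # Some providers emit chunks with no choices or empty content; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.monotonic()  # first token received
        pieces.append(chunk.choices[0].delta.content)
end = time.monotonic()

time_to_first_token = first_token_at - start
# Output speed: tokens received after the first token, per second of generation time.
n_tokens = len(tiktoken.get_encoding("cl100k_base").encode("".join(pieces)))
output_tokens_per_second = n_tokens / (end - first_token_at)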

Background

This is the benchmarking script used by Artificial Analysis for our performance benchmarks.

For performance benchmarks across >150 endpoints, including varied prompt lengths and parallel queries, visit the Artificial Analysis API Provider Performance Leaderboard.

Install

# Install from PyPI
pip install llm_performance_benchmark

Usage

from llm_performance_benchmark import llm_performance_benchmark

# Configure endpoint
import os
from dotenv import load_dotenv
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
base_url = "https://api.openai.com/v1"
model_id = "gpt-4-turbo"

# Configure prompt
user_prompt = "What is the purpose of life?"
system_prompt = "You are a helpful assistant."

# Run benchmark
result = llm_performance_benchmark(model_id=model_id, 
                                   user_prompt=user_prompt, 
                                   system_prompt=system_prompt, 
                                   api_key=api_key, 
                                   base_url=base_url,
                                   print_response=True)
print(result)
""" Example response:
{
    'total_time': 8.9, 
    'time_to_first_token': 0.6, 
    'output_tokens_per_second': 39.4, 
    'tokens_per_second_across_total_request': 36.4,
    'response_text': "..."
}
"""

Definitions of output metrics

For definitions of what the metrics represent, see the Artificial Analysis methodology page: https://artificialanalysis.ai/methodology. To standardize token counting across different models, text is tokenized as cl100k_base tokens (the tokenizer used by GPT-4) with Tiktoken.
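
For example, the token counting can be reproduced directly with Tiktoken:

# Count tokens the same way the benchmark does: cl100k_base via tiktoken.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
tokens = encoding.encode("What is the purpose of life?")
print(len(tokens))  # number of cl100k_base tokens in the text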

License

MIT License

Copyright (c) 2024 Artificial Analysis, Inc.

See the LICENSE file for further details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_performance_benchmark-1.0.tar.gz (4.2 kB)

Uploaded Source

Built Distributions

llm_performance_benchmark-1.0-py3-none-any.whl (4.8 kB)

Uploaded Python 3

LLM_Performance_Benchmark-1.0-py3-none-any.whl (4.8 kB)

Uploaded Python 3

File details

Details for the file llm_performance_benchmark-1.0.tar.gz.

File metadata

File hashes

Hashes for llm_performance_benchmark-1.0.tar.gz
Algorithm Hash digest
SHA256 ff1bff5ed4267e90226597c5b30632e111584975da1db7ed2b39206deba70e26
MD5 a0bb3dbe7f144a5c234af68ab5a64f93
BLAKE2b-256 c20a7d8f0aa54fadf007854c23303c7809bf9e84f00d3743e2595c48a3767c08


File details

Details for the file llm_performance_benchmark-1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for llm_performance_benchmark-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cf1694c1e2c0a3fe7e966233eb97906b173815ea089b53eb4b9c0c9db2dfe64d
MD5 8e594cb38a5a08104355179194f9c6d4
BLAKE2b-256 3cdaf532e8a76734e5081206a2cfc1c45442490696e318f27a9240f82dfcb84a


File details

Details for the file LLM_Performance_Benchmark-1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for LLM_Performance_Benchmark-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dfc7e7a49157f241877283886df2a451f10586ee755bdfc23b4a62467f65eec1
MD5 ae78a4d20c02330598a59da2512adabe
BLAKE2b-256 53c0b1220047e616ffcb18162fa01bf58447511d4da20489f1e7da6f2317c0cd

