Benchmark the performance (output speed, latency) of OpenAI compatible endpoints
LLM Performance Benchmark
Benchmark the performance of OpenAI compatible APIs in terms of Time to First Token (commonly referred to as latency) and Output Tokens per Second.
Background
This is the benchmarking script used by Artificial Analysis for our performance benchmarks.
For performance benchmarks across >150 endpoints, including varied prompt lengths and parallel queries, visit the Artificial Analysis API Provider Performance Leaderboard.
Install
# Install from PyPI
pip install llm_performance_benchmark
Usage
import os
from dotenv import load_dotenv
from llm_performance_benchmark import llm_performance_benchmark

# Configure endpoint
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
base_url = "https://api.openai.com/v1"
model_id = "gpt-4-turbo"

# Configure prompt
user_prompt = "What is the purpose of life?"
system_prompt = "You are a helpful assistant."

# Run benchmark
result = llm_performance_benchmark(
    model_id=model_id,
    user_prompt=user_prompt,
    system_prompt=system_prompt,
    api_key=api_key,
    base_url=base_url,
    print_response=True,
)
print(result)
""" Example response:
{
'total_time': 8.9,
'time_to_first_token': 0.6,
'output_tokens_per_second': 39.4,
'tokens_per_second_across_total_request': 36.4,
'response_text: "..."
}
"""
Definitions of output metrics
For definitions of what the metrics represent, see the Artificial Analysis methodology page: https://artificialanalysis.ai/methodology. To standardize token counting across different models, text is tokenized as cl100k_base tokens (the tokenizer used by GPT-4) with tiktoken.
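As a rough illustration of that standardization step (not the package's internal code), counting cl100k_base tokens with tiktoken looks like the sketch below. The speed calculations are an assumed relationship between the reported metrics, measuring output speed over the window after the first token, which is consistent with the example numbers above.

import tiktoken

# Count output tokens in the standardized way (cl100k_base, used by GPT-4)
encoding = tiktoken.get_encoding("cl100k_base")
response_text = "..."  # text returned by the endpoint
output_tokens = len(encoding.encode(response_text))

# Illustrative relationship between the metrics (assumed, not the package's exact formula)
total_time = 8.9             # seconds, from the example response above
time_to_first_token = 0.6    # seconds
output_tokens_per_second = output_tokens / (total_time - time_to_first_token)
tokens_per_second_across_total_request = output_tokens / total_time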
License
MIT License
Copyright (c) 2024 Artificial Analysis, Inc.
See the LICENSE file for further details.
File details
Details for the file llm_performance_benchmark-1.0.1.tar.gz.
File metadata
- Download URL: llm_performance_benchmark-1.0.1.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | ff114ef0786ccc030d6ec0f07f8c399cd3ada7e3dc749e201e54fcee65bc970b
MD5 | 076e11d14df1bac6d6390cc6b1cbe810
BLAKE2b-256 | 9489f0c6ee1a2b24bf120e690086d2ae369365ab260a0e8ba2a1b09b716ca9db
File details
Details for the file llm_performance_benchmark-1.0.1-py3-none-any.whl.
File metadata
- Download URL: llm_performance_benchmark-1.0.1-py3-none-any.whl
- Upload date:
- Size: 4.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 04be536cf7d7feeac13bccb34cfda421862ba3541ad83fdba6626809de9ae8e1
MD5 | 6de90a5acebe156b13cd25ed4248c97b
BLAKE2b-256 | d1c913eafe337f7882be675add814f5fd3301950f97e6541c0a9fdfa7f0710e3