EchoSwift: LLM Inference Benchmarking Tool

EchoSwift is a powerful and flexible tool designed for benchmarking Large Language Model (LLM) inference. It allows users to measure and analyze the performance of LLM endpoints across various metrics, including latency, throughput, and time to first token (TTFT).

Features

  • Benchmark LLM inference across multiple providers (e.g., Ollama, vLLM, TGI)
  • Measure key performance metrics: latency, throughput, and TTFT
  • Support for varying input and output token lengths
  • Simulate concurrent users to test scalability
  • Easy-to-use CLI
  • Detailed logging and progress tracking

Performance Metrics

For each combination of input token length, output token length, and number of parallel users, the benchmark captures the following metrics:

  • Latency (ms/token)
  • TTFT (ms)
  • Throughput (tokens/sec)
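
As a rough illustration of how these three metrics relate, the sketch below derives them from per-request timestamps. The `RequestTiming` class and the exact formulas are assumptions made for this example; EchoSwift's own definitions may differ (for instance, in whether TTFT is included in per-token latency).

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Timestamps (in seconds) for one streamed request. Hypothetical structure."""
    sent_at: float          # when the request was sent
    first_token_at: float   # when the first output token arrived
    finished_at: float      # when the last output token arrived
    output_tokens: int      # number of tokens generated


def ttft_ms(t: RequestTiming) -> float:
    """Time to first token, in milliseconds."""
    return (t.first_token_at - t.sent_at) * 1000.0


def latency_ms_per_token(t: RequestTiming) -> float:
    """End-to-end time divided by the number of output tokens (assumed definition)."""
    return (t.finished_at - t.sent_at) * 1000.0 / t.output_tokens


def throughput_tokens_per_sec(t: RequestTiming) -> float:
    """Output tokens generated per second of end-to-end time (assumed definition)."""
    return t.output_tokens / (t.finished_at - t.sent_at)


# Illustrative values only, not measured results.
timing = RequestTiming(sent_at=0.0, first_token_at=0.25, finished_at=4.0, output_tokens=200)
print(ttft_ms(timing))                    # 250.0
print(latency_ms_per_token(timing))       # 20.0
print(throughput_tokens_per_sec(timing))  # 50.0
```

Under these definitions, per-token latency in milliseconds and single-request throughput are reciprocals of each other (scaled by 1000), while TTFT is measured independently.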

Installation

You can install EchoSwift using pip:

pip install echoswift

Alternatively, you can install from source:

git clone --branch akhil https://github.com/Infobellit-Solutions-Pvt-Ltd/EchoSwift.git
cd EchoSwift
pip install -e .

Requirements

  • Python 3.10+
  • Dependencies listed in requirements.txt

Usage

EchoSwift provides a simple CLI for running benchmarks. Here are the main commands:

1. Download and Filter Dataset

Before running a benchmark, you need to download and filter the dataset:

echoswift dataprep

This command downloads the ShareGPT dataset and filters it by input token length.
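
To give a feel for what filtering by input token length can look like, here is a minimal sketch. It is not EchoSwift's implementation: `count_tokens` is a whitespace-based stand-in for a real tokenizer, and `filter_by_length` with its `tolerance` parameter is a hypothetical helper.

```python
def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words (assumption)."""
    return len(text.split())


def filter_by_length(prompts, target, tolerance=0.1):
    """Keep prompts whose token count is within +/- tolerance of the target."""
    low = target * (1 - tolerance)
    high = target * (1 + tolerance)
    return [p for p in prompts if low <= count_tokens(p) <= high]


# Synthetic prompts of 30, 32, and 64 "tokens".
prompts = ["word " * 30, "word " * 32, "word " * 64]

selected = filter_by_length(prompts, target=32)
print(len(selected))  # 2
```

A real pipeline would count tokens with the tokenizer of the model under test, since word counts and token counts can differ substantially.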

2. Configure the Benchmark

Create or modify the config.yaml file in the project root directory. Here's an example configuration:

out_dir: "results"
base_url: "http://localhost:11434/api/generate"
provider: "Ollama"
model: "llama2" # Model is required for Ollama and vLLM
max_requests: 5
user_counts: [1, 3, 10]
input_tokens: [32]
output_tokens: [256]

Adjust these parameters according to your needs and the LLM endpoint you're benchmarking.
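
The list-valued fields define a test matrix: results are reported for each combination of users, input tokens, and output tokens. The sketch below enumerates those combinations for the example values. The config is mirrored as a plain Python dict to avoid depending on a YAML parser, and `benchmark_matrix` is a hypothetical helper, not part of EchoSwift.

```python
from itertools import product

# Mirrors the list-valued fields of the example config.yaml.
config = {
    "user_counts": [1, 3, 10],
    "input_tokens": [32],
    "output_tokens": [256],
}


def benchmark_matrix(cfg):
    """Return every (users, input_tokens, output_tokens) combination."""
    return list(product(cfg["user_counts"], cfg["input_tokens"], cfg["output_tokens"]))


for users, in_tok, out_tok in benchmark_matrix(config):
    print(f"users={users} input_tokens={in_tok} output_tokens={out_tok}")
```

With the example values this yields three combinations. Adding entries to any of the three lists multiplies the number of combinations, so the total benchmark time can grow quickly.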

3. Run the Benchmark

To start the benchmark using the configuration from config.yaml:

echoswift start

If you want to use a different configuration file:

echoswift start --config path/to/your/config.yaml
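
If you prefer to launch runs from a script, for example to loop over several configuration files, something like the following may work. It only wraps the two CLI invocations shown above; `build_command` and `run_benchmark` are hypothetical helper names, and the config path is a placeholder.

```python
import subprocess
from typing import List, Optional


def build_command(config_path: Optional[str] = None) -> List[str]:
    """Build the argument list for `echoswift start`, optionally with --config."""
    cmd = ["echoswift", "start"]
    if config_path is not None:
        cmd += ["--config", config_path]
    return cmd


def run_benchmark(config_path: Optional[str] = None) -> int:
    """Run the benchmark and return the process exit code.

    Requires the `echoswift` executable to be on PATH.
    """
    return subprocess.run(build_command(config_path), check=False).returncode


# Only prints the command; call run_benchmark(...) to actually execute it.
print(build_command("path/to/your/config.yaml"))
```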

Output

EchoSwift will create a results directory (or the directory specified in out_dir) containing:

  • CSV files with raw benchmark data
  • Averaged results for each combination of users, input tokens, and output tokens
  • Log files for each Locust run

Analyzing Results

After the benchmark completes, you can find detailed CSV files in the output directory. These files contain information about latency, throughput, and TTFT for each test configuration.
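
As a starting point for your own analysis, the sketch below averages one metric per user count using only the standard library. The column names (`users`, `ttft_ms`, and so on) and the sample rows are assumptions made for illustration; check the header row of your actual CSV files and adjust accordingly.

```python
import csv
import io
from statistics import mean

# Synthetic placeholder rows, not real benchmark results.
# Column names are hypothetical.
SAMPLE_CSV = """users,ttft_ms,latency_ms_per_token,throughput_tokens_per_sec
1,100,20,50
1,120,22,46
3,200,30,34
"""


def average_by_users(csv_file, column):
    """Group rows by the `users` column and average the given metric column."""
    groups = {}
    for row in csv.DictReader(csv_file):
        groups.setdefault(int(row["users"]), []).append(float(row[column]))
    return {users: mean(values) for users, values in groups.items()}


averages = average_by_users(io.StringIO(SAMPLE_CSV), "ttft_ms")
print(averages)  # {1: 110.0, 3: 200.0}
```

To use it on a real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle, for example `open(path, newline="")`.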

Citation

If you find our resource useful, please cite our paper:

EchoSwift: An Inference Benchmarking and Configuration Discovery Tool for Large Language Models (LLMs)

@inproceedings{Krishna2024,
  series = {ICPE '24},
  title = {EchoSwift: An Inference Benchmarking and Configuration Discovery Tool for Large Language Models (LLMs)},
  url = {https://dl.acm.org/doi/10.1145/3629527.3652273},
  DOI = {10.1145/3629527.3652273},
  booktitle = {Companion of the 15th ACM/SPEC International Conference on Performance Engineering},
  publisher = {ACM},
  author = {Krishna, Karthik and Bandili, Ramana},
  year = {2024},
  month = May,
  collection = {ICPE '24}
}

Support

If you encounter any issues or have questions, please open an issue on our GitHub repository.
