
Unified, accurate, and beautiful LLM Benchmarking

| User Guide | Contribution Guideline |

Introduction

GenAI Bench is a benchmark tool for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

It provides detailed insights into model serving performance, offering both a user-friendly CLI and a live UI for real-time progress monitoring.

Features

  • 🛠️ CLI Tool: Validates user inputs and initiates benchmarks seamlessly.
  • 📊 Live UI Dashboard: Displays current progress, logs, and real-time metrics.
  • 📝 Rich Logs: Automatically flushed to both terminal and file upon experiment completion.
  • 📈 Experiment Analyzer: Generates Excel reports with pricing and raw metrics data, plus configurable plots (default 2x4 grid) of key performance metrics across traffic scenarios and concurrency levels: throughput, latency (time to first token, TTFT; end-to-end, E2E; time per output token, TPOT), error rates, and requests per second (RPS). Supports custom plot layouts and multi-line comparisons.
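
As a rough illustration of the latency metrics named above, the sketch below computes TTFT, E2E latency, and TPOT from per-token arrival timestamps of a single streamed request. It uses commonly cited definitions and is not taken from GenAI Bench's source; the `RequestTrace` structure and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps (in seconds) for one streamed request. Hypothetical structure."""
    sent_at: float            # when the request was sent
    token_times: List[float]  # arrival time of each output token, in order


def ttft(trace: RequestTrace) -> float:
    """Time to first token: delay between sending and the first output token."""
    return trace.token_times[0] - trace.sent_at


def e2e_latency(trace: RequestTrace) -> float:
    """End-to-end latency: delay between sending and the last output token."""
    return trace.token_times[-1] - trace.sent_at


def tpot(trace: RequestTrace) -> float:
    """Time per output token: (E2E - TTFT) / (number of output tokens - 1)."""
    n = len(trace.token_times)
    if n < 2:
        return 0.0  # undefined for a single token; 0.0 is an arbitrary choice here
    return (e2e_latency(trace) - ttft(trace)) / (n - 1)


if __name__ == "__main__":
    trace = RequestTrace(sent_at=0.0, token_times=[0.5, 0.75, 1.0, 1.25, 1.5])
    print(ttft(trace), e2e_latency(trace), tpot(trace))
```

The tool's own reports may aggregate these per-request values differently (for example by mean or percentile), so treat this only as a reading aid for the metric names.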

Installation

Quick Start: Install with pip install genai-bench. For other installation options, see the Installation Guide.
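
To confirm that the install succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of GenAI Bench.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "genai-bench") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "genai-bench is not installed")
```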

How to use

Quick Start

  1. Run a benchmark against your model:

    # Text generation (chat completions)
    genai-bench benchmark --api-backend "your-backend" \
      --api-base "http://localhost:8080" \
      --api-key "your-api-key" \
      --api-model-name "your-model" \
      --task text-to-text \
      --max-time-per-run 5 \
      --max-requests-per-run 100
    
    # Image generation (OpenAI-compatible /v1/images/generations)
    genai-bench benchmark --api-backend openai \
      --api-base "http://localhost:8080" \
      --api-key "your-api-key" \
      --api-model-name "your-model" \
      --model-tokenizer "gpt2" \
      --task text-to-image \
      --traffic-scenario "I(1024,1024)" \
      --max-time-per-run 60 \
      --max-requests-per-run 10 \
      --dataset-path /path/to/image_prompts.txt
    
  2. Generate Excel reports from your results:

    genai-bench excel --experiment-folder ./experiments/your_experiment \
      --excel-name results --metric-percentile mean
    
  3. Create visualizations:

    genai-bench plot --experiments-folder ./experiments \
      --group-key traffic_scenario --preset 2x4_default
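
The three steps above can also be chained from a script. The sketch below builds the same command lines with `subprocess`, using only the flags shown in this section; the helper names are hypothetical, and all values are the placeholders from the commands above. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from typing import List


def benchmark_cmd(api_backend: str, api_base: str, api_key: str, model: str) -> List[str]:
    """Text-to-text benchmark command, mirroring step 1 above."""
    return [
        "genai-bench", "benchmark",
        "--api-backend", api_backend,
        "--api-base", api_base,
        "--api-key", api_key,
        "--api-model-name", model,
        "--task", "text-to-text",
        "--max-time-per-run", "5",
        "--max-requests-per-run", "100",
    ]


def excel_cmd(experiment_folder: str, excel_name: str = "results") -> List[str]:
    """Excel report command, mirroring step 2 above."""
    return [
        "genai-bench", "excel",
        "--experiment-folder", experiment_folder,
        "--excel-name", excel_name,
        "--metric-percentile", "mean",
    ]


def plot_cmd(experiments_folder: str) -> List[str]:
    """Plot command, mirroring step 3 above."""
    return [
        "genai-bench", "plot",
        "--experiments-folder", experiments_folder,
        "--group-key", "traffic_scenario",
        "--preset", "2x4_default",
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure


if __name__ == "__main__":
    run_all([
        benchmark_cmd("your-backend", "http://localhost:8080", "your-api-key", "your-model"),
        excel_cmd("./experiments/your_experiment"),
        plot_cmd("./experiments"),
    ])
```

Passing argument lists rather than a single shell string avoids quoting problems with values such as API keys.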
    

Next Steps

If you're new to GenAI Bench, check out the Getting Started page.

For detailed instructions, advanced configuration options, and comprehensive examples, check out the User Guide.

Development

If you are interested in contributing to GenAI Bench, see the Development Guide.

Download files

Source Distribution

genai_bench-0.0.4.tar.gz (120.4 kB)

Uploaded Source

Built Distribution

genai_bench-0.0.4-py3-none-any.whl (174.9 kB)

Uploaded Python 3

File details

Details for the file genai_bench-0.0.4.tar.gz.

File metadata

  • Download URL: genai_bench-0.0.4.tar.gz
  • Upload date:
  • Size: 120.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for genai_bench-0.0.4.tar.gz:

  • SHA256: e83b1872e57a73204618758712527aab60a3d5fce7362df5caf33ebe601b7200
  • MD5: 60785d20c81c33a201cdbc47c150d00c
  • BLAKE2b-256: 254072bfe2d465ac7df3f9e405c9b9e93087c74c001a4edb3d71333fb260ec04

File details

Details for the file genai_bench-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: genai_bench-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 174.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for genai_bench-0.0.4-py3-none-any.whl:

  • SHA256: bd99c43707cc3d1fbf5ac4e957b584a1250ff95da4559b56017a28a7c5102c92
  • MD5: d4764c58c7f3da72da1d4af46bfe0b41
  • BLAKE2b-256: baffbb193f3449d89c1caf7ad31ef3651e967257d56fb7506020833ad1e1812e
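
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib`; the function names are illustrative, and it assumes the file keeps its original name on disk.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "genai_bench-0.0.4.tar.gz":
        "e83b1872e57a73204618758712527aab60a3d5fce7362df5caf33ebe601b7200",
    "genai_bench-0.0.4-py3-none-any.whl":
        "bd99c43707cc3d1fbf5ac4e957b584a1250ff95da4559b56017a28a7c5102c92",
}


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str) -> bool:
    """Return True if the file's SHA256 matches the digest listed for its name."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    if expected is None:
        raise KeyError(f"no known digest for {Path(path).name}")
    return sha256_of(path) == expected
```

For example, `verify("genai_bench-0.0.4.tar.gz")` should return `True` for an intact download of the source distribution.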
