
LLM Benchmark

Project description

llm-benchmark (ollama-benchmark)

LLM Benchmark for Throughput via Ollama (Local LLMs)

Measure how fast your local LLMs really are—with a simple, cross-platform CLI tool that tells you the tokens-per-second truth.
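Under the hood, a tokens-per-second figure for a local model can be derived from Ollama's REST API, which reports eval_count (generated tokens) and eval_duration (nanoseconds) for each request. The snippet below is a minimal illustrative sketch of that calculation, not llm-benchmark's actual implementation; the model name and prompt are placeholders, and it assumes Ollama is listening on the default localhost:11434.

# Minimal sketch: compute generation throughput from Ollama's /api/generate response.
import json
import urllib.request

payload = json.dumps({
    "model": "gemma:2b",               # placeholder: any locally pulled model
    "prompt": "Why is the sky blue?",  # placeholder prompt
    "stream": False,
}).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# eval_count = generated tokens; eval_duration is in nanoseconds
print(f"{result['eval_count'] / result['eval_duration'] * 1e9:.2f} tokens/s")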

Installation prerequisites

Working Ollama installation.
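Before running the benchmark, you can confirm that the Ollama server is actually reachable. A quick sketch, assuming the default endpoint at localhost:11434:

# Preflight check: verify the local Ollama server responds and count installed models.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    models = json.load(resp).get("models", [])
print(f"Ollama is up; {len(models)} model(s) installed locally.")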

To create a virtual environment via python3 -m venv

python3 -m venv .venv
## On Linux and macOS
source .venv/bin/activate
## On Windows PowerShell or Cmd
.\.venv\Scripts\activate

To create a virtual environment via uv (recommended for Python 3.13)

uv venv .venv --python 3.13
## On Linux and macOS
source .venv/bin/activate
## On Windows PowerShell or Cmd
.\.venv\Scripts\activate

Installation Steps

Depending on your Python setup, use one of the following:

pip install llm-benchmark

or

pipx install llm-benchmark

or, with uv,

uv pip install llm-benchmark

Direct usage for general users

llm_benchmark run

Installation and Usage in Video format

llm-benchmark

It's tested on Python 3.10 and above.

A working ollama installation with the required models installed is assumed. As a rough guide:

A 7B model can be run on machines with 8GB of RAM.

A 13B model can be run on machines with 16GB of RAM.
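These RAM guidelines follow from simple arithmetic: the weights of a quantized model occupy roughly parameters × bytes-per-parameter, plus headroom for the KV cache and the operating system. A back-of-envelope sketch (the 4-bit quantization width is an assumption, not a measured value):

# Rough weight-memory estimate for 4-bit quantized models (~0.5 bytes/parameter).
def weight_gb(params_billion, bytes_per_param=0.5):
    return params_billion * 1e9 * bytes_per_param / 1024**3

for size in (7, 13):
    print(f"{size}B model: ~{weight_gb(size):.1f} GB of weights")
# 7B -> ~3.3 GB, fits in 8GB RAM; 13B -> ~6.1 GB, comfortable in 16GB RAM.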

Usage explanation

On Windows, Linux, and macOS, the tool detects the amount of system RAM and first downloads the required LLM models.

When RAM size is greater than or equal to 4GB but less than 7GB, it checks whether the following models exist and implicitly pulls any that are missing:

ollama pull deepseek-r1:1.5b
ollama pull gemma:2b
ollama pull phi:2.7b
ollama pull phi3:3.8b

When RAM size is greater than 7GB but less than 15GB, it checks whether the following models exist and implicitly pulls any that are missing:

ollama pull phi3:3.8b
ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull llama3.1:8b
ollama pull deepseek-r1:8b
ollama pull llava:7b

When RAM size is greater than 15GB but less than 31GB, it checks whether the following models exist and implicitly pulls any that are missing:

ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull phi4:14b
ollama pull deepseek-r1:8b
ollama pull deepseek-r1:14b
ollama pull llava:7b
ollama pull llava:13b

When RAM size is greater than 31GB, it checks whether the following models exist and implicitly pulls any that are missing:

ollama pull phi4:14b
ollama pull deepseek-r1:14b
ollama pull gpt-oss:20b
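The tiering above amounts to a lookup from total RAM to a model list. The sketch below illustrates that logic; it is not llm-benchmark's actual source, it assumes the third-party psutil package for reading RAM size, and it shells out to the ollama CLI for pulls.

# Illustrative RAM-tiered model selection (not the tool's real implementation).
import subprocess
import psutil  # third-party: pip install psutil

TIERS = [  # (min_gb, max_gb, models), mirroring the lists above
    (4, 7, ["deepseek-r1:1.5b", "gemma:2b", "phi:2.7b", "phi3:3.8b"]),
    (7, 15, ["phi3:3.8b", "gemma2:9b", "mistral:7b", "llama3.1:8b",
             "deepseek-r1:8b", "llava:7b"]),
    (15, 31, ["gemma2:9b", "mistral:7b", "phi4:14b", "deepseek-r1:8b",
              "deepseek-r1:14b", "llava:7b", "llava:13b"]),
    (31, float("inf"), ["phi4:14b", "deepseek-r1:14b", "gpt-oss:20b"]),
]

ram_gb = psutil.virtual_memory().total / 1024**3
for lo, hi, models in TIERS:
    if lo <= ram_gb < hi:
        for model in models:
            subprocess.run(["ollama", "pull", model], check=True)
        break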

Python Poetry manual installation (advanced)

https://python-poetry.org/docs/#installing-manually

For developers who want to develop new features, on Windows PowerShell or on Ubuntu Linux or macOS:

python3 -m venv .venv
. ./.venv/bin/activate
pip install -U pip setuptools
pip install poetry

Usage in Python virtual environment

poetry shell
poetry install
llm_benchmark hello jason

Example #1: Send systeminfo and benchmark results to a remote server (default behavior)

llm_benchmark run

Example #2: Do not send systeminfo and benchmark results to a remote server

llm_benchmark run --no-sendinfo

Example #3: Run the benchmark with an explicitly given path to the ollama executable (when you have built your own developer version of ollama)

llm_benchmark run --ollamabin=~/code/ollama/ollama

Example #4: Run custom benchmark models

  1. Create a custom benchmark file in the following YAML format, replacing the entries with your own benchmark models; remember to use double quotes around model names:
file_name: "custombenchmarkmodels.yml"
version: 2.0.custom
models:
  - model: "deepseek-r1:1.5b"
  - model: "qwen:0.5b"
  2. Run with the flag pointing to the path of custombenchmarkmodels.yml:
llm_benchmark run --custombenchmark=path/to/custombenchmarkmodels.yml
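Since the custom benchmark file is plain YAML, you can sanity-check it before the run. A small sketch using the third-party PyYAML package (an assumption, not a dependency this page documents):

# Sanity-check a custom benchmark file before passing it to llm_benchmark.
import yaml  # third-party: pip install pyyaml

with open("custombenchmarkmodels.yml") as f:
    spec = yaml.safe_load(f)

print("version:", spec["version"])
for entry in spec["models"]:
    print("will benchmark:", entry["model"])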

Reference

Ollama



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_benchmark-0.5.2.tar.gz (2.1 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llm_benchmark-0.5.2-py3-none-any.whl (2.1 MB)

Uploaded Python 3

File details

Details for the file llm_benchmark-0.5.2.tar.gz.

File metadata

  • Download URL: llm_benchmark-0.5.2.tar.gz
  • Upload date:
  • Size: 2.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_benchmark-0.5.2.tar.gz
Algorithm Hash digest
SHA256 2455e359069ce4f98d2c2a6d9764d2c9bb0d926925d4ef62e24f9ab490acd325
MD5 274d257cdb470d218b66c721f227d318
BLAKE2b-256 5eedb463c3f2bdf7b01108e08c5acd738f1b97c4395d7ea23bbd9d02b18fef8a

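If you download the sdist manually, you can check it against the published SHA256 digest above. A minimal sketch:

# Verify a downloaded file against the published SHA256 digest.
import hashlib

EXPECTED = "2455e359069ce4f98d2c2a6d9764d2c9bb0d926925d4ef62e24f9ab490acd325"
with open("llm_benchmark-0.5.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == EXPECTED else "MISMATCH")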

Provenance

The following attestation bundles were made for llm_benchmark-0.5.2.tar.gz:

Publisher: python-publish.yml on aidatatools/ollama-benchmark

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llm_benchmark-0.5.2-py3-none-any.whl.

File metadata

  • Download URL: llm_benchmark-0.5.2-py3-none-any.whl
  • Upload date:
  • Size: 2.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llm_benchmark-0.5.2-py3-none-any.whl
Algorithm Hash digest
SHA256 5183071df2faaf5aa09061716c92830e05c76db5de377ab01f63007860573085
MD5 90a827f86292b3bf00abde4ae148f5fc
BLAKE2b-256 a26422c93afbd9fde2f78574dc9efe247c20f64c8af1c424bd76c8574147bf3f


Provenance

The following attestation bundles were made for llm_benchmark-0.5.2-py3-none-any.whl:

Publisher: python-publish.yml on aidatatools/ollama-benchmark

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
