
A benchmarking tool for AI models and hardware.


robobench


🚧 EARLY DEVELOPMENT WARNING 🚧

This tool is currently about as stable as a house of cards in a wind tunnel.
Very early alpha. Bugs aren't just expected - they've signed a lease.

Status: Proceed with optimism ☕

A benchmarking tool for local LLMs. Currently keeping an eye on Cortex.cpp, with plans to judge other frameworks equally in the future.

What is this?

robobench measures performance metrics, resource utilization, and stability characteristics of your LLM deployments. Rather comprehensive, really.

Features

  • Model initialization metrics
  • Runtime performance
  • Resource utilization
  • Advanced processing scenarios
  • Workload-specific benchmarks
  • System integration metrics
  • Stability analysis

Installation

Using uvx (runs the tool without a permanent install):

uvx robobench

Using pip:

pip install robobench

Usage

Basic Benchmarking

# Standard benchmark
robobench "llama3.2:3b-gguf-q2-k"

# With detailed metrics
robobench "llama3.2:3b-gguf-q2-k" --verbose

Specific Benchmarks

# Initialization only
robobench "llama3.2:3b-gguf-q2-k" --type init

# Runtime metrics
robobench "llama3.2:3b-gguf-q2-k" --type runtime

# Long-running stability test
robobench "llama3.2:3b-gguf-q2-k" --type stability --stability-duration 24

Advanced Usage

# Custom benchmark prompts
robobench "llama3.2:3b-gguf-q2-k" --type workload --prompts my_prompts.json

# Multi-model benchmarking
robobench "llama3.2:3b-gguf-q2-k" --type advanced \
    --secondary-models "tinyllama:1b-gguf-q4" "phi2:3b-gguf-q4"

# Export results
robobench "llama3.2:3b-gguf-q2-k" --json results.json
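The `--prompts` flag takes a JSON file of custom prompts. The exact schema robobench expects is not documented here, so the following is a sketch only, assuming a simple object holding a list of prompt strings:

```python
import json

# Hypothetical prompts file for --prompts; the schema robobench actually
# expects may differ. Treat this as a placeholder shape.
prompts = {
    "prompts": [
        "Summarize the plot of Hamlet in two sentences.",
        "Write a Python function that reverses a string.",
    ]
}

with open("my_prompts.json", "w") as f:
    json.dump(prompts, f, indent=2)
```

Validating the file up front (for example with `python -m json.tool my_prompts.json`) catches syntax errors before a long benchmark run.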

Status

Under active development. Support for additional frameworks is planned.

Roadmap

  • Framework-agnostic benchmarking
  • Additional performance metrics
  • Enhanced visualizations
  • Extended stability testing
  • Local server and UI
  • CI/CD management

Development

Setup

  1. Clone the repository:

git clone https://github.com/jan.ai/robobench.git
cd robobench

  2. Create and activate a virtual environment:

# Using uv (recommended)
uv venv .venv --python 3.12
source .venv/bin/activate

  3. Install development dependencies:

# Install the project in editable mode with test dependencies
uv pip install -e ".[test]"

# Install development tools
uv add --dev ruff pytest pytest-cov pytest-asyncio hypothesis

Code Quality

Linting and Formatting

Run Ruff linter:

# Check code
ruff check .

# Auto-fix issues
ruff check --fix .

# Format code
ruff format .

# Check formatting without changes
ruff format --check .

Testing

Run tests:

# All tests
pytest

# With coverage
pytest --cov=robobench --cov-report=html

# Specific test file
pytest src/tests/test_utils.py

# With hypothesis verbose output
pytest -v src/tests/test_utils.py
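For contributors adding tests, a minimal pytest-style sketch may help; note that the `tokens_per_second` helper below is invented for illustration and is not part of robobench:

```python
# Illustrative test module; robobench's real utilities may differ.

def tokens_per_second(tokens: int, seconds: float) -> float:
    """Hypothetical throughput helper, for illustration only."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    return tokens / seconds

def test_throughput_basic():
    assert tokens_per_second(100, 2.0) == 50.0

def test_throughput_rejects_zero_duration():
    try:
        tokens_per_second(10, 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for zero duration")
```

pytest discovers `test_*` functions automatically, so a file like this dropped into `src/tests/` would be picked up by the commands above.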

Pre-commit Checks

Before submitting a PR:

# Format code
ruff format .

# Run linter
ruff check .

# Run tests with coverage
pytest --cov=robobench --cov-report=term-missing

# Show coverage report in browser (optional)
python -m http.server -d htmlcov

Code Style

The project uses:

  • Type hints
  • Docstrings for public functions and classes

Project Structure

src/
├── robobench/
│   ├── core/
│   │   ├── initialization.py  # Model initialization metrics
│   │   ├── runtime.py         # Runtime performance metrics
│   │   ├── resources.py       # Resource utilization metrics
│   │   ├── integration.py     # System integration metrics
│   │   ├── workloads.py       # Workload-specific metrics
│   │   ├── stability.py       # Stability metrics
│   │   └── utils.py           # Shared utilities
│   ├── cli.py                 # Command-line interface
│   └── __init__.py
└── tests/
    ├── conftest.py            # Shared test fixtures
    ├── test_initialization.py
    ├── test_runtime.py
    ├── test_resources.py
    ├── test_integration.py
    └── test_utils.py

PR Checklist

Before submitting a PR:

  1. Run all tests
  2. Check test coverage
  3. Verify type hints with mypy (coming soon)
  4. Ensure docstrings are up to date

Contributing

Issues and pull requests welcome. Do have a look at the existing ones first, though.



Download files

Download the file for your platform.

Source Distribution

robobench-0.0.2.tar.gz (223.6 kB)


Built Distribution


robobench-0.0.2-py3-none-any.whl (30.6 kB)


File details

Details for the file robobench-0.0.2.tar.gz.

File metadata

  • Download URL: robobench-0.0.2.tar.gz
  • Upload date:
  • Size: 223.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.28

File hashes

Hashes for robobench-0.0.2.tar.gz:

  • SHA256: fc2f472dddf2f7ccfa75cce64fd12d7576d192ced9fde9b5f0aafa297695a80e
  • MD5: dfba4875deeffc50ebf7c51db075cf55
  • BLAKE2b-256: bd13a12a397439222c13d1837200f14e6d1ddec2274173d006426dadbc4a66c6


File details

Details for the file robobench-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: robobench-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 30.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.4.28

File hashes

Hashes for robobench-0.0.2-py3-none-any.whl:

  • SHA256: 5adb4cd5ecde38ad4c42fb5b5c1de2cef00384dc750148122e83a44db9931c1a
  • MD5: a5902080f21189587d52ccb74f3350ee
  • BLAKE2b-256: 8666dbe4d68577bea0d66cf19707872787007f19564c1d19a7d22bbb05ef182b

