A benchmarking tool for AI models and hardware.
robobench
EARLY DEVELOPMENT WARNING
This tool is currently about as stable as a house of cards in a wind tunnel. Very early alpha. Bugs aren't just expected - they've signed a lease. Status: Proceed with optimism
A benchmarking tool for local LLMs. Currently keeping an eye on Cortex.cpp, with plans to judge other frameworks equally in the future.
What is this?
robobench measures performance metrics, resource utilization, and stability characteristics of your LLM deployments. Rather comprehensive, really.
Features
- Model initialization metrics
- Runtime performance
- Resource utilization
- Advanced processing scenarios
- Workload-specific benchmarks
- System integration metrics
- Stability analysis
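To give a feel for what a runtime performance metric involves, here is a minimal sketch of wall-clock latency measurement. It is not robobench's implementation: `time_calls` and `fake_generate` are hypothetical names introduced for illustration, and the stand-in function only sleeps instead of calling a model.

```python
import statistics
import time
from typing import Callable


def time_calls(fn: Callable[[], object], repeats: int = 5) -> dict[str, float]:
    """Time repeated calls to fn and summarize wall-clock latency in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "min_s": min(samples),
        "mean_s": statistics.fmean(samples),
        "max_s": max(samples),
    }


def fake_generate() -> str:
    # Hypothetical stand-in for a model call; replace with a real request.
    time.sleep(0.01)
    return "ok"


if __name__ == "__main__":
    print(time_calls(fake_generate, repeats=3))
```

Reporting the minimum and maximum alongside the mean makes outliers visible, such as a slow first call while a model is still loading.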
Installation
Using uv:
uv tool install robobench
Using pip:
pip install robobench
Usage
Basic Benchmarking
# Standard benchmark
robobench "llama3.2:3b-gguf-q2-k"
# With detailed metrics
robobench "llama3.2:3b-gguf-q2-k" --verbose
Specific Benchmarks
# Initialization only
robobench "llama3.2:3b-gguf-q2-k" --type init
# Runtime metrics
robobench "llama3.2:3b-gguf-q2-k" --type runtime
# Long-running stability test
robobench "llama3.2:3b-gguf-q2-k" --type stability --stability-duration 24
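One way to think about a long-running stability test is as a check for drift: does latency at the end of the run differ from latency at the start? The sketch below illustrates that idea only. `latency_drift` is a hypothetical helper, not part of robobench, and the sample values are made up for the example.

```python
import statistics
from typing import Sequence


def latency_drift(samples: Sequence[float], window: int = 3) -> float:
    """Relative change between the mean of the first and last windows."""
    if window < 1 or len(samples) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    first = statistics.fmean(samples[:window])
    last = statistics.fmean(samples[-window:])
    if first == 0:
        raise ValueError("first window mean is zero")
    return (last - first) / first


if __name__ == "__main__":
    # Illustrative latencies in seconds, ordered by time.
    latencies = [1.0, 1.0, 1.0, 1.5, 2.0, 2.0, 2.0]
    print(f"drift: {latency_drift(latencies):+.0%}")
```

A positive value means responses slowed down over the run; a value near zero suggests steady behavior.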
Advanced Usage
# Custom benchmark prompts
robobench "llama3.2:3b-gguf-q2-k" --type workload --prompts my_prompts.json
# Multi-model benchmarking
robobench "llama3.2:3b-gguf-q2-k" --type advanced \
--secondary-models "tinyllama:1b-gguf-q4" "phi2:3b-gguf-q4"
# Export results
robobench "llama3.2:3b-gguf-q2-k" --json results.json
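The layout of the exported JSON is not documented here, so a reasonable first step is to inspect it without assuming a schema. The following sketch walks any JSON document and lists each path with its value type. `summarize_json` is a hypothetical helper, and `results.json` is simply the file name from the export example.

```python
import json
from pathlib import Path
from typing import Any


def summarize_json(value: Any, prefix: str = "") -> list[str]:
    """Flatten nested JSON into 'path: type' lines without assuming a schema."""
    lines: list[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            lines.extend(summarize_json(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        # Lists are reported by length rather than expanded.
        lines.append(f"{prefix.rstrip('.')}: list[{len(value)}]")
    else:
        lines.append(f"{prefix.rstrip('.')}: {type(value).__name__}")
    return lines


if __name__ == "__main__":
    path = Path("results.json")  # file name taken from the export example
    if path.exists():
        print("\n".join(summarize_json(json.loads(path.read_text()))))
```

Once the field names are known, the same file can be loaded with `json.loads` and compared across runs.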
Status
Under active development. Support for additional frameworks is planned.
Roadmap
- Framework-agnostic benchmarking
- Additional performance metrics
- Enhanced visualizations
- Extended stability testing
- Local server and UI
- CI/CD management
Development
Setup
- Clone the repository:
git clone https://github.com/jan.ai/robobench.git
cd robobench
- Create and activate a virtual environment:
# Using uv (recommended)
uv venv .venv --python 3.12
source .venv/bin/activate
- Install development dependencies:
# Install project in editable mode with test dependencies
uv pip install -e ".[test]"
# Install development tools
uv add --dev ruff pytest pytest-cov pytest-asyncio hypothesis
Code Quality
Linting and Formatting
Run Ruff linter:
# Check code
ruff check .
# Auto-fix issues
ruff check --fix .
# Format code
ruff format .
# Check formatting without changes
ruff format --check .
Testing
Run tests:
# All tests
pytest
# With coverage
pytest --cov=robobench --cov-report=html
# Specific test file
pytest src/tests/test_utils.py
# With verbose output (one line per test, including Hypothesis-based tests)
pytest -v src/tests/test_utils.py
Pre-commit Checks
Before submitting a PR:
# Format code
ruff format .
# Run linter
ruff check .
# Run tests with coverage
pytest --cov=robobench --cov-report=term-missing
# Serve the HTML coverage report locally (optional; requires a prior run with --cov-report=html)
python -m http.server -d htmlcov
Code Style
The project uses:
- Type hints
- Some docstrings for public functions and classes
Project Structure
src/
├── robobench/
│   ├── core/
│   │   ├── initialization.py  # Model initialization metrics
│   │   ├── runtime.py         # Runtime performance metrics
│   │   ├── resources.py       # Resource utilization metrics
│   │   ├── integration.py     # System integration metrics
│   │   ├── workloads.py       # Workload-specific metrics
│   │   ├── stability.py       # Stability metrics
│   │   └── utils.py           # Shared utilities
│   ├── cli.py                 # Command-line interface
│   └── __init__.py
└── tests/
    ├── conftest.py            # Shared test fixtures
    ├── test_initialization.py
    ├── test_runtime.py
    ├── test_resources.py
    ├── test_integration.py
    └── test_utils.py
Pre-commit Checks
Before submitting a PR:
- Run all tests
- Check test coverage
- Verify type hints with mypy (coming soon)
- Ensure docstrings are up to date
Contributing
Issues and pull requests welcome. Do have a look at the existing ones first, though.
File details
Details for the file robobench-0.0.2.tar.gz.
File metadata
- Download URL: robobench-0.0.2.tar.gz
- Upload date:
- Size: 223.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.28
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fc2f472dddf2f7ccfa75cce64fd12d7576d192ced9fde9b5f0aafa297695a80e |
| MD5 | dfba4875deeffc50ebf7c51db075cf55 |
| BLAKE2b-256 | bd13a12a397439222c13d1837200f14e6d1ddec2274173d006426dadbc4a66c6 |
File details
Details for the file robobench-0.0.2-py3-none-any.whl.
File metadata
- Download URL: robobench-0.0.2-py3-none-any.whl
- Upload date:
- Size: 30.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.28
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5adb4cd5ecde38ad4c42fb5b5c1de2cef00384dc750148122e83a44db9931c1a |
| MD5 | a5902080f21189587d52ccb74f3350ee |
| BLAKE2b-256 | 8666dbe4d68577bea0d66cf19707872787007f19564c1d19a7d22bbb05ef182b |