RUNE — Reliability Use-case Numeric Evaluator
rune
A collection of benchmarks, evaluation scripts, and reproducible test suites for comparing AI models, LLMs, and inference frameworks.
Setup & Provisioning
rune includes RUNE — Reliability Use-case Numeric Evaluator.
RUNE orchestrates benchmarkable DevOps/SRE operations, with optional Vast.ai provisioning for Ollama and agentic investigation via HolmesGPT.
Repository Layout
rune/
├── rune/
│   ├── __init__.py        # Thin Typer CLI (commands, prompts, Rich output)
│   ├── __main__.py        # Package entrypoint (python -m rune)
│   └── api.py             # API server entrypoint (python -m rune.api)
├── provision.py           # CLI shim forwarding to rune package
├── rune_bench/
│   ├── __init__.py
│   ├── workflows.py       # Reusable orchestration workflows (no Typer/Rich)
│   ├── vastai/
│   │   ├── offer.py       # OfferFinder
│   │   ├── template.py    # TemplateLoader
│   │   ├── instance.py    # InstanceManager + ConnectionDetails
│   │   └── __init__.py
│   ├── common/
│   │   ├── models.py      # ModelSelector + MODELS
│   │   └── __init__.py
│   ├── agents/
│   │   ├── holmes.py      # HolmesRunner
│   │   └── __init__.py
│   └── ollama/            # NEW: Modular Ollama integration
│       ├── client.py      # OllamaClient (HTTP transport)
│       ├── models.py      # OllamaModelManager (business logic)
│       └── __init__.py
├── experiments/
│   └── provision.py
├── requirements.txt
└── Dockerfile
Platform documentation now lives in the dedicated lpasquali/rune-docs repository.
Helm chart packaging and deployment assets now live in the dedicated lpasquali/rune-charts repository.
Kubernetes operator orchestration now lives in the dedicated lpasquali/rune-operator repository.
RUNE Commands
python -m rune provides five commands:
- run-ollama-instance: with --vastai enabled, runs the Vast.ai provisioning workflow; without --vastai, pass --ollama-url to use existing server mode.
- run-agentic-agent: run HolmesGPT-only analysis against Kubernetes.
- run-benchmark: phase 1 selects an Ollama source (Vast.ai provisioning or existing server), then phase 2 runs HolmesGPT analysis.
- vastai-list-models: print the configured model catalog used for Vast.ai auto-selection.
- ollama-list-models: list the models currently exposed by an existing Ollama server URL.
CLI Options Summary
Backend selection
- --backend local|http (or RUNE_BACKEND env var)
- --api-base-url http://host:port (or RUNE_API_BASE_URL env var)
- --api-token ... (or RUNE_API_TOKEN env var)
- --api-tenant ... (or RUNE_API_TENANT env var)
- --idempotency-key ... on async HTTP job commands
Default mode is local, preserving the current in-process CLI behavior.
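As an illustration of how the flag and environment-variable pairs above could be combined, here is a minimal sketch. It assumes that an explicit flag takes precedence over its RUNE_* variable, which this README does not state; BackendConfig and resolve_backend are hypothetical names and not part of the rune package.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    """Hypothetical container for the backend-selection options."""
    backend: str
    api_base_url: Optional[str]
    api_token: Optional[str]
    api_tenant: Optional[str]


def resolve_backend(
    backend: Optional[str] = None,
    api_base_url: Optional[str] = None,
    api_token: Optional[str] = None,
    api_tenant: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> BackendConfig:
    """Resolve options, assuming explicit values win over RUNE_* variables."""
    env = os.environ if env is None else env
    # Fall back to "local", the documented default mode.
    chosen = backend or env.get("RUNE_BACKEND") or "local"
    if chosen not in ("local", "http"):
        raise ValueError(f"unsupported backend: {chosen!r}")
    return BackendConfig(
        backend=chosen,
        api_base_url=api_base_url or env.get("RUNE_API_BASE_URL"),
        api_token=api_token or env.get("RUNE_API_TOKEN"),
        api_tenant=api_tenant or env.get("RUNE_API_TENANT"),
    )
```

Passing env explicitly keeps the function easy to exercise in tests without touching the real process environment.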
In http mode, the following commands can query/execute against a remote RUNE API:
- vastai-list-models
- ollama-list-models
- run-ollama-instance (job submit/poll)
- run-agentic-agent (job submit/poll)
- run-benchmark (job submit/poll)
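The submit/poll pattern mentioned for the job commands can be sketched generically as follows. This is not the RUNE client: the status field, the terminal state names, and the fetch_status callable are all assumptions made for illustration, since the API's response shape is not documented here.

```python
import time
from typing import Callable, Mapping

# Hypothetical terminal states; the real API may use different names.
TERMINAL_STATES = {"succeeded", "failed"}


def poll_job(
    fetch_status: Callable[[], Mapping[str, str]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping[str, str]:
    """Call fetch_status until the job is terminal or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval_s)
```

In a real client, fetch_status would wrap an HTTP GET for a previously submitted job; injecting sleep and clock keeps the loop testable without real delays.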
API server mode
Run the in-repo server with persistent SQLite-backed jobs:
export RUNE_API_TOKENS='default:dev-token'
export RUNE_API_DB_PATH=.rune-api/jobs.db
python -m rune.api
Development-only unauthenticated mode is also available:
export RUNE_API_AUTH_DISABLED=1
python -m rune.api
Server-side controls:
- persistent async jobs in SQLite
- tenant-scoped job lookup via X-Tenant-ID
- token auth via Authorization: Bearer ... or X-API-Key
- idempotent POST job creation via Idempotency-Key
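To show how the headers listed above fit together on a job-creation request, here is a minimal sketch using only the standard library. The header names come from this README; the /jobs path, the port, and the payload shape are placeholders, because the actual endpoints are not described here.

```python
import json
import urllib.request
import uuid
from typing import Dict, Optional


def job_headers(token: str, tenant: str, idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Build the auth, tenant, and idempotency headers for a POST."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
        "X-Tenant-ID": tenant,
        # Reusing the same key on a retry lets the server deduplicate the job.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }


def build_job_request(base_url: str, path: str, payload: dict,
                      token: str, tenant: str,
                      idempotency_key: Optional[str] = None) -> urllib.request.Request:
    """Prepare (but do not send) a POST request; `path` is a placeholder."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=job_headers(token, tenant, idempotency_key),
        method="POST",
    )


# Example with placeholder values; nothing is sent over the network.
request = build_job_request(
    "http://localhost:8080", "/jobs", {"question": "What is unhealthy?"},
    token="dev-token", tenant="default", idempotency_key="retry-safe-key-1",
)
```

The request object would then be passed to urllib.request.urlopen; generating the idempotency key once and reusing it across retries is what makes a repeated POST safe.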
Shared agent options
- --question, -q
- --model, -m (used by run-agentic-agent, and by run-benchmark when --vastai is disabled)
- --ollama-warmup, --no-ollama-warmup
- --ollama-warmup-timeout
- --kubeconfig
Vast.ai options (enabled only when --vastai is set)
- --vastai
- --vastai-template
- --vastai-min-dph
- --vastai-max-dph
- --vastai-reliability
Use vastai-list-models to inspect the configured Vast.ai model shortlist.
Existing server mode
- --ollama-url (required when --vastai is not enabled)
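The rule that --ollama-url is required unless --vastai is set can be expressed as a small validation step. The sketch below is an assumption about how such a check might look, not the CLI's actual code; select_ollama_source is a hypothetical name, and treating --vastai as taking precedence when both are given is also an assumption.

```python
from typing import Optional


def select_ollama_source(vastai: bool, ollama_url: Optional[str]) -> str:
    """Return "vastai" or "existing", rejecting the unusable combination."""
    if vastai:
        # Assumption: provisioning wins when both options are supplied.
        return "vastai"
    if not ollama_url:
        raise ValueError("--ollama-url is required when --vastai is not enabled")
    return "existing"
```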
Use ollama-list-models --ollama-url ... to inspect the exact model names exposed by your existing server.
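For reference, an Ollama server lists its local models at GET /api/tags, returning a JSON object with a models array. The sketch below queries that endpoint directly with the standard library; whether ollama-list-models uses the same call internally is an assumption, and both function names are hypothetical.

```python
import json
import urllib.request
from typing import List


def parse_model_names(body: bytes) -> List[str]:
    """Extract model names from an /api/tags response body."""
    data = json.loads(body)
    return [model["name"] for model in data.get("models", [])]


def list_models(ollama_url: str, timeout_s: float = 10.0) -> List[str]:
    """Fetch the model names exposed by the server at ollama_url."""
    url = ollama_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return parse_model_names(response.read())
```

Calling list_models("http://localhost:11434") requires a reachable server; parse_model_names can be exercised on its own with a captured response.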
Running RUNE
Option A: Docker
# Build image
docker build -t ai-benchmark-rune .
# Existing server mode (default)
docker run -it --rm \
ai-benchmark-rune run-ollama-instance \
--ollama-url http://host.docker.internal:11434
# Vast.ai mode
docker run -it --rm \
-v ~/.vast_api_key:/root/.vast_api_key \
ai-benchmark-rune run-ollama-instance \
--vastai
# Agent-only mode
docker run -it --rm \
-v ~/.kube:/root/.kube \
ai-benchmark-rune run-agentic-agent \
--question "What is unhealthy?"
# Full benchmark with Vast.ai phase 1
docker run -it --rm \
-v ~/.vast_api_key:/root/.vast_api_key \
-v ~/.kube:/root/.kube \
ai-benchmark-rune run-benchmark \
--vastai \
--question "Why is the cluster degraded?"
Option B: Local
pip install -r requirements.txt
# Existing server mode
python -m rune run-ollama-instance --ollama-url http://localhost:11434
# Vast.ai mode
python -m rune run-ollama-instance --vastai
# Show the configured Vast.ai model shortlist
python -m rune vastai-list-models
# Show models exposed by an existing Ollama server
python -m rune ollama-list-models --ollama-url http://localhost:11434
# Agent-only mode
python -m rune run-agentic-agent --question "What is unhealthy?"
# Full benchmark (existing server phase 1)
python -m rune run-benchmark --ollama-url http://localhost:11434 --model llama3.1:8b
# Full benchmark without pre-loading the Ollama model
python -m rune run-benchmark --ollama-url http://localhost:11434 --model llama3.1:8b --no-ollama-warmup
# Full benchmark (Vast.ai phase 1)
python -m rune run-benchmark --vastai --question "What is unhealthy?"
Testing
Automated tests (safe/offline)
Automated tests are designed to run anywhere without creating cloud resources. They mock Ollama and Vast.ai boundaries.
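The idea of mocking at the cloud boundary can be illustrated with unittest.mock. This is a generic sketch, not taken from the repository's test suite: CloudClient, create_instance, and provision are hypothetical stand-ins for whatever would otherwise create a billable resource.

```python
import unittest
from unittest import mock


class CloudClient:
    """Hypothetical boundary; a real call would create a billable instance."""

    def create_instance(self, offer_id: int) -> dict:
        raise RuntimeError("network access is not allowed in automated tests")


def provision(client: CloudClient, offer_id: int) -> str:
    """Create an instance and return the URL its server would listen on."""
    instance = client.create_instance(offer_id)
    return f"http://{instance['host']}:{instance['port']}"


class ProvisionTest(unittest.TestCase):
    def test_builds_url_from_mocked_instance(self):
        # create_autospec keeps the mock's signature in sync with CloudClient.
        client = mock.create_autospec(CloudClient, instance=True)
        client.create_instance.return_value = {"host": "10.0.0.5", "port": 11434}

        self.assertEqual(provision(client, offer_id=42), "http://10.0.0.5:11434")
        client.create_instance.assert_called_once_with(42)
```

Because the test only ever touches the mock, it runs offline and cannot create cloud resources by accident.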
pip install -r requirements.txt
python -m pytest -q
Coverage is enforced at a minimum of 97% via pytest configuration.
Coverage table columns mean:
- Stmts: executable Python statements in the file
- Miss: statements not executed by tests
- Cover: percentage covered ((Stmts - Miss) / Stmts)
- Missing: uncovered line numbers/ranges (for example, 144-146 means lines 144, 145, 146)
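The column definitions above translate directly into a short calculation. The helper names below are hypothetical, and the parser only handles plain line numbers and ranges; branch-coverage entries are out of scope for this sketch.

```python
from typing import List


def cover_percent(stmts: int, miss: int) -> float:
    """Cover = (Stmts - Miss) / Stmts, expressed as a percentage."""
    if stmts == 0:
        # A file with no statements is treated here as fully covered.
        return 100.0
    return 100.0 * (stmts - miss) / stmts


def expand_missing(missing: str) -> List[int]:
    """Expand a Missing value such as "10, 144-146" into line numbers."""
    lines: List[int] = []
    for part in missing.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            lines.extend(range(int(start), int(end) + 1))
        else:
            lines.append(int(part))
    return lines
```

For example, a file with 200 statements and 6 misses sits exactly at the 97% threshold, and expand_missing("144-146") yields lines 144, 145, and 146.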
For a more graphical report, open the generated HTML output at:
htmlcov/index.html
Manual tests (cost-incurring)
Vast.ai instance creation/destruction paths should be validated manually, because they can incur real costs.
Example manual run:
python -m rune run-benchmark --vastai --question "What is unhealthy?"
Contributing
See CONTRIBUTING.md.
Security
See SECURITY.md. The repository's explicit security and compliance targets are documented in rune-docs.
License
GNU General Public License v3.0. See LICENSE.