
RUNE — Reliability Use-case Numeric Evaluator


rune

A collection of benchmarks, evaluation scripts, and reproducible test suites for comparing AI models, LLMs, and inference frameworks.

Setup & Provisioning

rune includes RUNE — Reliability Use-case Numeric Evaluator.

RUNE orchestrates benchmarkable DevOps/SRE operations, with optional Vast.ai provisioning for Ollama and agentic investigation via HolmesGPT.

Repository Layout

rune/
├── rune/
│   ├── __init__.py          # Thin Typer CLI (commands, prompts, Rich output)
│   ├── __main__.py          # Package entrypoint (python -m rune)
│   └── api.py               # API server entrypoint (python -m rune.api)
├── provision.py             # CLI shim forwarding to rune package
├── rune_bench/
│   ├── __init__.py
│   ├── workflows.py         # Reusable orchestration workflows (no Typer/Rich)
│   ├── vastai/
│   │   ├── offer.py         # OfferFinder
│   │   ├── template.py      # TemplateLoader
│   │   ├── instance.py      # InstanceManager + ConnectionDetails
│   │   └── __init__.py
│   ├── common/
│   │   ├── models.py        # ModelSelector + MODELS
│   │   └── __init__.py
│   ├── agents/
│   │   ├── holmes.py        # HolmesRunner
│   │   └── __init__.py
│   └── ollama/              # NEW: Modular Ollama integration
│       ├── client.py        # OllamaClient (HTTP transport)
│       ├── models.py        # OllamaModelManager (business logic)
│       └── __init__.py
├── experiments/
│   └── provision.py
├── requirements.txt
└── Dockerfile

Related assets live in dedicated repositories:

  • lpasquali/rune-docs: platform documentation
  • lpasquali/rune-charts: Helm chart packaging and deployment assets
  • lpasquali/rune-operator: Kubernetes operator orchestration

RUNE Commands

python -m rune provides five commands:

  • run-ollama-instance: with --vastai, run the Vast.ai provisioning workflow; without --vastai, use the existing server given by --ollama-url.
  • run-agentic-agent: run HolmesGPT-only analysis against Kubernetes.
  • run-benchmark: phase 1 selects an Ollama source (Vast.ai provisioning or existing server), then phase 2 runs HolmesGPT analysis.
  • vastai-list-models: print the configured model catalog used for Vast.ai auto-selection.
  • ollama-list-models: list the models currently exposed by an existing Ollama server URL.
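
The choice of Ollama source described above can be sketched as a small validation step. This is a minimal illustration, not code from the rune package: the helper name select_ollama_source is hypothetical, and giving --vastai precedence when both options are supplied is an assumption.

```python
from typing import Optional


def select_ollama_source(vastai: bool, ollama_url: Optional[str]) -> str:
    """Return which phase-1 source a command would use.

    Hypothetical helper for illustration; not part of the rune package.
    """
    if vastai:
        # With --vastai, a server is provisioned through Vast.ai.
        # Assumption: --vastai wins if --ollama-url is also given.
        return "vastai"
    if ollama_url:
        # Without --vastai, an existing server URL is used as-is.
        return "existing:" + ollama_url
    # Neither source was given, so there is nothing to run against.
    raise ValueError("--ollama-url is required when --vastai is not enabled")
```

For example, select_ollama_source(False, "http://localhost:11434") selects the existing server, while calling it with neither option raises ValueError.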

CLI Options Summary

Backend selection

  • --backend local|http (or RUNE_BACKEND env var)
  • --api-base-url http://host:port (or RUNE_API_BASE_URL env var)
  • --api-token ... (or RUNE_API_TOKEN env var)
  • --api-tenant ... (or RUNE_API_TENANT env var)
  • --idempotency-key ... on async HTTP job commands

The default backend is local, which preserves the in-process CLI behavior. In http mode, the following commands run against a remote RUNE API:

  • vastai-list-models
  • ollama-list-models
  • run-ollama-instance (job submit/poll)
  • run-agentic-agent (job submit/poll)
  • run-benchmark (job submit/poll)
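
Each backend option above can come from a flag or from an environment variable. The sketch below shows one plausible resolution order, assuming an explicit flag takes precedence over the environment variable and that local is the fallback. The function names resolve_option and resolve_backend are hypothetical and are not taken from the rune package.

```python
import os
from typing import Mapping, Optional


def resolve_option(
    cli_value: Optional[str],
    env_name: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the CLI value, then the environment variable, then the default."""
    if cli_value is not None:
        return cli_value
    return env.get(env_name, default)


def resolve_backend(
    cli_backend: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Resolve --backend / RUNE_BACKEND and validate the result."""
    backend = resolve_option(cli_backend, "RUNE_BACKEND", "local", env)
    if backend not in ("local", "http"):
        raise ValueError(f"unsupported backend: {backend!r}")
    return backend
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.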

API server mode

Run the in-repo server with persistent SQLite-backed jobs:

export RUNE_API_TOKENS='default:dev-token'
export RUNE_API_DB_PATH=.rune-api/jobs.db
python -m rune.api

Development-only unauthenticated mode is also available:

export RUNE_API_AUTH_DISABLED=1
python -m rune.api

Server-side controls:

  • persistent async jobs in SQLite
  • tenant-scoped job lookup via X-Tenant-ID
  • token auth via Authorization: Bearer ... or X-API-Key
  • idempotent POST job creation via Idempotency-Key
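
To make the idempotency and tenant-scoping controls concrete, here is a minimal sketch of how such a job store could behave on top of SQLite. The table layout, column names, and the functions open_store, create_job, and get_job_status are all assumptions made for illustration; the actual schema used by the RUNE API server is not described here.

```python
import sqlite3
import uuid
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a job store. The schema is hypothetical, not RUNE's."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               job_id TEXT PRIMARY KEY,
               tenant TEXT NOT NULL,
               idempotency_key TEXT NOT NULL,
               status TEXT NOT NULL,
               UNIQUE (tenant, idempotency_key)
           )"""
    )
    return conn


def create_job(conn: sqlite3.Connection, tenant: str, idempotency_key: str) -> str:
    """Create a job, or return the existing one for the same tenant and key."""
    row = conn.execute(
        "SELECT job_id FROM jobs WHERE tenant = ? AND idempotency_key = ?",
        (tenant, idempotency_key),
    ).fetchone()
    if row is not None:
        return row[0]  # A repeated POST maps to the original job.
    job_id = uuid.uuid4().hex
    with conn:  # Commit on success, roll back on error.
        conn.execute(
            "INSERT INTO jobs (job_id, tenant, idempotency_key, status) "
            "VALUES (?, ?, ?, 'queued')",
            (job_id, tenant, idempotency_key),
        )
    return job_id


def get_job_status(conn: sqlite3.Connection, tenant: str, job_id: str) -> Optional[str]:
    """Look up a job only within the caller's tenant."""
    row = conn.execute(
        "SELECT status FROM jobs WHERE tenant = ? AND job_id = ?",
        (tenant, job_id),
    ).fetchone()
    return row[0] if row else None
```

In this sketch, repeating a request with the same tenant and key returns the same job ID, and a job is invisible to any tenant other than the one that created it.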

Shared agent options

  • --question, -q
  • --model, -m (used by run-agentic-agent, and by run-benchmark when --vastai is disabled)
  • --ollama-warmup, --no-ollama-warmup
  • --ollama-warmup-timeout
  • --kubeconfig
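
The warm-up options control whether the Ollama model is pre-loaded before analysis. One way to pre-load a model is Ollama's documented behavior of loading a model into memory when /api/generate receives a request with no prompt. The sketch below uses that endpoint; whether RUNE warms up models this way is an assumption, as is treating the timeout as seconds, and the function names are hypothetical.

```python
import json
import urllib.request


def build_warmup_request(ollama_url: str, model: str) -> urllib.request.Request:
    """Build a POST to /api/generate with no prompt, which loads the model."""
    body = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        ollama_url.rstrip("/") + "/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm_up(ollama_url: str, model: str, timeout_s: float) -> bool:
    """Return True if the server answered within the timeout, else False."""
    request = build_warmup_request(ollama_url, model)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return response.status == 200
    except OSError:
        # Covers connection failures, HTTP errors, and timeouts.
        return False
```

A caller could skip warm_up entirely to mirror --no-ollama-warmup.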

Vast.ai options (enabled only when --vastai is set)

  • --vastai
  • --vastai-template
  • --vastai-min-dph
  • --vastai-max-dph
  • --vastai-reliability

Use vastai-list-models to inspect the configured Vast.ai model shortlist.
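
The price and reliability options suggest a filter over candidate offers. The following sketch reads dph as dollars per hour and picks the cheapest offer inside the allowed range. The offer fields dph and reliability, the function name pick_offer, and the cheapest-first choice are assumptions for illustration; this is not the OfferFinder implementation.

```python
from typing import Iterable, Optional


def pick_offer(
    offers: Iterable[dict],
    min_dph: float,
    max_dph: float,
    min_reliability: float,
) -> Optional[dict]:
    """Return the cheapest offer within the price range and reliability floor."""
    eligible = [
        offer
        for offer in offers
        if min_dph <= offer["dph"] <= max_dph
        and offer["reliability"] >= min_reliability
    ]
    # min() with default=None avoids raising when nothing qualifies.
    return min(eligible, key=lambda offer: offer["dph"], default=None)
```

With offers priced at 0.20, 0.45, and 0.90 per hour and a range of 0.30 to 1.00, the 0.45 offer is selected as long as it meets the reliability floor.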

Existing server mode

  • --ollama-url (required when --vastai is not enabled)

Use ollama-list-models --ollama-url ... to inspect the exact model names exposed by your existing server.
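
For reference, an Ollama server reports its local models at the /api/tags endpoint, which returns a JSON object with a models array. The sketch below queries that endpoint directly; it is an independent illustration with hypothetical function names, and it does not claim to mirror how ollama-list-models or OllamaClient is implemented.

```python
import json
import urllib.request
from typing import List


def parse_model_names(payload: dict) -> List[str]:
    """Extract sorted model names from an /api/tags response body."""
    return sorted(model["name"] for model in payload.get("models", []))


def list_models(ollama_url: str, timeout_s: float = 10.0) -> List[str]:
    """Fetch the model names exposed by an existing Ollama server."""
    url = ollama_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return parse_model_names(json.load(response))
```

Names returned this way, such as llama3.1:8b, are the exact strings to pass to --model.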

Running RUNE

Option A: Docker

# Build image
docker build -t ai-benchmark-rune .

# Existing server mode (default)
docker run -it --rm \
  ai-benchmark-rune run-ollama-instance \
  --ollama-url http://host.docker.internal:11434

# Vast.ai mode
docker run -it --rm \
  -v ~/.vast_api_key:/root/.vast_api_key \
  ai-benchmark-rune run-ollama-instance \
  --vastai

# Agent-only mode
docker run -it --rm \
  -v ~/.kube:/root/.kube \
  ai-benchmark-rune run-agentic-agent \
  --question "What is unhealthy?"

# Full benchmark with Vast.ai phase 1
docker run -it --rm \
  -v ~/.vast_api_key:/root/.vast_api_key \
  -v ~/.kube:/root/.kube \
  ai-benchmark-rune run-benchmark \
  --vastai \
  --question "Why is the cluster degraded?"

Option B: Local

pip install -r requirements.txt

# Existing server mode
python -m rune run-ollama-instance --ollama-url http://localhost:11434

# Vast.ai mode
python -m rune run-ollama-instance --vastai

# Show the configured Vast.ai model shortlist
python -m rune vastai-list-models

# Show models exposed by an existing Ollama server
python -m rune ollama-list-models --ollama-url http://localhost:11434

# Agent-only mode
python -m rune run-agentic-agent --question "What is unhealthy?"

# Full benchmark (existing server phase 1)
python -m rune run-benchmark --ollama-url http://localhost:11434 --model llama3.1:8b

# Full benchmark without pre-loading the Ollama model
python -m rune run-benchmark --ollama-url http://localhost:11434 --model llama3.1:8b --no-ollama-warmup

# Full benchmark (Vast.ai phase 1)
python -m rune run-benchmark --vastai --question "What is unhealthy?"

Testing

Automated tests (safe/offline)

Automated tests are designed to run anywhere without creating cloud resources. They mock Ollama and Vast.ai boundaries.

pip install -r requirements.txt
python -m pytest -q
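
As an illustration of mocking a service boundary so a test stays offline, the sketch below replaces a client with unittest.mock.Mock. The function summarize_models and the list_models method on the client are hypothetical stand-ins, not names confirmed from the rune test suite.

```python
from unittest import mock


def summarize_models(client) -> str:
    """Hypothetical code under test: formats whatever the client reports."""
    names = client.list_models()
    return f"{len(names)} model(s): " + ", ".join(names)


def test_summarize_models_offline():
    # The mock stands in for a real Ollama client, so no network is used.
    client = mock.Mock()
    client.list_models.return_value = ["llama3.1:8b"]

    assert summarize_models(client) == "1 model(s): llama3.1:8b"
    client.list_models.assert_called_once_with()
```

Because the boundary is replaced, the test passes without a running server or any cloud credentials.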

Coverage is enforced at a minimum of 97% via pytest configuration.

Coverage table columns mean:

  • Stmts: executable Python statements in the file
  • Miss: statements not executed by tests
  • Cover: percentage of statements covered (100 × (Stmts - Miss) / Stmts)
  • Missing: uncovered line numbers/ranges (for example 144-146 means lines 144, 145, 146)
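
The column arithmetic can be checked with a few lines of Python. This sketch uses hypothetical helper names and handles only plain line numbers and ranges in the Missing column; branch-style entries are not covered.

```python
from typing import List


def cover_percent(stmts: int, miss: int) -> float:
    """Compute the Cover column from Stmts and Miss."""
    if stmts == 0:
        return 100.0  # An empty file has nothing left uncovered.
    return 100.0 * (stmts - miss) / stmts


def expand_missing(missing: str) -> List[int]:
    """Expand a Missing value such as '144-146, 150' into line numbers."""
    lines: List[int] = []
    for part in missing.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            lines.extend(range(int(start), int(end) + 1))
        else:
            lines.append(int(part))
    return lines
```

A file with 200 statements and 6 misses sits exactly at the 97% threshold, and "144-146" expands to lines 144, 145, and 146.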

For a more graphical report, open the generated HTML output at:

  • htmlcov/index.html

Manual tests (cost-incurring)

Vast.ai instance creation/destruction paths should be validated manually, because they can incur real costs.

Example manual run:

python -m rune run-benchmark --vastai --question "What is unhealthy?"

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md. The repository's explicit security and compliance targets are documented in rune-docs.

License

GNU General Public License v3.0. See LICENSE.
