SAGE Benchmark - SAGE framework-specific system-level benchmarks

benchmark_sage – SAGE System-Level Benchmarks and ICML Artifacts

benchmark_sage is a home for system-level benchmarks and artifacts that focus on SAGE as a complete ML systems platform.

Key points:

  • SAGE is more than an LLM control plane. The LLM/embedding control plane is one subsystem. SAGE also includes components such as sage.db, sage.flow, sage.tsdb, and others, all orchestrated via a common declarative dataflow model.
  • packages/sage-benchmark already contains multiple benchmark suites (agents, control-plane scheduling, DB, retrieval, memory, schedulers, refiner, libamm, etc.). benchmark_sage can aggregate cross-cutting experiments that involve several SAGE subsystems together.
  • This folder may also store ICML writing prompts and experiment templates for the SAGE system track papers, under docs/.

Suggested uses:

  • End-to-end experiments that span sage.flow pipelines, sage.db storage, sage.tsdb time-series monitoring, and the LLM/embedding control plane.
  • Configs (config/*.yaml) for system-track experiments described in an ICML paper.
  • Notebook or script entry points that reproduce figures/tables.

Q-style Workload Catalog (TPC-H/TPC-C inspired)

benchmark_sage adopts a fixed Q1..Q8 catalog where each Q denotes a workload family rather than a one-off script. This keeps paper claims, configs, and run outputs aligned.

Query  Name              Entry            Workload Family
Q1     PipelineChain     e2e_pipeline     End-to-end RAG pipeline workloads
Q2     ControlMix        control_plane    Mixed LLM+embedding scheduling workloads
Q3     NoisyNeighbor     isolation        Multi-tenant interference and isolation workloads
Q4     ScaleFrontier     scalability      Scale-out throughput/latency workloads
Q5     HeteroResilience  heterogeneity    Heterogeneous deployment and recovery workloads
Q6     BurstTown         burst_priority   Bursty mixed-priority transactional workloads
Q7     ReconfigDrill     reconfiguration  Online reconfiguration drill workloads
Q8     RecoverySoak      recovery         Fault-recovery soak workloads
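
The catalog above can also be held as a small lookup table, so that paper claims, configs, and run outputs all refer to the same Q identifiers. The following is a minimal sketch only: the `Workload` dataclass, the `CATALOG` dict, and the `resolve` helper are hypothetical names chosen for illustration and are not taken from the package; only the Q ids, names, entries, and family descriptions come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """One row of the Q1..Q8 catalog (hypothetical container, not the package's API)."""
    query: str
    name: str
    entry: str
    family: str


# Rows copied from the catalog table; the dict itself is an illustrative assumption.
CATALOG = {
    w.query: w
    for w in (
        Workload("Q1", "PipelineChain", "e2e_pipeline", "End-to-end RAG pipeline workloads"),
        Workload("Q2", "ControlMix", "control_plane", "Mixed LLM+embedding scheduling workloads"),
        Workload("Q3", "NoisyNeighbor", "isolation", "Multi-tenant interference and isolation workloads"),
        Workload("Q4", "ScaleFrontier", "scalability", "Scale-out throughput/latency workloads"),
        Workload("Q5", "HeteroResilience", "heterogeneity", "Heterogeneous deployment and recovery workloads"),
        Workload("Q6", "BurstTown", "burst_priority", "Bursty mixed-priority transactional workloads"),
        Workload("Q7", "ReconfigDrill", "reconfiguration", "Online reconfiguration drill workloads"),
        Workload("Q8", "RecoverySoak", "recovery", "Fault-recovery soak workloads"),
    )
}


def resolve(query: str) -> Workload:
    """Look up a workload family by its Q identifier (case-insensitive)."""
    key = query.strip().upper()
    if key not in CATALOG:
        raise KeyError(f"unknown workload {query!r}; expected one of {sorted(CATALOG)}")
    return CATALOG[key]
```

For example, `resolve("q3").entry` gives `"isolation"`, and an identifier outside Q1..Q8 raises a `KeyError` instead of silently falling back to a one-off script.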

Examples:

# Run a single workload against the default SAGE backend
python -m sage.benchmark.benchmark_sage --experiment Q1

# Run all workloads
python -m sage.benchmark.benchmark_sage --all

# Quick smoke-test
python -m sage.benchmark.benchmark_sage --experiment Q3 --quick
python -m sage.benchmark.benchmark_sage --experiment Q7 --quick

# Backend comparison: same workload, repeat count, and seed on both backends
python -m sage.benchmark.benchmark_sage --experiment Q1 --backend sage --repeat 3 --seed 42
python -m sage.benchmark.benchmark_sage --experiment Q1 --backend ray  --repeat 3 --seed 42

# Distributed run: 4 nodes, 8-way operator parallelism
python -m sage.benchmark.benchmark_sage --experiment Q4 \
    --backend sage --nodes 4 --parallelism 8 --output-dir results/q4_scale

# Validate config without running
python -m sage.benchmark.benchmark_sage --experiment Q2 --dry-run
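
When a backend comparison is scripted instead of typed by hand, it helps to generate both command lines from one place so that only `--backend` differs between them. The sketch below assumes nothing beyond the module path and flags shown in the examples above; `build_command` and `comparison_commands` are hypothetical helper names, not part of the package.

```python
import sys

# Module path taken from the examples above.
MODULE = "sage.benchmark.benchmark_sage"


def build_command(experiment, backend, repeat=3, seed=42):
    """Build the argv list for one benchmark run (hypothetical helper)."""
    return [
        sys.executable, "-m", MODULE,
        "--experiment", experiment,
        "--backend", backend,
        "--repeat", str(repeat),
        "--seed", str(seed),
    ]


def comparison_commands(experiment, backends=("sage", "ray"), repeat=3, seed=42):
    """Return one command per backend, identical except for --backend."""
    return [build_command(experiment, b, repeat, seed) for b in backends]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to execute the run, or joined with `shlex.join(cmd)` to log the exact command next to the results.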

Standardised CLI flags (Issue #2)

All workload entry points share the same flag contract so backend comparison runs always produce comparable run_config records.

Flag                  Default  Description
--backend {sage,ray}  sage     Runtime backend
--nodes N             1        Worker nodes for distributed execution
--parallelism P       2        Operator parallelism hint
--repeat R            1        Independent repetitions (averaged in results)
--seed SEED           42       Global RNG seed for reproducibility
--output-dir DIR      results  Root directory for artefacts
--quick               off      Reduced-scale smoke-test run
--dry-run             off      Validate config, skip execution
--verbose / -v        off      Enable debug output

Individual workloads may add extra flags on top of the shared contract.
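
One way to express such a shared contract is an `argparse` parent parser that every workload entry point inherits. The sketch below mirrors the flags and defaults from the table; it is an illustration, not the package's actual implementation. The names `build_shared_parser`, `make_workload_parser`, and `to_run_config` are hypothetical, and the layout of the real run_config record is an assumption.

```python
import argparse


def build_shared_parser() -> argparse.ArgumentParser:
    """Parent parser holding the shared flags (defaults copied from the table)."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--backend", choices=["sage", "ray"], default="sage",
                        help="Runtime backend")
    parser.add_argument("--nodes", type=int, default=1,
                        help="Worker nodes for distributed execution")
    parser.add_argument("--parallelism", type=int, default=2,
                        help="Operator parallelism hint")
    parser.add_argument("--repeat", type=int, default=1,
                        help="Independent repetitions")
    parser.add_argument("--seed", type=int, default=42,
                        help="Global RNG seed for reproducibility")
    parser.add_argument("--output-dir", default="results",
                        help="Root directory for artefacts")
    parser.add_argument("--quick", action="store_true",
                        help="Reduced-scale smoke-test run")
    parser.add_argument("--dry-run", action="store_true",
                        help="Validate config, skip execution")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable debug output")
    return parser


def make_workload_parser(description: str) -> argparse.ArgumentParser:
    """Workload-specific parser that inherits the shared flags."""
    return argparse.ArgumentParser(description=description,
                                   parents=[build_shared_parser()])


def to_run_config(args: argparse.Namespace) -> dict:
    """Flatten parsed flags into a key-sorted dict (assumed record shape)."""
    return dict(sorted(vars(args).items()))
```

A workload that needs extra flags would call `make_workload_parser(...)` and then `add_argument` for its own options, so the shared flags keep the same names and defaults across entry points.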

At the repo root, docs/icml-prompts/ contains reusable writing prompts. You can either reference them directly or copy customized versions into this folder when preparing a specific ICML submission.
