SAGE Benchmark - RAG and experimental benchmarks for SAGE framework

SAGE Benchmark

Comprehensive benchmarking tools and RAG examples for the SAGE framework

📋 Overview

SAGE Benchmark provides a comprehensive suite of benchmarking tools and RAG (Retrieval-Augmented Generation) examples for evaluating SAGE framework performance. This package enables researchers and developers to:

Benchmark RAG pipelines with multiple retrieval strategies (dense, sparse, hybrid)
Compare vector databases (Milvus, ChromaDB, FAISS) for RAG applications
Evaluate multimodal retrieval with text, image, and video data
Run reproducible experiments with standardized configurations and metrics

This package is designed for both research experiments and production system evaluation.

✨ Key Features

Multiple RAG Implementations: Dense, sparse, hybrid, and multimodal retrieval
Vector Database Support: Milvus, ChromaDB, FAISS integration
Experiment Framework: Automated benchmarking with configurable experiments
Evaluation Metrics: Comprehensive metrics for RAG performance
Sample Data: Included test data for quick start
Extensible Design: Easy to add new benchmarks and retrieval methods

📦 Package Structure

sage-benchmark/
├── src/
│   └── sage/
│       └── benchmark/
│           ├── __init__.py
│           └── benchmark_rag/           # RAG benchmarking
│               ├── __init__.py
│               ├── implementations/     # RAG implementations
│               │   ├── pipelines/      # RAG pipeline scripts
│               │   │   ├── qa_dense_retrieval_milvus.py
│               │   │   ├── qa_sparse_retrieval_milvus.py
│               │   │   ├── qa_multimodal_fusion.py
│               │   │   └── ...
│               │   └── tools/          # Supporting tools
│               │       ├── build_chroma_index.py
│               │       ├── build_milvus_dense_index.py
│               │       └── loaders/
│               ├── evaluation/          # Experiment framework
│               │   ├── pipeline_experiment.py
│               │   ├── evaluate_results.py
│               │   └── config/
│               ├── config/              # RAG configurations
│               └── data/                # Test data
│           # Future benchmarks:
│           # ├── benchmark_agent/      # Agent benchmarking
│           # └── benchmark_anns/       # ANNS benchmarking
├── tests/
├── pyproject.toml
└── README.md

🚀 Installation

Install the benchmark package in editable mode from the root of the SAGE repository:

pip install -e packages/sage-benchmark

Or with development dependencies:

pip install -e "packages/sage-benchmark[dev]"

Note: The sage.data module is included as a submodule in the package and will be installed automatically. It contains datasets for various benchmarks including LibAMM datasets.

📊 RAG Benchmarking

The benchmark_rag module provides comprehensive RAG benchmarking capabilities:

RAG Implementations

Various RAG approaches for performance comparison:

Vector Databases:

Milvus: Dense, sparse, and hybrid retrieval
ChromaDB: Local vector database with simple setup
FAISS: Efficient similarity search

Retrieval Methods:

Dense retrieval (embeddings-based)
Sparse retrieval (BM25, sparse vectors)
Hybrid retrieval (combining dense + sparse)
Multimodal fusion (text + image + video)
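Hybrid retrieval has to merge a dense ranking and a sparse ranking into one list. This README does not say which fusion method SAGE uses, so the sketch below swaps in reciprocal rank fusion (RRF), a common rank-based technique, purely as an illustration. The function name, the constant `k=60`, and the document IDs are assumptions, not part of the package.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. A higher fused score means a better position.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document IDs, for illustration only.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF only needs ranks, not raw scores, so it sidesteps the problem that dense similarity scores and BM25 scores live on different scales.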

Quick Start

1. Build Vector Index

First, prepare your vector index:

# Build ChromaDB index (simplest)
python -m sage.benchmark.benchmark_rag.implementations.tools.build_chroma_index

# Or build Milvus dense index
python -m sage.benchmark.benchmark_rag.implementations.tools.build_milvus_dense_index

2. Run a RAG Pipeline

Test individual RAG pipelines:

# Dense retrieval with Milvus
python -m sage.benchmark.benchmark_rag.implementations.pipelines.qa_dense_retrieval_milvus

# Sparse retrieval
python -m sage.benchmark.benchmark_rag.implementations.pipelines.qa_sparse_retrieval_milvus

# Hybrid retrieval (dense + sparse)
python -m sage.benchmark.benchmark_rag.implementations.pipelines.qa_hybrid_retrieval_milvus

3. Run Benchmark Experiments

Execute the full benchmark suite:

# Run comprehensive benchmark
python -m sage.benchmark.benchmark_rag.evaluation.pipeline_experiment

# Evaluate and generate reports
python -m sage.benchmark.benchmark_rag.evaluation.evaluate_results
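Steps 1 to 3 can be chained in a small script. The sketch below is a convenience wrapper, not part of the package. It assumes the modules run without arguments as shown above, that the ChromaDB index is the one your experiment needs, and that the package is installed for the current interpreter. `STEPS`, `module_command`, and `run_steps` are hypothetical names.

```python
import subprocess
import sys

# Module paths copied from the commands above; the choice of the ChromaDB
# index builder as the first step is an assumption.
STEPS = [
    "sage.benchmark.benchmark_rag.implementations.tools.build_chroma_index",
    "sage.benchmark.benchmark_rag.evaluation.pipeline_experiment",
    "sage.benchmark.benchmark_rag.evaluation.evaluate_results",
]


def module_command(module):
    """Build the argv list equivalent to `python -m <module>`."""
    return [sys.executable, "-m", module]


def run_steps(modules):
    """Run each module in order; raise CalledProcessError on the first failure."""
    for module in modules:
        print(f"Running {module} ...")
        subprocess.run(module_command(module), check=True)


# Nothing is executed on import. Call run_steps(STEPS) from an environment
# where the benchmark package is installed.
```

Using `sys.executable` instead of a bare `python` keeps every step in the same virtual environment as the script itself.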

4. View Results

Results are saved in benchmark_results/:

experiment_TIMESTAMP/ - Individual experiment runs
metrics.json - Performance metrics
comparison_report.md - Comparison report
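To inspect results programmatically, you can collect the metrics files from the results directory. The sketch below assumes only that files named `metrics.json` exist somewhere under `benchmark_results/`. Their exact location and JSON schema are not documented here, so the content is returned unchanged. `find_metrics_files` and `load_metrics` are hypothetical helper names.

```python
import json
from pathlib import Path


def find_metrics_files(results_dir="benchmark_results"):
    """Return every metrics.json under results_dir, sorted by path."""
    root = Path(results_dir)
    if not root.is_dir():
        return []  # nothing has been run yet
    return sorted(root.rglob("metrics.json"))


def load_metrics(path):
    """Parse one metrics file and return its content unchanged."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


for metrics_path in find_metrics_files():
    print(metrics_path, load_metrics(metrics_path))
```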

📖 Quick Start (Python API)

Basic Example

from sage.benchmark.benchmark_rag.implementations.pipelines import (
    qa_dense_retrieval_milvus,
)
from sage.benchmark.benchmark_rag.config import load_config

# Load configuration
config = load_config("config_dense_milvus.yaml")

# Run RAG pipeline
results = qa_dense_retrieval_milvus.run_pipeline(query="What is SAGE?", config=config)

# View results
print(f"Retrieved {len(results)} documents")
for doc in results:
    print(f"- {doc.content[:100]}...")

Run Custom Benchmark

from sage.benchmark.benchmark_rag.evaluation import PipelineExperiment

# Define experiment configuration
experiment = PipelineExperiment(
    name="custom_rag_benchmark",
    pipelines=["dense", "sparse", "hybrid"],
    queries=["query1.txt", "query2.txt"],
    metrics=["precision", "recall", "latency"],
)

# Run experiment
results = experiment.run()

# Generate report
experiment.generate_report(results)
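The metric names `precision`, `recall`, and `latency` are not defined in this README. As a point of reference, the sketch below shows the usual textbook definitions of precision@k and recall@k plus a simple wall-clock timer. It is an illustration under those assumptions and may differ from what `PipelineExperiment` actually computes. All function names are hypothetical.

```python
import time


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    hits = len(set(retrieved[:k]) & relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    if not relevant:
        return 0.0  # no relevant documents, so nothing can be recalled
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical document IDs, for illustration only.
retrieved_ids = ["d1", "d2", "d3", "d4"]
relevant_ids = {"d1", "d3", "d5"}

print(precision_at_k(retrieved_ids, relevant_ids, k=3))  # 2 of the top 3 are relevant
print(recall_at_k(retrieved_ids, relevant_ids, k=3))     # 2 of the 3 relevant docs found
```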

Configuration

Configuration files are located in sage/benchmark/benchmark_rag/config/:

config_dense_milvus.yaml - Dense retrieval configuration
config_sparse_milvus.yaml - Sparse retrieval configuration
config_hybrid_milvus.yaml - Hybrid retrieval configuration
config_qa_chroma.yaml - ChromaDB configuration

Experiment configurations are located in sage/benchmark/benchmark_rag/evaluation/config/:

experiment_config.yaml - Benchmark experiment settings
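When scripting several runs, it can help to resolve a configuration file from a short pipeline name. The sketch below uses only the file names and directory listed above. The keys `dense`, `sparse`, `hybrid`, and `chroma`, and the helper `config_path`, are hypothetical and not part of the package.

```python
from pathlib import PurePosixPath

# Directory and file names as listed in this section.
CONFIG_DIR = PurePosixPath("sage/benchmark/benchmark_rag/config")

# The dictionary keys are illustrative labels, not package identifiers.
CONFIG_FILES = {
    "dense": "config_dense_milvus.yaml",
    "sparse": "config_sparse_milvus.yaml",
    "hybrid": "config_hybrid_milvus.yaml",
    "chroma": "config_qa_chroma.yaml",
}


def config_path(pipeline):
    """Return the documented config path for a pipeline label."""
    try:
        filename = CONFIG_FILES[pipeline]
    except KeyError:
        known = ", ".join(sorted(CONFIG_FILES))
        raise ValueError(
            f"unknown pipeline {pipeline!r}; expected one of: {known}"
        ) from None
    return CONFIG_DIR / filename


print(config_path("hybrid"))
# sage/benchmark/benchmark_rag/config/config_hybrid_milvus.yaml
```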

📖 Data

Test data is included in the package:

Benchmark Data (benchmark_rag/data/):

- queries.jsonl - Sample queries for testing
- qa_knowledge_base.* - Knowledge base in multiple formats (txt, md, pdf, docx)
- sample/ - Additional sample documents for testing

Benchmark Config (benchmark_rag/config/):

- experiment_config.yaml - RAG benchmark configurations
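The sample queries ship as a JSONL file, that is, one JSON value per line. The field names inside `queries.jsonl` are not documented here, so the sketch below makes no assumption about them and simply yields each parsed line. `read_jsonl` is a hypothetical helper, not a package function.

```python
import json


def read_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_number}: invalid JSON: {exc}"
                ) from exc


# Example call, with the path adjusted to your checkout:
# queries = list(read_jsonl("queries.jsonl"))
```

Reading line by line keeps memory use flat for large query sets, and the line number in the error message makes a malformed record easy to find.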

🔧 Development

Running Tests

pytest packages/sage-benchmark/

Code Formatting

# Format code
black packages/sage-benchmark/

# Lint code
ruff check packages/sage-benchmark/

📚 Documentation

For detailed documentation on each component:

See src/sage/benchmark/rag/README.md for RAG examples
See src/sage/benchmark/benchmark_rag/README.md for benchmark details

🔮 Future Components

benchmark_agent: Agent system performance benchmarking
benchmark_anns: Approximate Nearest Neighbor Search benchmarking
benchmark_llm: LLM inference performance benchmarking

🤝 Contributing

This package follows the same contribution guidelines as the main SAGE project. See the main repository's CONTRIBUTING.md.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Related Packages

sage-kernel: Core computation engine for running benchmarks
sage-libs: RAG components and utilities
sage-middleware: Vector database services (Milvus, ChromaDB)
sage-common: Common utilities and data types

📮 Support

Documentation: https://intellistream.github.io/SAGE-Pub/guides/packages/sage-benchmark/
Issues: https://github.com/intellistream/SAGE/issues
Discussions: https://github.com/intellistream/SAGE/discussions

Part of the SAGE Framework | Main Repository


Release history

0.2.4 - Jan 3, 2026
0.2.3 - Jan 3, 2026 (this version)
0.1.1.20 - Mar 12, 2026
0.1.1.19 - Mar 8, 2026
0.1.1.18 - Mar 4, 2026
0.1.1.17 - Mar 4, 2026
0.1.1.16 - Mar 4, 2026
0.1.1.15 - Mar 3, 2026
0.1.1.14 - Mar 3, 2026
0.1.1.13 - Feb 24, 2026
0.1.1.12 - Feb 24, 2026
0.1.1.11 - Feb 24, 2026
0.1.1.10 - Feb 24, 2026
0.1.1.9 - Feb 24, 2026
0.1.1.8 - Feb 23, 2026
0.1.1.7 - Feb 23, 2026
0.1.1.6 - Feb 23, 2026
0.1.1.5 - Feb 23, 2026
0.1.1.4 - Feb 23, 2026
0.1.1.3 - Feb 23, 2026
0.1.1.2 - Feb 23, 2026
0.1.1.1 - Feb 21, 2026
0.1.1.0 - Feb 21, 2026
0.1.0.9 - Feb 21, 2026
0.1.0.8 - Feb 21, 2026
0.1.0.7 - Feb 21, 2026
0.1.0.6 - Feb 21, 2026

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

isage_benchmark-0.2.3-py3-none-any.whl (3.8 MB)

Uploaded Jan 3, 2026 Python 3

File details

Details for the file isage_benchmark-0.2.3-py3-none-any.whl.

File metadata

Download URL: isage_benchmark-0.2.3-py3-none-any.whl
Upload date: Jan 3, 2026
Size: 3.8 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for isage_benchmark-0.2.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a05cabe5150588b92a07b3072b8964cd161d5f31d583c6868c45f50343be211b`
MD5	`559c9128f88c8d5ddb8733a1b865ca99`
BLAKE2b-256	`08656bdad13f8dc6e4455d54e5f0d8473f29dc64e1a354375d706e8d78907ffe`
