SAGE Evaluation Framework - Metrics, profilers, and judges for AI systems

sage-eval

SAGE L3 evaluation component library, providing composable Metric, Profiler, and LLM Judge implementations.

  • PyPI: isage-eval
  • Import: sage_libs.sage_eval
  • Layer: L3 (evaluation library, not benchmark)

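Boundaries and constraints

  • This repository provides only evaluation component implementations; it does not take on execution-platform scheduling responsibilities.
  • It depends on the sage.libs.eval interface layer and registers its implementations at import time.
  • It follows a fail-fast policy: no fallback, shim, or compatibility branches are kept.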
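
The import-time registration and fail-fast behavior described above can be illustrated with a small sketch. Everything here is hypothetical: `_REGISTRY`, `register`, `create`, and `ToyAccuracy` are illustrative names, not the sage.libs.eval API, and the real interface layer may work differently.

```python
from typing import Callable, Dict

# Hypothetical registry; the real sage.libs.eval interface may look different.
_REGISTRY: Dict[str, type] = {}


def register(name: str) -> Callable[[type], type]:
    """Class decorator that records an implementation under `name`."""
    def decorator(cls: type) -> type:
        if name in _REGISTRY:
            # Fail fast: a duplicate name is an error, not a silent override.
            raise ValueError(f"implementation already registered: {name!r}")
        _REGISTRY[name] = cls
        return cls
    return decorator


def create(name: str, **kwargs):
    """Instantiate a registered implementation, or raise immediately."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        # Fail fast: no fallback implementation is substituted.
        raise KeyError(f"no implementation registered for {name!r}") from None
    return cls(**kwargs)


@register("accuracy")  # executed when the defining module is imported
class ToyAccuracy:
    def compute(self, predictions, references):
        hits = sum(p == r for p, r in zip(predictions, references))
        return hits / len(references)
```

With this pattern, importing the module is enough to make the implementation discoverable, and a missing or duplicate name surfaces as an exception at the call site instead of being papered over by a default.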

Current components

Metrics

  • AccuracyMetric
  • BLEUMetric
  • F1Metric
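
For readers unfamiliar with these metrics, the following is a minimal sketch of what accuracy and binary F1 conventionally compute. It is not the library's implementation; the function names, the `positive` parameter, and the error handling are assumptions made for illustration. BLEU is omitted because a faithful version is considerably longer.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], references: Sequence[int]) -> float:
    """Fraction of positions where prediction and reference agree."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score empty inputs")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def binary_f1(predictions: Sequence[int], references: Sequence[int],
              positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the `positive` label."""
    pairs = list(zip(predictions, references))
    tp = sum(p == positive and r == positive for p, r in pairs)
    fp = sum(p == positive and r != positive for p, r in pairs)
    fn = sum(p != positive and r == positive for p, r in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For the label lists `[1, 0, 1]` and `[1, 1, 1]`, two of three positions agree, so this sketch gives an accuracy of 2/3 and a binary F1 of 0.8.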

Profilers

  • LatencyProfiler
  • ThroughputProfiler
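
As a rough illustration of what latency and throughput profiling involve, here is a sketch built on the standard library. `ToyLatencyProfiler`, its `elapsed_s` attribute, and the `throughput` helper are hypothetical names; the attributes exposed by the real LatencyProfiler and ThroughputProfiler are not documented here.

```python
import time


class ToyLatencyProfiler:
    """Context manager that records wall-clock time of the enclosed block."""

    def __init__(self) -> None:
        self.elapsed_s = None  # filled in when the block exits

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.elapsed_s = time.perf_counter() - self._start
        return False  # never swallow exceptions raised by the timed block


def throughput(items: int, elapsed_s: float) -> float:
    """Items processed per second over a measured interval."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return items / elapsed_s


with ToyLatencyProfiler() as prof:
    total = sum(range(10000))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.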

Judges

  • FaithfulnessJudge
  • RelevanceJudge
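
An LLM judge typically builds a prompt, calls a model function, and parses a structured reply. The sketch below assumes the `SCORE:` / `REASONING:` reply format that the mock LLM in this README's usage example returns. The prompt wording, the `Verdict` type, the 0 to 1 score range, and the function names are all assumptions, not the library's actual behavior.

```python
import re
from typing import Callable, NamedTuple


class Verdict(NamedTuple):
    score: float
    reasoning: str


_SCORE = re.compile(r"^SCORE:\s*([0-9]*\.?[0-9]+)\s*$", re.MULTILINE)
_REASON = re.compile(r"^REASONING:\s*(.+)$", re.MULTILINE | re.DOTALL)


def parse_verdict(raw: str) -> Verdict:
    """Parse a judge reply; raise instead of guessing when it is malformed."""
    score_m = _SCORE.search(raw)
    reason_m = _REASON.search(raw)
    if score_m is None or reason_m is None:
        raise ValueError(f"unparseable judge reply: {raw!r}")
    score = float(score_m.group(1))
    if not 0.0 <= score <= 1.0:  # assumed score range
        raise ValueError(f"score out of range: {score}")
    return Verdict(score, reason_m.group(1).strip())


def toy_relevance_judge(llm_fn: Callable[[str], str],
                        question: str, response: str) -> Verdict:
    """Ask `llm_fn` how relevant `response` is to `question`."""
    prompt = (
        "Rate how relevant the response is to the question.\n"
        f"Question: {question}\nResponse: {response}\n"
        "Reply as 'SCORE: <0-1>' then 'REASONING: <text>'."
    )
    return parse_verdict(llm_fn(prompt))


def mock_llm(_: str) -> str:
    return "SCORE: 0.9\nREASONING: Relevant response."


verdict = toy_relevance_judge(
    mock_llm,
    question="What is the capital of France?",
    response="Paris is the capital of France.",
)
```

Raising on an unparseable reply, instead of returning a default score, is in keeping with the fail-fast constraint stated earlier.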

Installation

pip install isage-eval

Quick start

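from sage_libs.sage_eval import AccuracyMetric, LatencyProfiler, RelevanceJudge

# Metric: compute accuracy over two equal-length label lists.
metric = AccuracyMetric()
metric_result = metric.compute([1, 0, 1], [1, 1, 1])

# Profiler: used as a context manager around the code to be timed.
profiler = LatencyProfiler()
with profiler:
    _ = sum(range(10000))

# Judge: llm_fn is any callable mapping a prompt string to a reply string;
# this mock ignores the prompt and returns a fixed reply.
def mock_llm(_: str) -> str:
    return "SCORE: 0.9\nREASONING: Relevant response."

judge = RelevanceJudge(llm_fn=mock_llm)
judge_result = judge.judge(
    response="Paris is the capital of France.",
    question="What is the capital of France?",
)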

License

Apache 2.0


