OpenTelemetry GenAI Evaluations

This package provides the base evaluation manager and built-in evaluators for opentelemetry-util-genai. The core GenAI telemetry utilities load it dynamically through the completion callback plugin mechanism.

Features

  • Evaluation Manager - Background processing of LLM and agent evaluations

  • Concurrent Processing - Multi-worker parallel evaluation for high throughput

  • Bounded Queue - Backpressure support to prevent memory exhaustion

  • Async Evaluation - Native async support for LLM-as-a-Judge evaluators

Concurrent Evaluation Mode

Set the following environment variables to enable concurrent processing, which improves throughput for LLM-based evaluations:

# Enable concurrent mode with 4 workers
export OTEL_INSTRUMENTATION_GENAI_EVALS_CONCURRENT=true
export OTEL_INSTRUMENTATION_GENAI_EVALS_WORKERS=4

# Optional: Bounded queue for backpressure
export OTEL_INSTRUMENTATION_GENAI_EVALS_QUEUE_SIZE=100
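
To make the meaning of the three variables concrete, here is a minimal Python sketch of how such settings could be read. The helper name read_eval_settings and the fallback values (one worker, 0 meaning an unbounded queue) are assumptions for illustration. The package's actual parsing and defaults are not documented in this section.

```python
import os


def read_eval_settings(env=None):
    """Parse the evaluation variables into plain Python values.

    Hypothetical helper: the fallbacks below are assumptions, not the
    package's documented defaults.
    """
    env = os.environ if env is None else env
    flag = env.get("OTEL_INSTRUMENTATION_GENAI_EVALS_CONCURRENT", "false")
    concurrent = flag.strip().lower() == "true"
    # Assumed: sequential mode always uses exactly one worker.
    workers = 1
    if concurrent:
        workers = int(env.get("OTEL_INSTRUMENTATION_GENAI_EVALS_WORKERS", "1"))
    # Assumed convention: 0 means the queue is unbounded.
    queue_size = int(env.get("OTEL_INSTRUMENTATION_GENAI_EVALS_QUEUE_SIZE", "0"))
    return {"concurrent": concurrent, "workers": workers, "queue_size": queue_size}


example = read_eval_settings(
    {
        "OTEL_INSTRUMENTATION_GENAI_EVALS_CONCURRENT": "true",
        "OTEL_INSTRUMENTATION_GENAI_EVALS_WORKERS": "4",
        "OTEL_INSTRUMENTATION_GENAI_EVALS_QUEUE_SIZE": "100",
    }
)
```

Passing a dictionary instead of relying on os.environ keeps the sketch easy to try out without changing the shell environment.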

Sequential Mode (Default):

  • Single worker thread processes evaluations one at a time

  • Guaranteed ordering of evaluation results

  • Lower resource consumption

Concurrent Mode:

  • Multiple worker threads with asyncio event loops

  • Parallel LLM API calls for faster evaluation throughput

  • Recommended for LLM-as-a-Judge evaluators (e.g., DeepEval)
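
The general pattern behind concurrent mode (several worker threads, each owning an asyncio event loop, fed from a bounded queue) can be sketched with the standard library alone. This is an illustration of the technique, not the package's implementation: run_workers and fake_judge_call are hypothetical names, and the blocking put() is one possible backpressure policy among several.

```python
import asyncio
import functools
import queue
import threading


def run_workers(jobs, num_workers=4, queue_size=100):
    """Run zero-argument coroutine factories on worker threads."""
    work = queue.Queue(maxsize=queue_size)  # bounded queue
    results = []
    lock = threading.Lock()

    def worker():
        loop = asyncio.new_event_loop()  # one event loop per worker thread
        try:
            while True:
                job = work.get()
                if job is None:  # sentinel: no more work for this worker
                    break
                value = loop.run_until_complete(job())
                with lock:
                    results.append(value)
        finally:
            loop.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work.put(job)  # blocks while the queue is full (backpressure)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()
    return results


async def fake_judge_call(score):
    """Stand-in for an awaited LLM-as-a-Judge API call."""
    await asyncio.sleep(0.01)
    return score


scores = run_workers(
    [functools.partial(fake_judge_call, s) for s in (0.1, 0.5, 0.9)],
    num_workers=2,
    queue_size=1,
)
```

With more than one worker, results arrive in completion order rather than submission order, which is the trade-off against the ordering guarantee of sequential mode.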

Creating Custom Evaluators

Subclass the Evaluator base class and, optionally, override its async methods for native async support:

from opentelemetry.util.genai.evals.base import Evaluator, EvaluationResult

class MyEvaluator(Evaluator):
    @property
    def supports_async(self) -> bool:
        return True  # Enable native async evaluation

    async def evaluate_llm_async(self, invocation):
        # Your async evaluation logic
        return [EvaluationResult(name="my_metric", score=0.95)]
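
To show why native async support matters, the following self-contained sketch awaits several evaluations concurrently. The Evaluator and EvaluationResult classes here are simplified stand-ins so the example runs without the package installed, and evaluate_all is a hypothetical driver. The real base classes and the way the evaluation manager calls them may differ.

```python
import asyncio
from dataclasses import dataclass


# Stand-ins for the classes in opentelemetry.util.genai.evals.base.
# They mirror only the attributes used in the example above.
@dataclass
class EvaluationResult:
    name: str
    score: float


class Evaluator:
    @property
    def supports_async(self) -> bool:
        return False


class MyEvaluator(Evaluator):
    @property
    def supports_async(self) -> bool:
        return True

    async def evaluate_llm_async(self, invocation):
        await asyncio.sleep(0)  # placeholder for an awaited judge-model call
        return [EvaluationResult(name="my_metric", score=0.95)]


async def evaluate_all(evaluator, invocations):
    """Hypothetical driver: evaluate all invocations concurrently."""
    if not evaluator.supports_async:
        raise TypeError("evaluator does not provide async evaluation")
    batches = await asyncio.gather(
        *(evaluator.evaluate_llm_async(item) for item in invocations)
    )
    # Flatten the per-invocation lists into one list of results.
    return [result for batch in batches for result in batch]


results = asyncio.run(evaluate_all(MyEvaluator(), ["invocation-1", "invocation-2"]))
```

Because each call yields control while waiting on the judge model, a single event loop can keep many evaluations in flight at once.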
