OpenTelemetry GenAI Evaluations

This package provides the base evaluation manager and built-in evaluators for opentelemetry-util-genai. The core GenAI telemetry utilities load it dynamically through the completion callback plugin mechanism.

Features

  • Evaluation Manager - Background processing of LLM and agent evaluations

  • Concurrent Processing - Multi-worker parallel evaluation for high throughput

  • Bounded Queue - Backpressure support to prevent memory exhaustion

  • Async Evaluation - Native async support for LLM-as-a-Judge evaluators
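The bounded-queue feature can be illustrated with the standard library alone. The sketch below is not the package's implementation: the `submit` helper, the capacity of 2, and the choice to drop items when the queue is full (rather than block the caller) are all assumptions made for illustration.

```python
import queue


def submit(pending: queue.Queue, item) -> bool:
    """Try to enqueue without blocking; report whether the item was accepted."""
    try:
        pending.put_nowait(item)
        return True
    except queue.Full:
        # The queue is at capacity: reject the item instead of growing memory.
        return False


# Illustrative capacity; a real deployment would pick a much larger bound.
pending = queue.Queue(maxsize=2)
accepted = [submit(pending, f"invocation-{i}") for i in range(4)]
print(accepted)
```

With no consumer draining the queue, only the first two submissions are accepted. A blocking `put()` would be the alternative if slowing the producer is preferable to losing evaluations.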

Concurrent Evaluation Mode

Enable concurrent processing for improved throughput with LLM-based evaluations:

# Enable concurrent mode with 4 workers
export OTEL_INSTRUMENTATION_GENAI_EVALS_CONCURRENT=true
export OTEL_INSTRUMENTATION_GENAI_EVALS_WORKERS=4

# Optional: Bounded queue for backpressure
export OTEL_INSTRUMENTATION_GENAI_EVALS_QUEUE_SIZE=100
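As a rough sketch of how these variables could be interpreted, the snippet below reads them into a small settings object. The `EvalSettings` class, the `read_settings` function, the parsing rules, and the defaults for the worker count and queue size are hypothetical; only the variable names and the sequential default come from this page.

```python
import os
from dataclasses import dataclass

PREFIX = "OTEL_INSTRUMENTATION_GENAI_EVALS_"


@dataclass(frozen=True)
class EvalSettings:  # hypothetical container, not part of the package
    concurrent: bool
    workers: int
    queue_size: int  # 0 is used here to mean "unbounded" (an assumption)


def read_settings(env=None) -> EvalSettings:
    """Parse the evaluation settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    concurrent = env.get(PREFIX + "CONCURRENT", "false").strip().lower() == "true"
    workers = int(env.get(PREFIX + "WORKERS", "1"))
    queue_size = int(env.get(PREFIX + "QUEUE_SIZE", "0"))
    if not concurrent:
        # Sequential mode is described as a single worker thread.
        workers = 1
    return EvalSettings(concurrent, workers, queue_size)


settings = read_settings({
    PREFIX + "CONCURRENT": "true",
    PREFIX + "WORKERS": "4",
    PREFIX + "QUEUE_SIZE": "100",
})
print(settings)
```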

Sequential Mode (Default):

  • Single worker thread processes evaluations one at a time

  • Guaranteed ordering of evaluation results

  • Lower resource consumption

Concurrent Mode:

  • Multiple worker threads with asyncio event loops

  • Parallel LLM API calls for faster evaluation throughput

  • Recommended for LLM-as-a-Judge evaluators (e.g., DeepEval)
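The "worker threads with asyncio event loops" pattern can be sketched as follows. This is a generic illustration under stated assumptions, not the evaluation manager's code: `fake_judge` stands in for an awaitable LLM API call, and the `None` sentinel used for shutdown is an arbitrary choice.

```python
import asyncio
import queue
import threading


async def fake_judge(item: str) -> str:
    # Stand-in for an awaitable LLM API call.
    await asyncio.sleep(0.01)
    return f"{item}:scored"


def worker(tasks: queue.Queue, results: list, lock: threading.Lock) -> None:
    loop = asyncio.new_event_loop()  # each worker thread owns its own loop
    try:
        while True:
            item = tasks.get()
            if item is None:  # sentinel: shut this worker down
                break
            outcome = loop.run_until_complete(fake_judge(item))
            with lock:
                results.append(outcome)
    finally:
        loop.close()


tasks = queue.Queue()
results = []
lock = threading.Lock()
threads = [
    threading.Thread(target=worker, args=(tasks, results, lock))
    for _ in range(4)
]
for t in threads:
    t.start()
for i in range(8):
    tasks.put(f"item-{i}")
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()
print(sorted(results))
```

Because four workers pull from the same queue, results can arrive in any order, which is the trade-off against the ordering guarantee of sequential mode.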

Creating Custom Evaluators

Subclass the Evaluator base class. For native async support, override the supports_async property to return True and implement the async methods, such as evaluate_llm_async:

from opentelemetry.util.genai.evals.base import Evaluator, EvaluationResult

class MyEvaluator(Evaluator):
    @property
    def supports_async(self) -> bool:
        return True  # Enable native async evaluation

    async def evaluate_llm_async(self, invocation):
        # Your async evaluation logic
        return [EvaluationResult(name="my_metric", score=0.95)]
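To show how a caller might use the supports_async flag, here is a self-contained sketch with stand-in classes so that it runs without the package installed. The `EvaluationResult` dataclass, both evaluator classes, the synchronous method name `evaluate_llm`, and the `run_evaluator` dispatcher are assumptions for illustration; the real manager may dispatch differently. The stand-ins also use a plain class attribute where the real base class uses a property.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class EvaluationResult:  # stand-in for the package's result class
    name: str
    score: float


class SyncOnlyEvaluator:  # hypothetical evaluator without async support
    supports_async = False

    def evaluate_llm(self, invocation):  # method name is an assumption
        return [EvaluationResult(name="length", score=float(len(invocation)))]


class AsyncEvaluator:  # mirrors the MyEvaluator example above
    supports_async = True

    async def evaluate_llm_async(self, invocation):
        await asyncio.sleep(0)  # placeholder for an awaited LLM call
        return [EvaluationResult(name="my_metric", score=0.95)]


async def run_evaluator(evaluator, invocation):
    """Await async evaluators directly; push blocking ones to a thread."""
    if evaluator.supports_async:
        return await evaluator.evaluate_llm_async(invocation)
    # Requires Python 3.9+ for asyncio.to_thread.
    return await asyncio.to_thread(evaluator.evaluate_llm, invocation)


async def main():
    return await asyncio.gather(
        run_evaluator(AsyncEvaluator(), "hello"),
        run_evaluator(SyncOnlyEvaluator(), "hello"),
    )


async_results, sync_results = asyncio.run(main())
print(async_results[0].score, sync_results[0].score)
```

Running the blocking evaluator in a thread keeps the event loop free for other evaluations, which is the main reason a native async path helps LLM-as-a-Judge workloads.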

Download files

Source Distribution

splunk_otel_util_genai_evals-0.1.8.tar.gz (44.5 kB)

  • SHA256: e432f57a33fd6ae6920f88465187e1a574dd88284e65417a3522684f1aa7e17d

  • MD5: 81b81b983a84baff61709fa3403ba071

  • BLAKE2b-256: 16fa02f71fd3612b9301b5c1b1a56e0dd55efee70235b70a457f4cc25307a8b8

Built Distribution

splunk_otel_util_genai_evals-0.1.8-py3-none-any.whl (44.7 kB, Python 3)

  • SHA256: 9b234f3835d178cf0c90175cd821b3405a3e01129db65fe955c9a1c6152a4fb0

  • MD5: 4077b38e372a909a0ad34be7e955180b

  • BLAKE2b-256: 22ed22a03dd67c6fe4fe9bc4f43976e85a059c0a251480ddaf30c9b9a5b398a2