DeepEval integration adapter for Metrics Computation Engine

MCE DeepEval Adapter

A Python adapter library that integrates DeepEval metrics as third-party plugins for the Metrics Computation Engine (MCE). The adapter enables seamless use of DeepEval's LLM evaluation metrics within the MCE framework when evaluating agentic applications.

Installation

Install via MCE extras:

pip install "metrics-computation-engine[deepeval]"

Prerequisites

Supported DeepEval Metrics

The following DeepEval metrics are supported by this adapter (use with the deepeval. prefix in service payloads):

Metric Name                      Description
AnswerRelevancyMetric            Measures how relevant the model answer is to the user query
RoleAdherenceMetric              Evaluates adherence to specified roles across a conversation
TaskCompletionMetric             Assesses whether the task was completed given tool calls and responses
ConversationCompletenessMetric   Evaluates whether the conversation covered necessary elements
BiasMetric                       Detects various forms of bias in responses
CoherenceMetric                  Scores coherence and logical flow of the output
GroundednessMetric               Evaluates how well outputs are grounded in the provided input/context
TonalityMetric                   Evaluates tone and stylistic appropriateness of the output
ToxicityMetric                   Identifies toxic or unsafe content in outputs
AnswerCorrectnessMetric          Measures correctness of the answer versus expected output
GeneralStructureAndStyleMetric   Evaluates structure and style quality of the output
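As a quick illustration of the prefix convention, the mapping from the bare metric names above to the identifiers used in service payloads is a simple string prefix. The list and helper below are illustrative only (`to_payload_name` is a hypothetical name, not part of the adapter's API):

```python
# Bare DeepEval metric names supported by the adapter (from the table above).
SUPPORTED_METRICS = [
    "AnswerRelevancyMetric",
    "RoleAdherenceMetric",
    "TaskCompletionMetric",
    "ConversationCompletenessMetric",
    "BiasMetric",
    "CoherenceMetric",
    "GroundednessMetric",
    "TonalityMetric",
    "ToxicityMetric",
    "AnswerCorrectnessMetric",
    "GeneralStructureAndStyleMetric",
]


def to_payload_name(metric_name: str) -> str:
    """Hypothetical helper: build the identifier expected in service payloads."""
    return f"deepeval.{metric_name}"


payload_metrics = [to_payload_name(m) for m in SUPPORTED_METRICS]
print(payload_metrics[0])  # → deepeval.AnswerRelevancyMetric
```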

To request support for additional DeepEval metrics, please open an issue in the agntcy/telemetry-hub repository.

Usage

Basic Usage

from mce_deepeval_adapter.adapter import DeepEvalMetricAdapter
from metrics_computation_engine.models.requests import LLMJudgeConfig
from metrics_computation_engine.registry import MetricRegistry

# Initialize LLM configuration
llm_config = LLMJudgeConfig(
    LLM_BASE_MODEL_URL="https://api.openai.com/v1",
    LLM_MODEL_NAME="gpt-4o",
    LLM_API_KEY="your-api-key-here"
)

# Create registry and register DeepEval metrics
registry = MetricRegistry()

# Method 1: Direct registration with metric name
registry.register_metric(DeepEvalMetricAdapter, "AnswerRelevancyMetric")

# Method 2: Using get_metric_class helper with prefix
from metrics_computation_engine.util import get_metric_class
metric, metric_name = get_metric_class("deepeval.RoleAdherenceMetric")
registry.register_metric(metric, metric_name)
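Conceptually, `get_metric_class` resolves the `deepeval.` prefix to this adapter and hands back the bare metric name. A minimal sketch of just that parsing step (assumed behavior for illustration; the real helper also resolves the adapter class, and `split_metric_identifier` is a hypothetical name):

```python
def split_metric_identifier(identifier: str) -> tuple[str, str]:
    """Split 'deepeval.RoleAdherenceMetric' into ('deepeval', 'RoleAdherenceMetric').

    Hypothetical sketch of the prefix parsing that get_metric_class is
    assumed to perform before resolving the adapter class.
    """
    namespace, sep, name = identifier.partition(".")
    if not sep:
        raise ValueError(f"expected '<adapter>.<MetricName>', got {identifier!r}")
    return namespace, name


print(split_metric_identifier("deepeval.RoleAdherenceMetric"))
```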

Using with MCE REST API

When using the MCE as a service, include DeepEval metrics in your API request:

{
  "metrics": [
    "deepeval.AnswerRelevancyMetric",
    "deepeval.RoleAdherenceMetric"
  ],
  "llm_judge_config": {
    "LLM_API_KEY": "your-api-key",
    "LLM_MODEL_NAME": "gpt-4o",
    "LLM_BASE_MODEL_URL": "https://api.openai.com/v1"
  },
  "data_fetching_infos": {
    "batch_config": {
      "time_range": { "start": "2024-01-01T00:00:00Z", "end": "2024-12-31T23:59:59Z" }
    },
    "session_ids": []
  }
}
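If you drive the service from Python, the same request body can be assembled as a plain dict and serialized with the standard library. This is a generic sketch of building the payload; the service endpoint and HTTP client are deployment-specific and not shown here:

```python
import json

# Request body matching the JSON example above.
payload = {
    "metrics": [
        "deepeval.AnswerRelevancyMetric",
        "deepeval.RoleAdherenceMetric",
    ],
    "llm_judge_config": {
        "LLM_API_KEY": "your-api-key",
        "LLM_MODEL_NAME": "gpt-4o",
        "LLM_BASE_MODEL_URL": "https://api.openai.com/v1",
    },
    "data_fetching_infos": {
        "batch_config": {
            "time_range": {
                "start": "2024-01-01T00:00:00Z",
                "end": "2024-12-31T23:59:59Z",
            }
        },
        "session_ids": [],
    },
}

body = json.dumps(payload)
print(json.loads(body)["metrics"])
```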

Contributing

Contributions are welcome! Please follow these steps to contribute:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature-branch).
  3. Commit your changes (git commit -am 'Add new feature').
  4. Push to the branch (git push origin feature-branch).
  5. Create a new Pull Request.
