
Pytest-style framework for evaluating Model Context Protocol (MCP) servers.


MCP-Eval: An Evaluation Framework for MCP Servers

MCP-Eval is a developer-first testing framework for Model Context Protocol (MCP) servers, built on the mcp-agent library. It enables you to write clear, concise, and powerful tests to evaluate the performance, reliability, and correctness of your AI agents and the MCP servers they connect to.

Core Features

  • Task-Based Testing: Define tests as async functions where an agent performs a task.
  • Automatic Metrics: Collects detailed metrics on latency, token usage, cost, and tool calls for every test run.
  • Rich Assertions: A powerful set of assertions designed for AI testing, including:
    • contains(): Checks for substrings in responses.
    • tool_was_called(): Verifies that a specific tool was used.
    • tool_arguments_match(): Checks if a tool was called with the correct arguments.
    • cost_under(): Asserts that a test run stays within a defined cost budget.
    • number_of_steps_under(): Ensures an agent completes a task efficiently.
    • objective_succeeded(): Uses an LLM to verify if the agent's response achieved the overall goal.
    • plan_is_efficient(): Uses an LLM to check for redundant or inefficient steps in the agent's execution path.
  • Tool Coverage Reporting: Calculates the percentage of a server's tools that your test suite exercises.
  • Automated Test Generation: A CLI tool that generates a baseline test suite for any MCP server.
  • Detailed Reports: Get immediate feedback from rich console reports and generate detailed JSON reports for CI/CD or further analysis.
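
To make the assertion and coverage ideas above concrete, here is a minimal, self-contained sketch in plain Python. It does not use the mcp_eval API: `RunRecord`, `ToolCall`, `fake_agent_task`, and the helper functions are hypothetical stand-ins that only borrow the assertion names from the list above, and "steps" is assumed to mean recorded tool calls. The LLM-judged assertions (objective_succeeded, plan_is_efficient) are omitted because they need a model.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One recorded tool invocation (hypothetical record, not an mcp_eval type)."""
    name: str
    arguments: dict


@dataclass
class RunRecord:
    """What one test run produced (hypothetical stand-in for collected metrics)."""
    response: str
    tool_calls: list = field(default_factory=list)
    cost_usd: float = 0.0


def contains(run: RunRecord, text: str) -> bool:
    # Substring check on the agent's final response.
    return text in run.response


def tool_was_called(run: RunRecord, name: str) -> bool:
    return any(call.name == name for call in run.tool_calls)


def tool_arguments_match(run: RunRecord, name: str, expected: dict) -> bool:
    # True if some call to `name` carries every expected key/value pair.
    return any(
        call.name == name
        and all(k in call.arguments and call.arguments[k] == v
                for k, v in expected.items())
        for call in run.tool_calls
    )


def cost_under(run: RunRecord, max_usd: float) -> bool:
    return run.cost_usd < max_usd


def number_of_steps_under(run: RunRecord, max_steps: int) -> bool:
    # Assumption: one step == one recorded tool call.
    return len(run.tool_calls) < max_steps


def tool_coverage(server_tools: list, runs: list) -> float:
    # Percentage of the server's tools exercised by at least one run.
    called = {call.name for run in runs for call in run.tool_calls}
    if not server_tools:
        return 0.0
    return 100.0 * len(called & set(server_tools)) / len(server_tools)


async def fake_agent_task() -> RunRecord:
    # Stand-in for "an agent performs a task"; returns a canned record.
    await asyncio.sleep(0)
    return RunRecord(
        response="The page title is Example Domain.",
        tool_calls=[ToolCall("fetch", {"url": "https://example.com"})],
        cost_usd=0.002,
    )


if __name__ == "__main__":
    run = asyncio.run(fake_agent_task())
    print("contains:", contains(run, "Example Domain"))
    print("coverage:", tool_coverage(["fetch", "search"], [run]))
```

In a real suite the run record would come from the framework's own metrics collection rather than a canned value; the sketch only shows what each check is comparing.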

Getting Started

1. Installation

Install the dependencies that mcp_eval relies on. The command below installs only the third-party packages, so make sure mcp-agent is also installed in your environment.

pip install "typer[all]" rich pydantic jinja2
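
After installing, a quick check like the following can confirm that the packages are importable. This is a sketch, not part of mcp_eval: the import names are assumed to match the package names in the pip command, and `mcp_agent` is an assumed import name for the mcp-agent library.

```python
from importlib.util import find_spec

# Assumed top-level import names; adjust if your environment differs.
REQUIRED_MODULES = ["typer", "rich", "pydantic", "jinja2", "mcp_agent"]


def missing_modules(names):
    """Return the names that cannot be located on the import path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```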


Source Distribution

mcp_unit-0.1.1.tar.gz (298.8 kB view details)

Uploaded Source

Built Distribution


mcp_unit-0.1.1-py3-none-any.whl (292.7 kB view details)

Uploaded Python 3

File details

Details for the file mcp_unit-0.1.1.tar.gz.

File metadata

  • Download URL: mcp_unit-0.1.1.tar.gz
  • Upload date:
  • Size: 298.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.5

File hashes

Hashes for mcp_unit-0.1.1.tar.gz
Algorithm Hash digest
SHA256 2a2b66fdf439043903d100b89ed411a056b855cd64c74b3c7cf290bed0b52fbe
MD5 199a06a5f6e689bd56b1d93ce0257fc3
BLAKE2b-256 8ab605a8ed93c990e6aa7e4110c0e34644e41205937a2f508fd3786cdf5dc46e


File details

Details for the file mcp_unit-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: mcp_unit-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 292.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.5

File hashes

Hashes for mcp_unit-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f6e9fd38754c7bdd4ef2039e16d10b5e47179f397c6f95f02fc05b2dcfb060ef
MD5 227a2c358e98eb864c37442bf22730cd
BLAKE2b-256 0fdf1396a069211df4e63f5f9684461e306dc2e60af103ec9f128c9cb969f241
