Anthropic client plugin for space-evals

space-evals-client-anthropic

Anthropic client plugin for space-evals. Lets you use Anthropic models (Claude Sonnet, Claude Opus, Claude Haiku, etc.) as the target, customer, or judge in your space-evals test suite.

Installation

pip install space-evals-client-anthropic

This installs the core space-evals package and the anthropic SDK as dependencies.

Setup

Set your API key:

export ANTHROPIC_API_KEY=sk-ant-...

Alternatively, store the key in a different environment variable and name that variable in the api_key_env config field (see Configuration).
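A pre-flight check for the key can be sketched in Python as follows. This is a minimal sketch only: resolve_api_key is a hypothetical helper, not a function of space-evals or this plugin, and it simply assumes the key is read from the named environment variable.

```python
import os


def resolve_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the key stored in `env_var`, or raise a clear error.

    Hypothetical helper for illustration; not part of the plugin's API.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set or is empty"
        )
    return key
```

Calling resolve_api_key("MY_ANTHROPIC_KEY") before a run fails fast with a readable message instead of an authentication error halfway through a test suite.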

Usage

Via config file

Add provider: anthropic to any client role in your space-evals.yaml:

clients:
  target:
    provider: anthropic
    model: claude-sonnet-4-20250514

  customer:
    provider: anthropic
    model: claude-haiku-4-5-20251001

  judge:
    provider: anthropic
    model: claude-sonnet-4-20250514

Then run your tests:

space-evals run ./tests/

Via programmatic API

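import asyncio
from space_evals.engine import run_tests
from space_evals.clients.base import get_client

# Build an Anthropic-backed client to use as the target.
target = get_client("anthropic", model="claude-sonnet-4-20250514")

# run_tests is awaited through asyncio.run, so it can be called from synchronous code.
result = asyncio.run(run_tests(
    path="./tests/",
    target_client=target,
))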

Configuration

Field       | Type   | Default           | Description
provider    | string | --                | Must be "anthropic"
model       | string | --                | Model ID (e.g. claude-sonnet-4-20250514, claude-opus-4-20250514, claude-haiku-4-5-20251001)
api_key_env | string | ANTHROPIC_API_KEY | Env var containing your API key
params      | object | {}                | Extra parameters passed to the Anthropic API (e.g. temperature, max_tokens)

Example with extra params

clients:
  target:
    provider: anthropic
    model: claude-sonnet-4-20250514
    api_key_env: MY_ANTHROPIC_KEY
    params:
      temperature: 0
      max_tokens: 4096
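
How the params object could combine with the plugin's documented max_tokens default of 1024 is sketched below. This is an assumption about the merge order, not the plugin's actual code: build_request_kwargs is a hypothetical name, and the sketch assumes user-supplied params override the default.

```python
from typing import Any, Optional

DEFAULT_MAX_TOKENS = 1024  # default documented by this plugin


def build_request_kwargs(
    model: str, params: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Merge user params over the defaults (hypothetical helper).

    Assumption: values from `params` win over the built-in default.
    """
    kwargs: dict[str, Any] = {"model": model, "max_tokens": DEFAULT_MAX_TOKENS}
    kwargs.update(params or {})
    return kwargs


if __name__ == "__main__":
    # Mirrors the YAML example: temperature 0, max_tokens 4096.
    print(build_request_kwargs(
        "claude-sonnet-4-20250514",
        {"temperature": 0, "max_tokens": 4096},
    ))
```

The Anthropic Messages API requires max_tokens on every request, which is why a client needs some default when the config omits it.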

What this plugin does

  • Translates space-evals's Message format to Anthropic's messages API
  • Extracts tool use blocks from responses into space-evals's ToolCall format
  • Handles system prompts via Anthropic's system parameter (used by the customer and judge roles)
  • Lazy-loads the Anthropic SDK on first use
  • Defaults max_tokens to 1024 if not specified
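
The first three points can be illustrated with a small sketch. Everything named here is an assumption made for illustration: Message and ToolCall are hypothetical stand-ins for the space-evals types (their real fields may differ), and the response blocks are shown as plain dicts in the JSON shape of the Messages API, whereas the SDK returns typed objects.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Message:
    """Hypothetical stand-in for the space-evals Message type."""
    role: str  # "system", "user", or "assistant"
    content: str


@dataclass
class ToolCall:
    """Hypothetical stand-in for the space-evals ToolCall type."""
    id: str
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def to_anthropic(
    messages: list[Message],
) -> tuple[Optional[str], list[dict[str, str]]]:
    """Split system text from the turns.

    The Messages API takes the system prompt as a top-level `system`
    parameter; the `messages` list carries only user/assistant turns.
    """
    system_parts = [m.content for m in messages if m.role == "system"]
    turns = [
        {"role": m.role, "content": m.content}
        for m in messages
        if m.role != "system"
    ]
    system = "\n\n".join(system_parts) if system_parts else None
    return system, turns


def extract_tool_calls(content_blocks: list[dict[str, Any]]) -> list[ToolCall]:
    """Collect `tool_use` blocks (id, name, input) from response content."""
    return [
        ToolCall(id=b["id"], name=b["name"], arguments=b.get("input", {}))
        for b in content_blocks
        if b.get("type") == "tool_use"
    ]
```

Keeping the translation in pure functions like these makes it testable without network access or an API key.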

Requirements

  • Python >= 3.11
  • space-evals >= 0.1.0
  • anthropic >= 0.30
  • ANTHROPIC_API_KEY environment variable (or custom via api_key_env)
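
A quick way to check an environment against this list is sketched below, using only the standard library. The helper names are hypothetical and the sketch only reports what is installed; it does not compare version numbers.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def python_ok(minimum: tuple[int, int] = (3, 11)) -> bool:
    """True when the running interpreter meets the minimum (major, minor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    for name in ("space-evals", "anthropic"):
        print(name, installed_version(name) or "not installed")
```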

License

MIT
