Skip to main content

Python SDK for the Ashr Labs API

Project description

Ashr Labs Python SDK

A Python client library for evaluating AI agents against Ashr Labs test datasets.

Documentation

Installation

pip install ashr-labs

Quick Start

from ashr_labs import AshrLabsClient, EvalRunner

# Only need your API key — base_url and tenant_id are automatic
client = AshrLabsClient(api_key="tp_your_api_key_here")

# Fetch a dataset and run your agent against it
runner = EvalRunner.from_dataset(client, dataset_id=42)
run = runner.run(my_agent)

# Submit results — grading happens server-side
created = run.deploy(client, dataset_id=42)

# Wait for grading to complete (typically 1-3 minutes)
graded = client.poll_run(created["id"])
metrics = graded["result"]["aggregate_metrics"]
print(f"Passed: {metrics['tests_passed']}/{metrics['total_tests']}")

Your agent just needs two methods:

class MyAgent:
    def respond(self, message: str) -> dict:
        # Call your LLM, return {"text": "...", "tool_calls": [...]}
        return {"text": "response", "tool_calls": []}

    def reset(self) -> None:
        # Clear conversation history between scenarios
        pass

See Testing Your Agent for a full end-to-end guide.

Available Methods

All methods that accept tenant_id auto-resolve it from your API key if omitted.

Datasets

Method Description
get_dataset(dataset_id, ...) Get a dataset by ID
list_datasets(limit, offset, ...) List datasets

Runs

Method Description
create_run(dataset_id, result, ...) Create a new test run
get_run(run_id) Get a run by ID
list_runs(dataset_id, limit, offset) List runs
delete_run(run_id) Delete a run
poll_run(run_id, timeout, poll_interval) Wait for server-side grading to complete

EvalRunner

Method Description
EvalRunner.from_dataset(client, dataset_id) Create a runner from a dataset
runner.run(agent, max_workers=1, on_environment=...) Run agent against all scenarios, return RunBuilder
runner.run_and_deploy(agent, client, dataset_id, max_workers=1) Run and submit in one call

RunBuilder

Method Description
RunBuilder() Create a new run builder
run.start() Mark the run as started
run.add_test(test_id) Add a test and get a TestBuilder
run.complete(status) Mark the run as completed
run.build() Serialize to a result dict
run.deploy(client, dataset_id) Build and submit via the API

TestBuilder

Method Description
test.start() Mark the test as started
test.add_user_file(file_path, description) Record a user file upload
test.add_user_text(text, description) Record a user text input
test.add_tool_call(expected, actual, match_status) Record an agent tool call
test.add_agent_response(expected_response, actual_response, match_status) Record an agent response
test.complete(status) Mark the test as completed

Requests

Method Description
create_request(request_name, request, ...) Create a new request
get_request(request_id) Get a request by ID
list_requests(status, limit, offset) List requests

API Keys & Session

Method Description
init() Validate credentials and get user/tenant info
list_api_keys(include_inactive) List API keys for your tenant
revoke_api_key(api_key_id) Revoke an API key
health_check() Check if the API is reachable

Error Handling

from ashr_labs import AshrLabsClient, NotFoundError, AuthenticationError

client = AshrLabsClient(api_key="tp_...")

try:
    dataset = client.get_dataset(dataset_id=999)
except AuthenticationError:
    print("Invalid API key")
except NotFoundError:
    print("Dataset not found")

Configuration

# All defaults — just pass API key
client = AshrLabsClient(api_key="tp_...")

# From environment (reads ASHR_LABS_API_KEY)
client = AshrLabsClient.from_env()

# Custom timeout
client = AshrLabsClient(api_key="tp_...", timeout=60)

# Custom base URL (for self-hosted)
client = AshrLabsClient(api_key="tp_...", base_url="https://your-api.example.com")

Requirements

  • Python 3.10+
  • No external dependencies (uses only standard library)

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ashr_labs-0.1.5.tar.gz (43.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ashr_labs-0.1.5-py3-none-any.whl (40.0 kB view details)

Uploaded Python 3

File details

Details for the file ashr_labs-0.1.5.tar.gz.

File metadata

  • Download URL: ashr_labs-0.1.5.tar.gz
  • Upload date:
  • Size: 43.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for ashr_labs-0.1.5.tar.gz
Algorithm Hash digest
SHA256 b0db8957f47e0167619cb8912e3298237a8ce14eb1dc37f60ce6a0bc808efc65
MD5 2f0e9600fd87a853012b01bda8b1a83e
BLAKE2b-256 5011fd89a91636a6116b626dc3d9b76d225fa17265f67e41a275bf4c0afec06b

See more details on using hashes here.

File details

Details for the file ashr_labs-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: ashr_labs-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 40.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for ashr_labs-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 a28c1a619264c20f8ea157a816ba286ff2539afa8b0adfe2f69d1e38858cd91d
MD5 254234d0baf97b829775204089f70130
BLAKE2b-256 3ec681e9f3c38efbd76bb12f071d9ee1a5c150bfb3ed61164f6e9dc245c8e9a1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page