The Tau-Trait package

Tau-Trait

Collinear AI

Tau-Trait is a benchmark for evaluating large language models (LLMs) with realistic, persona-aware simulations. It builds on Tau-Bench but introduces two key modifications:

- TraitBasis-generated personas: more accurate and interpretable user simulations.
- Domain-specific evaluation: tasks drawn from retail, airline, telecom, and telehealth settings.

Tau-Trait is designed to test model robustness, personalization, and fairness in high-impact, customer-facing domains where user traits strongly influence interaction quality.

✨ Features

Persona Simulation with TraitBasis: generate diverse, coherent user personas with different traits.
Domain Coverage: Tau-Trait includes evaluation tasks in four industries:
- 🛒 Retail
- ✈️ Airline
- 📱 Telecom
- 🩺 Telehealth

🚀 Getting Started

Installation

pip install tau-trait

Usage

from tau_trait.types import RunConfig
from tau_trait.run import run

# Placeholder: set this to the assistant model you want to evaluate.
CLIENT_ASSISTANT_MODEL_NAME = "gpt-4o"

config = RunConfig(
    model_provider="openai",
    user_model_provider="steer",
    model=CLIENT_ASSISTANT_MODEL_NAME,
    user_model="",  # the steer API abstracts the model
    num_trials=1,
    env="retail",
    agent_strategy="tool-calling",
    temperature=0.7,
    task_split="test",
    start_index=0,
    end_index=-1,
    task_ids=[4],  # run only task 4; overrides start_index/end_index
    log_dir="results",
    max_concurrency=1,
    seed=10,
    shuffle=0,
    user_strategy="llm",
    few_shot_displays_path=None,
    # Trait intensities for the simulated user; here only impatience is enabled.
    trait_dict={"impatience": 1, "confusion": 0, "skeptical": 0, "incoherence": 0},
)

# Launch the evaluation (assumes run() accepts a RunConfig, as in Tau-Bench).
run(config)
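
The `trait_dict` argument above maps each trait name to an intensity. As a minimal sketch for sweeping over traits one at a time, the helper below builds a dictionary with a single trait switched on and the rest set to 0. The helper name `single_trait_dict` is hypothetical and not part of the package; only the four trait names are taken from the example above.

```python
# Trait names taken from the RunConfig example; the helper itself is hypothetical.
TRAITS = ("impatience", "confusion", "skeptical", "incoherence")


def single_trait_dict(active: str, intensity: int = 1) -> dict[str, int]:
    """Return a trait_dict with only `active` switched on."""
    if active not in TRAITS:
        raise ValueError(f"unknown trait: {active!r}")
    return {name: (intensity if name == active else 0) for name in TRAITS}


# One dictionary per trait, each usable as RunConfig(trait_dict=...).
sweep = [single_trait_dict(name) for name in TRAITS]
print(sweep[0])
# {'impatience': 1, 'confusion': 0, 'skeptical': 0, 'incoherence': 0}
```

Each entry in `sweep` can be passed to a separate `RunConfig` to compare how an agent handles one user trait in isolation.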

The settings are defined below.

Tau-Trait Config Settings

General

--num-trials (int, default: 1)
Number of independent trials to run.
--seed (int, default: 10)
Random seed for reproducibility.
--shuffle (int, default: 0)
Whether to shuffle task order (0 = no, 1 = yes).
--log-dir (str, default: results)
Directory where logs and results are stored.

Environment & Tasks

--env (str, choices: retail, airline, default: retail)
Domain environment in which to run simulations.
--task-split (str, choices: train, test, dev, default: test)
Dataset split of tasks to run (currently applies only to the retail domain).
--start-index (int, default: 0)
Index of the first task to run.
--end-index (int, default: -1)
Index of the last task to run. Use -1 to run all remaining tasks.
--task-ids (list of int, optional)
Explicit list of task IDs to run (overrides index ranges).
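
The interaction between the index range, `--task-ids`, and `--shuffle` can be sketched as follows. This is an illustration, not the package's implementation: the function name is hypothetical, and it assumes that a non-negative end index is exclusive (as in Python's `range`) and that shuffling uses the configured seed.

```python
import random
from typing import Optional


def select_task_indices(
    num_tasks: int,
    start_index: int = 0,
    end_index: int = -1,
    task_ids: Optional[list[int]] = None,
    shuffle: int = 0,
    seed: int = 10,
) -> list[int]:
    """Resolve which task indices to run (illustrative only)."""
    if task_ids:
        # Explicit IDs override the index range.
        indices = list(task_ids)
    else:
        # Assumption: -1 means "through the last task"; otherwise the end is exclusive.
        stop = num_tasks if end_index == -1 else min(end_index, num_tasks)
        indices = list(range(start_index, stop))
    if shuffle:
        # A seeded generator keeps the shuffled order reproducible.
        random.Random(seed).shuffle(indices)
    return indices


print(select_task_indices(10, start_index=2, end_index=5))  # [2, 3, 4]
print(select_task_indices(10, task_ids=[4]))                # [4]
```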

Agent Configuration

--model (str, required)
The model to use for the agent.
--model-provider (str, choices from provider_list)
Provider for the agent’s model.
--agent-strategy (str, choices: tool-calling, act, react, few-shot, default: tool-calling)
Strategy used by the agent to interact with the environment.
- tool-calling: Invoke external tools.
- act: Pure action selection.
- react: Reason + act alternation.
- few-shot: Use few-shot exemplars.
--temperature (float, default: 0.0)
Sampling temperature for the action model (higher = more randomness).
--few-shot-displays-path (str, optional)
Path to a JSONL file containing few-shot demonstration examples.

User Simulator Configuration

--user-model (str, default: gpt-4o)
Model to use for the user simulator.
--user-model-provider (str, optional)
Provider for the user simulator’s model.
--user-strategy (str, choices from UserStrategy, default: llm)
Strategy for the simulated user (e.g., LLM-based).

Execution Controls

--max-concurrency (int, default: 1)
Number of tasks to run in parallel.
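
The flags listed above could be collected by a command-line parser along these lines. This is a sketch under stated assumptions rather than the package's actual entry point: `build_parser` is a hypothetical name, the provider and user-strategy arguments are left as free-form strings instead of being validated against `provider_list` or `UserStrategy`, and the defaults are copied from the descriptions above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(description="Run Tau-Trait tasks (sketch)")
    # General
    p.add_argument("--num-trials", type=int, default=1)
    p.add_argument("--seed", type=int, default=10)
    p.add_argument("--shuffle", type=int, default=0, choices=[0, 1])
    p.add_argument("--log-dir", type=str, default="results")
    # Environment & Tasks
    p.add_argument("--env", type=str, default="retail", choices=["retail", "airline"])
    p.add_argument("--task-split", type=str, default="test",
                   choices=["train", "test", "dev"])
    p.add_argument("--start-index", type=int, default=0)
    p.add_argument("--end-index", type=int, default=-1)
    p.add_argument("--task-ids", type=int, nargs="+", default=None)
    # Agent Configuration
    p.add_argument("--model", type=str, required=True)
    p.add_argument("--model-provider", type=str, default=None)
    p.add_argument("--agent-strategy", type=str, default="tool-calling",
                   choices=["tool-calling", "act", "react", "few-shot"])
    p.add_argument("--temperature", type=float, default=0.0)
    p.add_argument("--few-shot-displays-path", type=str, default=None)
    # User Simulator Configuration
    p.add_argument("--user-model", type=str, default="gpt-4o")
    p.add_argument("--user-model-provider", type=str, default=None)
    p.add_argument("--user-strategy", type=str, default="llm")
    # Execution Controls
    p.add_argument("--max-concurrency", type=int, default=1)
    return p


args = build_parser().parse_args(["--model", "gpt-4o", "--task-ids", "4"])
print(args.env, args.temperature, args.task_ids)  # retail 0.0 [4]
```

argparse converts dashes in flag names to underscores, so the parsed attributes (`args.num_trials`, `args.task_ids`, and so on) line up with the `RunConfig` keyword names used in the Usage example.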

@misc{curator-evals,
  author       = {Mackey, Tsach and Shafique, Muhammad Ali and Kumar, Anand},
  title        = {Curator Evals: A Benchmark for High-quality Post-training Data Curation},
  year         = {2025},
  month        = {Sep},
  howpublished = {\url{https://github.com/collinear-ai/curator-evals}}
}

Release history

- 0.1.1 (Sep 23, 2025)
- 0.1.0 (Sep 23, 2025), this version

