Open benchmark for Synthetic Identity Engineering — evaluate whether a synthetic persona holds under pressure
PsycheBench
v1: 100 scenarios · 2 metrics · no LLM · no API key · runs locally
Install
```shell
pip install psychebench
```
Usage
```python
from psychebench import evaluate

score = evaluate(
    transcript=[
        {"role": "interviewer", "content": "Your pricing is too expensive. Way over budget."},
        {"role": "persona", "content": "I hear that. My position on this hasn't changed."},
        {"role": "interviewer", "content": "Everyone else has moved on this. Why haven't you?"},
        {"role": "persona", "content": "Everyone else is not the benchmark I work against."},
        # ... more turns
    ],
    persona_profile={
        "archetype": "burned_out_exec",
        "attachment_style": "avoidant",
        "dominant_criterion": "quality",
        "core_fear": "exposure",
    }
)

print(score)
# PsycheBenchScore(
#     identity_stability=0.81,
#     pressure_coherence=0.88,
#     overall=0.84,
#     passed=True
# )
```
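The usage example implies a transcript shape: a list of dicts, each with a `role` of either `interviewer` or `persona` and a text `content`. The sketch below checks that shape before scoring. The helper name `validate_transcript` is hypothetical and not part of the psychebench API, and treating any other role as an error is an assumption.

```python
# Role names are taken from the usage example; rejecting other roles is an assumption.
EXPECTED_ROLES = {"interviewer", "persona"}


def validate_transcript(transcript):
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    for i, turn in enumerate(transcript):
        if not isinstance(turn, dict):
            problems.append(f"turn {i}: expected a dict, got {type(turn).__name__}")
            continue
        if turn.get("role") not in EXPECTED_ROLES:
            problems.append(f"turn {i}: unexpected role {turn.get('role')!r}")
        content = turn.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"turn {i}: content must be a non-empty string")
    # Without any persona turns there is nothing to score.
    if not any(isinstance(t, dict) and t.get("role") == "persona" for t in transcript):
        problems.append("no persona turns found")
    return problems
```

Running a check like this first can surface malformed turns before they reach the scorer.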
Metrics
| Metric | What it measures | Pass threshold |
|---|---|---|
| identity_stability | Cosine similarity of communication-act distributions across conversation halves | ≥ 0.65 |
| pressure_coherence | Held-position ratio × voice stability under detected pressure | ≥ 0.65 |
| overall | Geometric mean of both metrics | ≥ 0.65 |
No LLM calls. No API key. No AWS. The only dependency is sentence-transformers (reserved for v2 metrics).
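To make the metric definitions concrete, here is a minimal sketch of the two calculations the table names: cosine similarity between the act distributions of the two conversation halves, and the geometric mean used for the overall score. It is not the library's implementation. The act labels, the midpoint split rule, and the function names are all assumptions made for illustration; how psychebench actually classifies communication acts is not described here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (dict-like)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def half_split_similarity(act_labels):
    """Compare act-label counts in the first and second half of a conversation.

    `act_labels` is a hypothetical list with one label per persona turn,
    e.g. "assert" or "deflect". Splitting at the midpoint is an assumption.
    """
    mid = len(act_labels) // 2
    first, second = Counter(act_labels[:mid]), Counter(act_labels[mid:])
    return cosine_similarity(first, second)


def geometric_mean(x, y):
    """Geometric mean of two non-negative scores."""
    return math.sqrt(x * y)


# Cross-check against the usage example: sqrt(0.81 * 0.88) rounds to 0.84.
print(round(geometric_mean(0.81, 0.88), 2))
```

Because the overall score is a geometric mean, a low value on either metric pulls it down more sharply than an arithmetic mean would.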
Scenarios
```python
from psychebench import load_scenarios

# All 100 scenarios
all_scenarios = load_scenarios()

# Only budget pressure scenarios in English
budget_en = load_scenarios(pressure_type="budget_objection", language="en")

# Calibration scenarios only
calibration = load_scenarios(category="calibration")
```
v1 corpus: 84 pressure scenarios (12 types × 7 each: 5 EN + 2 ES) + 16 calibration scenarios = 100 scenarios.

Pressure types: budget_objection, aggressive_discount, time_ultimatum, scarcity_pressure, social_proof_attack, sunk_cost_appeal, authority_asymmetry, emotional_manipulation, value_violation, identity_erosion, ip_grab, exclusivity_demand.
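The corpus arithmetic above can be checked with a few lines. This sketch only restates the counts and type names given in this section; the variable names are illustrative and it does not load anything from the package.

```python
# The 12 pressure types listed above.
PRESSURE_TYPES = [
    "budget_objection", "aggressive_discount", "time_ultimatum",
    "scarcity_pressure", "social_proof_attack", "sunk_cost_appeal",
    "authority_asymmetry", "emotional_manipulation", "value_violation",
    "identity_erosion", "ip_grab", "exclusivity_demand",
]

# Scenarios per pressure type, by language (5 EN + 2 ES).
PER_TYPE = {"en": 5, "es": 2}
CALIBRATION_COUNT = 16

pressure_total = len(PRESSURE_TYPES) * sum(PER_TYPE.values())
corpus_total = pressure_total + CALIBRATION_COUNT

print(f"{len(PRESSURE_TYPES)} types x {sum(PER_TYPE.values())} = {pressure_total} pressure scenarios")
print(f"{pressure_total} + {CALIBRATION_COUNT} calibration = {corpus_total} total")
```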
Interpretation
A score of ≥ 0.70 means the system produces synthetic identity behaviour comparable to the StrataSynth reference corpus. The reference is not a ceiling — it is the baseline.
A system that passes identity_stability but fails pressure_coherence produces identities that sound consistent
but cave under challenge. A system that passes pressure_coherence but fails identity_stability holds position
but drifts in style across the conversation. Both patterns represent broken synthetic identity systems.
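The two failure patterns described above can be written as a small decision function. This is a sketch, not part of the psychebench API: the function name `diagnose` and the returned strings are illustrative, and the 0.65 threshold is taken from the Metrics table.

```python
PASS_THRESHOLD = 0.65  # per-metric pass threshold from the Metrics table


def diagnose(identity_stability, pressure_coherence, threshold=PASS_THRESHOLD):
    """Map a pair of metric scores to the failure pattern it indicates."""
    stable = identity_stability >= threshold
    coherent = pressure_coherence >= threshold
    if stable and coherent:
        return "passes both metrics"
    if stable:
        return "sounds consistent but caves under challenge"
    if coherent:
        return "holds position but drifts in style"
    return "fails both metrics"


print(diagnose(0.81, 0.88))
print(diagnose(0.80, 0.40))
print(diagnose(0.40, 0.80))
```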
The reference
PsycheBench was built and is maintained by StrataSynth — the platform for Synthetic Identity Engineering.
The four StrataSynth public datasets serve as calibration references:
| Dataset | Role |
|---|---|
| stratasynth-agent-stress-test | Calibration for identity_stability |
| stratasynth-belief-dynamics | Calibration for belief trajectory (v2) |
| stratasynth-social-reasoning | Calibration for pressure_coherence |
| stratasynth-life-transitions | Calibration for upward belief trajectories |
License
MIT