Prompt Specification Curve Analysis: multiverse analysis for LLM prompt design.

P-SCA: Prompt Specification Curve Analysis

License: MIT. Requires Python 3.10+.

A specification curve analysis framework for evaluating how robust LLM-simulated public opinion is to prompt design choices. P-SCA systematically varies six prompt dimensions (model, persona format, question framing, system prompt, temperature, and few-shot examples) to measure how sensitive LLM partisan-gap estimates are to arbitrary researcher decisions. The estimates are benchmarked against ANES 2024 ground-truth survey data.

Install

pip install psca

Or from source:

git clone https://github.com/YCRG-Labs/psca && cd psca
pip install -e .

Set API keys in .env:

OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
OPENROUTER_API_KEY=...
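
Before launching a long run, it can help to confirm that every provider key is actually set. The following is a minimal sketch, not part of psca: it assumes a plain KEY=VALUE .env file in the working directory, and the helper names read_env_file and missing_keys are hypothetical. How psca itself loads .env is not documented here.

```python
import os
from pathlib import Path

# Key names taken from the .env listing above.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
    "OPENROUTER_API_KEY",
]


def read_env_file(path):
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(path=".env", environ=None):
    """Return required keys that are empty or absent in both the file and the environment."""
    environ = os.environ if environ is None else environ
    file_values = read_env_file(path)
    return [k for k in REQUIRED_KEYS if not (file_values.get(k) or environ.get(k))]


if __name__ == "__main__":
    print("missing keys:", missing_keys() or "none")
```

A key counts as present if it is non-empty in either the file or the process environment, so the check also works when keys are exported in the shell.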

Quickstart

# Latin Hypercube sampling — the main run
psca lhs --n_specs 600 --output full_lhs.json

# Reproduce Gemini-excluded headline numbers
psca analyze --output full_lhs.json --exclude_models gemini-2.5-flash

# Derive empirical coverage thresholds (10k permutations)
psca threshold --output full_lhs.json --exclude_models gemini-2.5-flash --n_permutations 10000

# Permutation inference for the partisan signal
psca permutation --output full_lhs.json

# Variance decomposition + Fisher r-to-z dominance
psca analyze --output full_lhs.json
psca fisher --output full_lhs.json

# Bootstrap CIs on eta-squared (5,000 resamples)
psca bootstrap --output full_lhs.json

# ANES benchmark comparison (amplification factor)
psca anes --output full_lhs.json

# Flipped specification analysis
psca flipped --output full_lhs.json

# Saltelli sampling for Sobol sensitivity indices
psca saltelli --items gun_control --output saltelli_gun.json
psca sobol --output saltelli_gun.json

Or use the Python API:

import psca

specs = psca.generate_specifications(n_specs=600, seed=42)
df = psca.load_results("full_lhs.json", exclude_models=["gemini-2.5-flash"])
psca.variance_decomposition(df)
psca.derive_coverage_threshold(df, n_permutations=10000)
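
For readers unfamiliar with η², the sketch below shows the textbook one-way definition (between-group sum of squares divided by total sum of squares) on toy data. It is an illustration only: whether psca.variance_decomposition uses this exact estimator is an assumption, and the function name eta_squared, the row layout, and the numbers are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def eta_squared(rows, factor, outcome):
    """One-way eta-squared: between-group sum of squares over total sum of squares."""
    values = [row[outcome] for row in rows]
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variance to explain

    # Collect outcome values per level of the factor.
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor]].append(row[outcome])

    ss_between = sum(
        len(g) * (fmean(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Hypothetical toy data: one partisan gap per specification (not real P-SCA output).
rows = [
    {"model": "A", "framing": "likert", "gap": 1.0},
    {"model": "A", "framing": "forced", "gap": 1.2},
    {"model": "B", "framing": "likert", "gap": 3.0},
    {"model": "B", "framing": "forced", "gap": 3.2},
]

for factor in ("model", "framing"):
    print(factor, round(eta_squared(rows, factor, "gap"), 3))
```

In this toy example almost all of the variance in the gap is attributable to the model dimension and very little to framing, which is the kind of comparison a variance decomposition across prompt dimensions is meant to support.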

Models supported

GPT-5.4, GPT-5.4-nano, Claude Sonnet 4.6, Llama 3.3 70B, Mistral Small. Gemini 2.5 Flash is queried in the same multiverse design but excluded from primary analyses on parse-rate grounds (see paper §4.1).

Project structure

Path                    Purpose
src/psca/config.py      Six prompt dimensions, 20 battleground-state profiles, ANES items, cost tables
src/psca/sampler.py     Latin Hypercube and Saltelli specification generators
src/psca/prompts.py     Prompt construction from spec + profile + item
src/psca/runner.py      Async multi-provider API runner with retries
src/psca/analysis.py    Partisan gaps, η², bootstrap CIs, Sobol, permutation tests, ANES benchmarks, threshold derivation
src/psca/cli.py         CLI entry point (psca ...)
ordering_test.py        Position bias test for forced-choice framing
patch_run.py            Reruns failed specifications from a previous run
download_anes.py        ANES 2024 data download and processing
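
To give a feel for what a Latin Hypercube specification generator does, here is a minimal standard-library sketch. It is not the code in src/psca/sampler.py: the dimension levels, the temperature range, and the names latin_hypercube and generate_specs are hypothetical placeholders chosen for illustration.

```python
import random

# Hypothetical levels for illustration; the real ones live in src/psca/config.py.
DIMENSIONS = {
    "model": ["model_a", "model_b", "model_c"],
    "persona_format": ["narrative", "bulleted"],
    "framing": ["likert", "forced_choice"],
    "system_prompt": ["none", "neutral"],
    "few_shot": [0, 2, 4],
}
TEMPERATURE_RANGE = (0.0, 1.0)  # assumed continuous range


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One jittered point inside each of the n equal-width strata, then shuffled
        # so that strata are paired at random across dimensions.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def generate_specs(n_specs, seed=42):
    """Map each unit-cube point to one concrete prompt specification."""
    rng = random.Random(seed)
    names = list(DIMENSIONS) + ["temperature"]
    specs = []
    for point in latin_hypercube(n_specs, len(names), rng):
        spec = {}
        for name, u in zip(names, point):
            if name == "temperature":
                lo, hi = TEMPERATURE_RANGE
                spec[name] = round(lo + u * (hi - lo), 2)
            else:
                levels = DIMENSIONS[name]
                # Equal-width bins over [0, 1); the min() guards against float round-up.
                spec[name] = levels[min(int(u * len(levels)), len(levels) - 1)]
        specs.append(spec)
    return specs


if __name__ == "__main__":
    for spec in generate_specs(6):
        print(spec)
```

Because each dimension is stratified separately, every level of a categorical dimension appears equally often whenever the number of specifications is a multiple of the number of levels, which a plain random draw does not guarantee.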

Data and logs

The result files in results/*.json are the API call logs. Each record stores the model's raw text reply in the raw_response field, alongside the parsed score and the full specification metadata (model, persona, framing, system prompt, temperature, few-shot count, profile, item, repeat).
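
The sketch below shows one way such records could be summarized by hand. Only raw_response is a field name confirmed above; the other keys (model, item, party, score), the use of null for unparsed replies, and the Democrat-minus-Republican sign convention are assumptions made for illustration, so check them against an actual results file before reusing this.

```python
import json
from collections import defaultdict
from statistics import fmean

# Hypothetical records; only raw_response is a confirmed field name.
SAMPLE = """
[
  {"model": "m1", "item": "gun_control", "party": "D", "score": 6, "raw_response": "6"},
  {"model": "m1", "item": "gun_control", "party": "R", "score": 2, "raw_response": "2"},
  {"model": "m1", "item": "gun_control", "party": "D", "score": null, "raw_response": "I cannot answer."},
  {"model": "m2", "item": "gun_control", "party": "D", "score": 5, "raw_response": "5"},
  {"model": "m2", "item": "gun_control", "party": "R", "score": 4, "raw_response": "4"}
]
"""


def partisan_gaps(records, by=("model", "item")):
    """Mean Democrat score minus mean Republican score per group, skipping unparsed replies."""
    scores = defaultdict(lambda: {"D": [], "R": []})
    for rec in records:
        if rec.get("score") is None or rec.get("party") not in ("D", "R"):
            continue
        key = tuple(rec[field] for field in by)
        scores[key][rec["party"]].append(rec["score"])
    # Only report groups that have at least one parsed score for each party.
    return {
        key: fmean(s["D"]) - fmean(s["R"])
        for key, s in scores.items()
        if s["D"] and s["R"]
    }


def parse_rate(records):
    """Share of records whose reply produced a parsed score."""
    return sum(r.get("score") is not None for r in records) / len(records)


records = json.loads(SAMPLE)
print(partisan_gaps(records))
print(parse_rate(records))
```

Keeping the raw reply next to the parsed score makes it possible to audit parse failures directly, which matters when a model is excluded on parse-rate grounds.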

Citation

If you use P-SCA in academic work, please cite both the methodology paper and the software:

@software{crainic_psca_2026,
  author  = {Crainic, Jacob and Yee, Brandon and Koh, Pairie},
  title   = {{P-SCA}: Prompt Specification Curve Analysis},
  year    = {2026},
  version = {0.1.0},
  doi     = {10.5281/zenodo.PENDING},
  url     = {https://github.com/YCRG-Labs/psca}
}

See CITATION.cff for machine-readable metadata.

License

MIT — see LICENSE.


