Static review of online survey instruments for resistance to AI/bot respondents

🛡️ Survey Shield

Static review of online survey instruments for resistance to AI/bot respondents, plus an optional live runtime that drives a real browser through your survey.

What is Survey Shield?

Survey Shield gives researchers feedback on whether their survey instrument is hardened against AI respondents. Two paths:

  1. Instrument Review (primary, static, no browser): point it at a Qualtrics .qsf export. Multi-agent LLM reviewers fan out across the 8-category bot-resistance rubric (Logic, Visual Reasoning, Traps, Open Ends, Mouse and Keyboard Input, Behavioral, Context Awareness, ECLAIR; sourced from the Polarization Research Lab's Daneel framework), produce a peer-review-style verdict with verbatim-grounded findings, and render a self-contained HTML report with a copy-paste Methods statement and APA/BibTeX citation.
  2. Take Survey (live runtime, optional [live] extra): drives a real browser (browser-use) through a live Qualtrics URL and reports detected mechanisms after the fact. Each run costs ~5–10 minutes and real LLM credit, so the hosted demo doesn't expose it; install the [live] extra to run it locally.

Researchers using Survey Shield mostly want Instrument Review. Reach for the live runtime when you need to exercise the survey end-to-end.

Install

pip install surveyshield-py            # static review only: small, no browser
pip install "surveyshield-py[live]"    # adds browser-use + Playwright
playwright install chromium            # only if you installed [live]

The PyPI name is surveyshield-py; the import name is surveyshield.

Set an LLM provider key in your environment (or in a .env file in the working directory; Survey Shield loads it via python-dotenv):

OPENAI_API_KEY=sk-...        # used unless the model name starts with "gemini"
GOOGLE_API_KEY=...           # used for Gemini models

CLI

surveyshield review your_survey.qsf
# → writes your_survey.report.html next to the input

surveyshield review your_survey.qsf --output report.html --json review.json \
                                    --model gpt-4o-mini

# requires the [live] extra
surveyshield take https://qualtrics.com/jfe/form/SV_xxx \
                  --model gemini-3-flash-preview --max-steps 150

surveyshield serve --host 127.0.0.1 --port 8000
# → boots the FastAPI app + bundled React SPA

surveyshield --help lists every command and flag.

Python API

import asyncio
import surveyshield

review, _parsed = asyncio.run(
    surveyshield.review_qsf(
        "your_survey.qsf",
        model="gpt-4o-mini",
        # api_key="sk-...",      # or rely on env vars
        # categories=["content-traps", "eclaire"],   # default = all 8
    )
)

print(review.overall_score, review.overall_feedback.headline)

with open("report.html", "w", encoding="utf-8") as f:
    f.write(surveyshield.render_html(review))

The review object is a surveyshield.InstrumentReview Pydantic model. Power users can compose the lower-level seams directly: parse_qsf, run_review, drop_unverified_quotes, consolidate_and_summarize, aggregate. See surveyshield/__init__.py for the public surface.

For live runtime:

import asyncio, surveyshield  # surveyshield-py[live] installed

result = asyncio.run(surveyshield.take_survey(
    "https://qualtrics.com/jfe/form/SV_xxx",
    model="gemini-3-flash-preview",
    max_steps=150,
))
print(result.defense_score, result.bot_completion_likelihood)
print(result.overall_feedback.headline)
for c in result.categories:
    print(f"{c.category.value}: {c.score:.0f}/100  ({len(c.findings)} findings)")

If the [live] extra isn't installed, calling surveyshield.take_survey(...) raises ImportError and the CLI's take command exits with a clear install hint.

Self-host the hosted UI

git clone https://github.com/kiante-fernandez/survey-shield
cd survey-shield
./setup.sh                              # creates .conda env + writes .env stub
echo "OPENAI_API_KEY=sk-..." >> .env    # or GOOGLE_API_KEY
surveyshield serve --reload             # → http://localhost:8000

The Take Survey tab is gated on live_take_enabled: GET /api/v1/survey/config flips it to true once a key is detected in the env.
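
One way to picture that gate is a check for either provider key in the environment. This is a minimal sketch under assumptions: the function name and the rule that a blank value does not count are illustrative, and the real backend may apply further conditions (for example, whether the [live] extra is installed).

```python
import os
from typing import Mapping

# The two provider keys named in the Install section.
KEY_VARS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def live_take_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when at least one provider key is set to a non-blank value."""
    return any(env.get(name, "").strip() for name in KEY_VARS)


print(live_take_enabled({"OPENAI_API_KEY": "sk-test"}))
print(live_take_enabled({}))
```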

Endpoints

Models

The hosted UI does not expose a model picker; Instrument Review reviewers run on a default model (gpt-4o-mini). Self-hosters who want a different model can pass model_name to the API or --model to the CLI. The backend has no allowlist; any model name that langchain-openai's ChatOpenAI or langchain-google-genai's ChatGoogleGenerativeAI accepts will be routed by prefix:

  • Names starting with gemini → Google (requires GOOGLE_API_KEY)
  • Everything else → OpenAI (requires OPENAI_API_KEY)
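
The prefix rule above can be sketched in a few lines. provider_for is a hypothetical helper, not part of the package, and the case-sensitive match is an assumption; the backend may normalize names differently.

```python
def provider_for(model_name: str) -> tuple[str, str]:
    """Return (provider, required env var) for a model name, by prefix."""
    if model_name.startswith("gemini"):
        return ("google", "GOOGLE_API_KEY")
    # No allowlist: anything that is not a gemini-prefixed name goes to OpenAI.
    return ("openai", "OPENAI_API_KEY")


for name in ("gemini-3-flash-preview", "gpt-4o-mini"):
    print(name, "->", provider_for(name))
```

Because there is no allowlist, a mistyped model name is still routed to OpenAI and will only fail once the provider rejects it.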

API usage (self-host)

Instrument Review (primary)

# Submit a QSF for review
curl -F "file=@your_survey.qsf" \
     http://localhost:8000/api/v1/instrument/review
# → {"review_id": "<uuid>", "status": "queued", ...}

# Poll
curl http://localhost:8000/api/v1/instrument/status/<uuid>
# queued → running → completed (~30–90 s)

# Structured JSON
curl http://localhost:8000/api/v1/instrument/results/<uuid>

# Human-readable HTML report
curl "http://localhost:8000/api/v1/instrument/report/<uuid>"

# Download as a file
curl -OJ "http://localhost:8000/api/v1/instrument/report/<uuid>?download=1"
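
The poll step can also be scripted. The sketch below uses only the standard library and assumes that the status endpoint returns JSON with a "status" field, as the submit response does; wait_for_review, its parameters, and the treatment of any status other than queued/running/completed as an error are illustrative choices, not documented behavior.

```python
import json
import time
import urllib.request
from typing import Callable

BASE = "http://localhost:8000/api/v1/instrument"


def fetch_status(review_id: str) -> dict:
    """GET the status endpoint and decode its JSON body."""
    with urllib.request.urlopen(f"{BASE}/status/{review_id}") as resp:
        return json.load(resp)


def wait_for_review(
    review_id: str,
    fetch: Callable[[str], dict] = fetch_status,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the review reports 'completed'; raise on timeout or unknown status."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(review_id)
        status = payload.get("status")
        if status == "completed":
            return payload
        if status not in ("queued", "running"):
            # Assumption: anything outside the documented states is a failure.
            raise RuntimeError(f"unexpected status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"review {review_id} not finished after {timeout}s")
        sleep(interval)
```

With a server running, wait_for_review("<uuid>") blocks until the review finishes, after which the results and report endpoints can be fetched. The fetch and sleep parameters exist so the loop can be exercised without a network.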

Live runtime (self-host only)

curl -X POST "http://localhost:8000/api/v1/survey/analyze" \
     -H "Content-Type: application/json" \
     -d '{
       "survey_url": "https://example.com/survey",
       "model_name": "gpt-4o-mini",
       "max_steps": 150,
       "use_vision": true
     }'
# Then poll /api/v1/survey/status/<id> and fetch /api/v1/survey/results/<id>.

What Survey Shield evaluates

The 8-category rubric

Both tools share the same eight bot-resistance categories, sourced from the Polarization Research Lab's Daneel framework and informed by Westwood (2025, PNAS) and Affonso (2026, JCR). The shared registry lives at surveyshield/categories.py; each tool owns its own prompt templates that ask a different question of the same evidence:

  • Instrument Review asks "is this category's defense implemented in the QSF?" The signal is question text plus per-question embedded_html (where authors wire keystroke-tracking JS, reCAPTCHA widgets, weather-API checks).
  • Live runtime asks "did this category's defense work during the actual browser-use run?" The signal is the agent's structured per-step transcript.

Category                   What it tests
Logic                      CRT items, Sally-Anne theory-of-mind, syllogisms, impossible-event probes
Visual Reasoning           Image-based illusions, counting elements, perspective tasks, layout tricks
Traps                      Explicit IMCs, human-attestation oaths, invisible-text instructions, honeypots
Open Ends                  Knowledge-gap probes ("first paragraph of the Constitution"), reverse-shibboleths
Mouse and Keyboard Input   Map clicks, drag-and-drop, keystroke-timing tracking, click patterns
Behavioral                 reCAPTCHA v3, IAT latencies, total-survey-time gating, mouse trajectory
Context Awareness          "Is it raining where you are?" with weather/zipcode/time verification
ECLAIR                     Refusal probes: questions safety-tuned LLMs refuse but humans answer freely

Adding a new category is one entry in categories.py plus matching prompt templates in review/dimensions.py and live/category_prompts.py.
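
The "one registry, two prompt tables" arrangement can be illustrated with a small consistency check. Everything here is hypothetical: the slugs, the prompt strings, and missing_prompts are stand-ins and do not reflect the actual contents of categories.py, review/dimensions.py, or live/category_prompts.py.

```python
from typing import Mapping, Sequence

# Hypothetical slugs and prompt text, for illustration only.
CATEGORIES = ["logic", "traps"]
STATIC_PROMPTS = {
    "logic": "Is a logic defense implemented in the QSF?",
    "traps": "Is a trap defense implemented in the QSF?",
}
LIVE_PROMPTS = {
    "logic": "Did a logic defense work during the run?",
    "traps": "Did a trap defense work during the run?",
}


def missing_prompts(
    categories: Sequence[str],
    tables: Mapping[str, Mapping[str, str]],
) -> dict[str, list[str]]:
    """Map each prompt table name to the categories it has no template for."""
    gaps: dict[str, list[str]] = {}
    for table_name, table in tables.items():
        missing = [c for c in categories if c not in table]
        if missing:
            gaps[table_name] = missing
    return gaps


tables = {"static": STATIC_PROMPTS, "live": LIVE_PROMPTS}
print(missing_prompts(CATEGORIES, tables))   # {}

# Register a new category but forget its live prompt.
CATEGORIES.append("audio")
STATIC_PROMPTS["audio"] = "Is an audio defense implemented in the QSF?"
print(missing_prompts(CATEGORIES, tables))   # {'live': ['audio']}
```

A check of this kind, run in tests, would catch a category that was registered without both matching templates.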

Findings are grounded: every reviewer finding cites a verbatim excerpt (a question text or choice for static reviews, a step-N quote for live runs). Excerpts that can't be located in the source are dropped before consolidation.

The product is scoped strictly to bot resistance. We do not critique a survey's substantive research design, theoretical framing, or question wording; those remain the researcher's domain.
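
The grounding filter can be sketched as a substring check. This is not the package's drop_unverified_quotes; keep_grounded, the dict-shaped findings, and the whitespace-collapsing comparison are assumptions made for illustration.

```python
import re


def _collapse_ws(text: str) -> str:
    """Collapse runs of whitespace so line breaks do not defeat matching."""
    return re.sub(r"\s+", " ", text).strip()


def keep_grounded(findings: list[dict], source_text: str) -> list[dict]:
    """Keep only findings whose 'quote' appears verbatim in the source text."""
    haystack = _collapse_ws(source_text)
    kept = []
    for finding in findings:
        quote = _collapse_ws(finding.get("quote", ""))
        if quote and quote in haystack:
            kept.append(finding)
    return kept


source = "Q1. Is it raining\nwhere you are?  Q2. Please select 'Strongly agree'."
findings = [
    {"category": "context", "quote": "Is it raining where you are?"},
    {"category": "traps", "quote": "Tick the hidden box"},  # not in the source
]
print(keep_grounded(findings, source))
```

Only the first finding survives here, because the second quote cannot be located in the source. The comparison stays case-sensitive so that "verbatim" is interpreted strictly apart from whitespace.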

Development

Project structure

survey-shield/
├── surveyshield/                # the importable package
│   ├── __init__.py              # public API (review_qsf, render_html, take_survey, …)
│   ├── cli.py                   # Typer CLI (review / take / serve)
│   ├── models/                  # Pydantic schemas
│   ├── categories.py            # shared 8-category registry (both tools)
│   ├── review/                  # static review pipeline
│   │   ├── parser.py            #   QSF → ParsedSurvey
│   │   ├── dimensions.py        #   per-category static prompts
│   │   ├── reviewer.py          #   fan-out, verify, consolidate, aggregate
│   │   └── templates/           #   Jinja2 self-contained HTML report
│   ├── live/                    # browser-use runtime ([live] extra)
│   │   ├── analyzer.py          #   SurveyAnalyzer / take_survey
│   │   ├── evidence.py          #   AgentHistoryList → LiveRunEvidence
│   │   ├── reviewer.py          #   per-category fan-out, aggregate, render_live_html
│   │   ├── category_prompts.py  #   per-category live prompts
│   │   ├── prompts.py
│   │   └── patches.py
│   ├── templates/               # shared CSS + Jinja2 macros (both reports)
│   └── serve/                   # FastAPI app + bundled React SPA
│       ├── app.py
│       ├── _jobs.py             # shared status/results/report endpoint factory
│       ├── config.py
│       ├── api/{survey,instrument}.py
│       └── static/              # built React (populated by bin/build.sh)
├── frontend/                    # React/TypeScript source (CRA)
├── tests/                       # pytest suite + tiny QSF fixture
├── pyproject.toml               # canonical package metadata
├── Procfile                     # web: gunicorn surveyshield.serve.app:app
├── bin/build.sh                 # React build → surveyshield/serve/static/
└── .github/workflows/           # test.yml + release.yml (PyPI trusted publishing)

Local dev

./setup.sh                                   # one-time: conda env at ../.conda
pip install -e ".[dev,live]"                 # editable install + tests + browser-use
pytest -q                                    # ~60 tests, no LLM calls
cd frontend && npx tsc --noEmit && npm run build

Releasing

git tag v0.2.0 && git push --tags
# .github/workflows/release.yml builds the wheel + sdist (with the React SPA
# bundled into surveyshield/serve/static/) and publishes to PyPI via OIDC.

The PyPI project must be configured with this repo + release.yml as a Trusted Publisher before the first push.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit a pull request

License

MIT License; see LICENSE.

Citation

If you use Survey Shield in published work:

@misc{fernandez2026surveyshield,
  author = {Fernandez, K. and Low, A. and Bogard, J. and Fox, C. R.},
  title  = {Survey Shield: Static review of online survey instruments for resistance to non-human responses},
  year   = {2026},
  note   = {Manuscript in preparation},
}

Every report includes the same citation pre-formatted (APA + BibTeX).

Disclaimer

Survey Shield is intended for research and testing purposes.
