LLM-assisted qualitative survey coding with auditable Excel workbooks

Qualitative Response LLM Analyst

Tools for coding qualitative survey responses using a hybrid of human review and multi-model LLM assistance.

LLMs are accessed via OpenRouter.

The workflow supports:

  • Discovering and maintaining themes and theme groups
  • Assigning up to five themes per response (configurable)
  • Flagging candidate themes for human approval
  • Producing auditable coded datasets in Excel workbooks
  • Summarising model assignments for human-in-the-loop review

All behaviour follows the specification in the project overview.

Repository structure

.
├── docs/
│   ├── agents/          Agent and contributor guidance
│   └── spec/            Workbook schema, workflow, constraints
├── prompts/             Authoring notes for LLM prompt templates
├── src/qrla/            Python package (`qrla` CLI)
│   ├── prompts/         Bundled .txt templates loaded at runtime
│   └── templates/       Bundled canonical Excel workbook template
├── .env.example         Environment variable template
└── README.md

Survey workbooks live under data/ (gitignored). Create a new workbook from the bundled template (see below).

Requirements

Installation

From PyPI:

pip install qualitative-response-llm-analyst

For local development:

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

Configuration

Copy .env.example to .env and set your API key:

cp .env.example .env
# Edit .env — at minimum set OPENROUTER_API_KEY

Never commit .env files.
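
As a rough illustration of the configuration step, the sketch below parses simple KEY=VALUE lines and reports whether OPENROUTER_API_KEY is set. The helper names (parse_env_text, missing_keys) and the precedence of real environment variables over the file are assumptions made for this example, not part of the qrla package.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=("OPENROUTER_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    file_values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    # Assumption: real environment variables override values from the file.
    merged = {**file_values, **os.environ}
    print("missing:", missing_keys(merged) or "none")
```

Running a check like this before a long discovery or assignment run can catch a missing key early.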

Workbook model

Each question is one Excel workbook with these sheets:

Sheet            Purpose
question         Survey question text, context, and metadata
theme_groups     High-level conceptual clusters
theme_catalog    Individual themes and definitions
responses_coded  One row per response; model outputs and human finals
model_runs       Model metadata and parameters for audit

Create a workbook from the bundled template:

qrla init-template data/my-survey-Q01-themes.xlsx

Keep sheet names and column headers unchanged. Full schema details are in the workbook schema documentation under docs/spec/.
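
To make the "keep sheet names unchanged" rule concrete, here is a minimal sketch of a sheet-name check. It is not how qrla validate works internally; the function name check_sheet_names is hypothetical, and the list of names could come from any Excel reader (for example openpyxl's load_workbook(path).sheetnames, if that library is installed).

```python
# The five sheet names listed in the workbook model above.
EXPECTED_SHEETS = (
    "question",
    "theme_groups",
    "theme_catalog",
    "responses_coded",
    "model_runs",
)


def check_sheet_names(found):
    """Return (missing, unexpected) sheet names, compared case-sensitively."""
    found_set = set(found)
    missing = [name for name in EXPECTED_SHEETS if name not in found_set]
    unexpected = [name for name in found if name not in EXPECTED_SHEETS]
    return missing, unexpected


if __name__ == "__main__":
    # A renamed sheet shows up as one missing and one unexpected name.
    print(check_sheet_names(["question", "Theme Groups", "theme_catalog"]))
```

Unexpected names are not necessarily errors, since later stages add review and summary sheets; missing names are the ones worth stopping on.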

CLI commands

The qrla command implements the end-to-end workflow:

Command                  Stage  Description
qrla init-template       -      Copy the bundled canonical workbook template
qrla validate            -      Check a workbook against the template
qrla discover            1      Propose themes from response samples
qrla review-themes       1.5    Validate, extend, or retire existing themes
qrla assign              2      Assign themes to each response
qrla review-assignments  2.5    Create a tidy review sheet for filtering
qrla summarize           3      Per-model summary sheets (optional charts)
qrla export              4      Not implemented (flatten workbook to CSV/Parquet)

Quick start

# Create and validate a new workbook
qrla init-template data/my-survey-Q01-themes.xlsx
qrla validate data/my-survey-Q01-themes.xlsx

# Stage 1 — discover themes from responses
qrla discover data/my-survey-Q01-themes.xlsx \
  --question-id SURVEY_2025_Q01 \
  --model openai/gpt-5.5 \
  --max-themes 30 \
  -v

# Stage 2 — assign themes (run once per model for multi-model comparison)
qrla assign data/my-survey-Q01-themes.discovered.openai_gpt_5_5.xlsx \
  --question-id SURVEY_2025_Q01 \
  --model openai/gpt-5.4-mini \
  -v

# Stage 3 — summarise assignments for human review
qrla summarize data/my-survey-Q01-themes.discovered.openai_gpt_5_5.xlsx \
  --question-id SURVEY_2025_Q01 \
  --chart auto \
  -v

Common options:

  • --question-id: must match a row in the question sheet
  • --model: OpenRouter model id (e.g. openai/gpt-5.5, anthropic/claude-sonnet-4.6)
  • --max-themes: cap on the number of themes discovered (discover) or assigned per response (assign)
  • --context-column: optional column on the question sheet with domain-specific coding guidance
  • -v / -vv: progress stats; -vv also prints prompts and raw LLM output

See the workflow spec for the full stage-by-stage process.
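
The quick start runs assignment once per model. A small Python driver can build those command lines, as sketched below. The flags are the ones shown above; the file-naming helper is only inferred from the single example (my-survey-Q01-themes.discovered.openai_gpt_5_5.xlsx), so the actual naming rule in qrla may differ. All function names here are hypothetical.

```python
import re
from pathlib import Path


def model_slug(model_id: str) -> str:
    """Turn 'openai/gpt-5.5' into 'openai_gpt_5_5' (inferred convention)."""
    return re.sub(r"[^A-Za-z0-9]+", "_", model_id).strip("_")


def discovered_path(workbook: str, model_id: str) -> Path:
    """Guess the discover-stage output name for a workbook and model."""
    path = Path(workbook)
    return path.with_name(f"{path.stem}.discovered.{model_slug(model_id)}{path.suffix}")


def assign_commands(workbook, question_id, models):
    """Build one 'qrla assign' argument list per assignment model."""
    return [
        ["qrla", "assign", str(workbook), "--question-id", question_id, "--model", model, "-v"]
        for model in models
    ]


if __name__ == "__main__":
    workbook = discovered_path("data/my-survey-Q01-themes.xlsx", "openai/gpt-5.5")
    for cmd in assign_commands(workbook, "SURVEY_2025_Q01", ["openai/gpt-5.4-mini"]):
        # To execute for real: subprocess.run(cmd, check=True)
        print(" ".join(cmd))
```

Building argument lists rather than shell strings avoids quoting problems when paths contain spaces.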

Model guidance

Typical choices (via OpenRouter — pass the model id to --model):

  • Discovery / theme review: capable frontier models — e.g. openai/gpt-5.5, anthropic/claude-sonnet-4.6, google/gemini-3.1-pro-preview, x-ai/grok-4.3, mistralai/mistral-large-2512
  • Assignment: faster, cheaper models — e.g. openai/gpt-5.4-mini, anthropic/claude-haiku-4.5, google/gemini-3.5-flash, x-ai/grok-build-0.1, mistralai/mistral-small-2603

Run several assignment models and compare results before final human coding.
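
One way to compare assignment models is per-response set overlap. The sketch below uses Jaccard similarity on hypothetical in-memory data; it does not read the workbook, and the 0.5 threshold is an arbitrary assumption chosen for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two theme sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def agreement_by_response(run_a: dict, run_b: dict) -> dict:
    """Score each response id that appears in both model runs."""
    shared = run_a.keys() & run_b.keys()
    return {rid: jaccard(set(run_a[rid]), set(run_b[rid])) for rid in sorted(shared)}


def flag_for_review(scores: dict, threshold: float = 0.5) -> list:
    """Return response ids whose agreement falls below the threshold."""
    return [rid for rid, score in scores.items() if score < threshold]


if __name__ == "__main__":
    # Hypothetical theme assignments from two models, keyed by response id.
    model_a = {"r1": ["cost", "access"], "r2": ["trust"], "r3": []}
    model_b = {"r1": ["cost"], "r2": ["safety"], "r3": []}
    scores = agreement_by_response(model_a, model_b)
    print(scores)
    print(flag_for_review(scores))
```

Responses with low agreement are natural candidates to prioritise during human coding.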

Theme status values

Status            Meaning
candidate         Proposed by LLM; needs review
candidate-add     Suggested new theme
candidate-retire  Suggested retirement
active            Approved; used in assignment
retired           Historical; excluded from prompts
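
The status values above can be modelled as an enumeration. This is a sketch only: the class and function names are hypothetical, the catalog rows are plain dicts invented for the example, and treating only active themes as eligible for assignment is one reading of the table rather than confirmed package behaviour.

```python
from enum import Enum


class ThemeStatus(str, Enum):
    CANDIDATE = "candidate"
    CANDIDATE_ADD = "candidate-add"
    CANDIDATE_RETIRE = "candidate-retire"
    ACTIVE = "active"
    RETIRED = "retired"


# Statuses that still need a human decision.
PENDING = {ThemeStatus.CANDIDATE, ThemeStatus.CANDIDATE_ADD, ThemeStatus.CANDIDATE_RETIRE}


def themes_for_assignment(catalog):
    """Keep only approved themes; an unknown status string raises ValueError."""
    return [t for t in catalog if ThemeStatus(t["status"]) is ThemeStatus.ACTIVE]


def needs_human_review(catalog):
    """Return themes that are waiting for approval or retirement."""
    return [t for t in catalog if ThemeStatus(t["status"]) in PENDING]


if __name__ == "__main__":
    catalog = [
        {"name": "cost", "status": "active"},
        {"name": "old", "status": "retired"},
        {"name": "new", "status": "candidate-add"},
    ]
    print([t["name"] for t in themes_for_assignment(catalog)])
    print([t["name"] for t in needs_human_review(catalog)])
```

Converting strings through the enum means a typo in a status cell fails loudly instead of silently dropping a theme.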

Documentation

Contributing

  1. Fork and branch (e.g. feature/stage2-improvements)
  2. Follow PEP 8; use type hints and docstrings
  3. Keep README.md and docs/ in sync when changing behaviour
  4. Do not commit survey data or API keys
  5. Run tests locally with pytest before opening a PR (CI runs the same checks on GitHub Actions)

Releasing

Production releases use PyPI trusted publishing. TestPyPI uses the same mechanism for dry runs before the first real upload.

One-time setup

  1. Create an account at test.pypi.org (separate from production PyPI).
  2. Enable 2FA on both TestPyPI and production PyPI.
  3. Add a trusted publisher on each site (Account settings → Publishing, or project settings after the first upload):
    • TestPyPI: workflow test-release.yml, environment testpypi (optional but recommended)
    • Production PyPI: workflow release.yml, environment pypi (optional but recommended)
  4. In GitHub: Settings → Environments — create testpypi and pypi if you want approval gates before publish.

Dry run on TestPyPI

Use this before tagging a production release.

  1. Ensure the version in pyproject.toml is the one you want to test. TestPyPI rejects an upload whose filename already exists, and a deleted file's name cannot be reused, so bump the version for each new attempt.
  2. Push your branch to GitHub.
  3. Open Actions → Test Release → Run workflow and start the run.
  4. Install from TestPyPI (dependencies still come from production PyPI):
     python -m venv .venv-test
     source .venv-test/bin/activate   # Windows: .venv-test\Scripts\activate
     pip install -i https://test.pypi.org/simple/ \
       --extra-index-url https://pypi.org/simple/ \
       qualitative-response-llm-analyst
  5. Smoke-test the install:
     qrla --help
     qrla init-template /tmp/test-workbook.xlsx
     qrla validate /tmp/test-workbook.xlsx
  6. When satisfied, tag and push for production:
     git tag v0.1.0
     git push origin v0.1.0

That triggers Release (release.yml) and publishes to pypi.org.

Install from production PyPI:

pip install qualitative-response-llm-analyst

Note: TestPyPI is ephemeral — do not rely on packages staying there long term. It exists to validate packaging and install before the real release.

License

BSD 2-Clause License — see LICENSE.

Maintained by Martin Foster and collaborators.
