Critical-thinking and analytics for human-AI conversations — a member of the lens analyser family.

conversation-analyser

It scores a single conversation on two tiers:

  1. Analytics (always on, offline): turn/word counts, prompt/response lengths, question ratio, pushback hits, readability, sentiment trajectory, prompt self-similarity, and temporal metrics when timestamps are present.
  2. Critical thinking (opt-in, needs an LLM): classifies every human turn under a 7-label prompt taxonomy, derives engagement ratios, an engagement band, and a composite 0–100 critical-thinking score with a component breakdown.

The taxonomy reuses the validated NQ/FU/CH/EX/DG/AC/MT scheme from the ISYS6020 marking pipeline (copied and forked). Design: docs/superpowers/specs/2026-05-23-conversation-analyser-design.md.

Install

pip install -e .                       # core: analytics + CLI + HTTP API
pip install -e '.[embeddings]'         # + prompt self-similarity (sentence-transformers)
pip install -e '.[llm]'                # + taxonomy/CT tier (anthropic)
pip install -e '.[embeddings,llm,dev]' # everything
export ANTHROPIC_API_KEY=...           # required for the critical-thinking tier

CLI

Pass a bare positional path to analyse a file (human-readable summary by default, --json for machine-readable output); use the serve subcommand to run the HTTP API. The grammar is the same as the rest of the family.

conversation-analyser transcript.txt              # human summary, analytics only
conversation-analyser chat.json --json            # full JSON to stdout
conversation-analyser chat.json --llm             # add the critical-thinking tier
conversation-analyser log.json --idle-gap 45      # split sub-sessions on 45-min gaps
conversation-analyser raw.txt --parse-mode llm-segment --llm
conversation-analyser serve --port 8009           # run the HTTP API

The critical-thinking tier is opt-in (--llm) to avoid surprise API costs; without it you get the analytics tier only.
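The --idle-gap option shown above splits a conversation into sub-sessions wherever consecutive turns are separated by a long pause. A minimal sketch of that idea follows; it is not the package's own code. The function name split_on_idle_gaps, the assumption that timestamps arrive sorted, and the choice to split only when the gap is strictly greater than the threshold are all illustrative.

```python
from datetime import datetime, timedelta


def split_on_idle_gaps(timestamps, idle_gap_minutes=45):
    """Group sorted timestamps into sub-sessions.

    Hypothetical helper: a new sub-session starts whenever the pause
    between two consecutive turns exceeds idle_gap_minutes.
    """
    if not timestamps:
        return []
    gap = timedelta(minutes=idle_gap_minutes)
    sessions = [[timestamps[0]]]
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > gap:
            sessions.append([curr])    # pause too long: open a new sub-session
        else:
            sessions[-1].append(curr)  # still within the current sub-session
    return sessions


# Illustrative data: two turns in the morning, an 80-minute pause, two more turns.
turns = [
    datetime(2026, 1, 1, 9, 0),
    datetime(2026, 1, 1, 9, 10),
    datetime(2026, 1, 1, 10, 30),
    datetime(2026, 1, 1, 10, 35),
]
sessions = split_on_idle_gaps(turns, idle_gap_minutes=45)
print([len(s) for s in sessions])
```

With a 45-minute threshold the 80-minute pause separates the four turns into two sub-sessions of two turns each.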

HTTP API

conversation-analyser serve --port 8009
curl -F file=@chat.json 'http://127.0.0.1:8009/analyse'        # analytics only
curl -F file=@chat.json -F llm=true 'http://127.0.0.1:8009/analyse'
curl http://127.0.0.1:8009/health

The API exposes GET /health and POST /analyse (multipart file upload with an optional llm form field). This is the same /analyse contract that auto-analyser routes to.

Python API

from conversation_analyser import ConversationAnalyser

result = ConversationAnalyser().analyse("transcript.txt", llm=True)
print(result.model_dump_json(indent=2))

Input formats

A pluggable adapter registry tries, in order: structured adapters → heuristic speaker markers → optional LLM segmentation → unsegmented fallback.

  • role/content message list (OpenAI/Anthropic): [{"role": "user", "content": "..."}, ...]
  • AnythingLLM rows: [{"prompt": "...", "response": "...", "createdAt": ...}, ...]
  • flat text with speaker markers: User: / Assistant: / Me: / ChatGPT: / You said: / ChatGPT said: / Prompt: / Response:
  • anything else → LLM-segment (needs [llm]), else a single-blob fallback
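The detection order above can be sketched as a small function that tries the structured shapes first and the speaker-marker heuristic second. This is an illustration only: detect_format, the returned strings, and the regular expression are assumptions, not the package's adapter registry.

```python
import json
import re

# Speaker markers listed above, matched at the start of a line.
SPEAKER_RE = re.compile(
    r"^(User|Assistant|Me|ChatGPT|You said|ChatGPT said|Prompt|Response):",
    re.MULTILINE,
)


def detect_format(raw: str) -> str:
    """Hypothetical detector mirroring the documented order of attempts."""
    # 1. Structured adapters: a JSON list of dicts with known keys.
    try:
        data = json.loads(raw)
    except ValueError:
        data = None
    if isinstance(data, list) and data and all(isinstance(r, dict) for r in data):
        if all("role" in r and "content" in r for r in data):
            return "role_content"
        if all("prompt" in r and "response" in r for r in data):
            return "anythingllm"
    # 2. Heuristic speaker markers in flat text.
    if SPEAKER_RE.search(raw):
        return "speaker_markers"
    # 3./4. Anything else is left to LLM segmentation or the single-blob fallback.
    return "llm_segment_or_fallback"


print(detect_format('[{"role": "user", "content": "hi"}]'))
print(detect_format("User: hello\nAssistant: hi there"))
```

Putting the cheap, unambiguous checks first means the LLM path is only reached for input that no structured or heuristic adapter recognises.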

.pdf/.docx inputs are text-extracted first (needs pdfplumber/markitdown, or pre-extract with document-analyser).

Long unstructured transcripts taking the LLM-segment path are split on paragraph boundaries into chunks (SEGMENT_CHUNK_CHARS, default 36 000) and classified chunk-by-chunk, so the whole transcript is labelled — not just its opening — and the band/ratios/score reflect all of it. The number of chunks (= LLM calls) is guarded by SEGMENT_MAX_CHUNKS (default 12); raise or lift it per run, e.g. CONVERSATION_ANALYSER_SEGMENT_MAX_CHUNKS=0 for unlimited. A capped run says so in notes rather than silently dropping the tail. Cleanly-labelled transcripts take the heuristic path and are never chunked.
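The chunking behaviour described above can be sketched as follows. The function chunk_paragraphs and its return shape are hypothetical; only the defaults (36 000 characters, 12 chunks, 0 meaning unlimited) come from the text. The sketch also assumes paragraphs are separated by blank lines and leaves an oversized single paragraph as its own chunk.

```python
def chunk_paragraphs(text, chunk_chars=36_000, max_chunks=12):
    """Pack paragraphs into chunks of at most chunk_chars characters.

    Returns (chunks, capped). capped is True when the max_chunks guard
    dropped trailing chunks; max_chunks=0 disables the guard.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > chunk_chars:
            chunks.append(current)  # close the chunk at a paragraph boundary
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    capped = max_chunks > 0 and len(chunks) > max_chunks
    if capped:
        chunks = chunks[:max_chunks]
    return chunks, capped


# Small illustrative input: five 10-character paragraphs, tiny chunk size.
sample = "\n\n".join(["a" * 10] * 5)
chunks, capped = chunk_paragraphs(sample, chunk_chars=25, max_chunks=2)
print(len(chunks), capped)
```

Returning the capped flag alongside the chunks lets the caller record a note when the tail was not processed, rather than dropping it silently.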

The taxonomy

Code | Meaning
NQ   | New Query: opens a new topic
FU   | Follow-up: clarification/elaboration
CH   | Challenge: pushes back, tests, asks why
EX   | Extension: applies/compares/synthesises in a new direction
DG   | Delegation: task hand-off, no engagement
AC   | Acknowledgement: thanks/confirmation
MT   | Meta: about the conversation itself

critical_thinking = (CH+EX)/turns, delegation = DG/turns, filler = (AC+MT)/turns. Bands: One-Shot · Delegator · Directed · Iterative · Critical.
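As a worked example of the three ratios, the sketch below computes them from a list of per-turn labels. The function name engagement_ratios and the zero-turn handling are assumptions. The band thresholds and the weights of the composite 0–100 score are not given here, so they are not sketched.

```python
from collections import Counter


def engagement_ratios(labels):
    """Compute the three documented ratios from per-turn taxonomy labels."""
    counts = Counter(labels)
    turns = len(labels)
    if turns == 0:
        # Assumption: report zeros for an empty conversation.
        return {"critical_thinking": 0.0, "delegation": 0.0, "filler": 0.0}
    return {
        "critical_thinking": (counts["CH"] + counts["EX"]) / turns,
        "delegation": counts["DG"] / turns,
        "filler": (counts["AC"] + counts["MT"]) / turns,
    }


# Illustrative labels for eight human turns.
labels = ["NQ", "FU", "CH", "EX", "DG", "AC", "FU", "CH"]
print(engagement_ratios(labels))
```

For these eight turns, critical_thinking is 3/8 = 0.375 (two CH plus one EX), and delegation and filler are each 1/8 = 0.125.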

Graceful degradation

Missing                   | Effect
ANTHROPIC_API_KEY / [llm] | taxonomy/critical_thinking null; analytics still produced; note llm_unavailable
[embeddings]              | prompt_self_similarity null; note embeddings_unavailable
timestamps                | temporal metrics omitted; no sub-session split; note no timestamps
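One common way to get this kind of degradation is to probe for the optional dependencies up front and record a note instead of raising. The sketch below illustrates that pattern under assumptions: capability_notes is a hypothetical helper, and the real package may detect availability differently. The note strings and the module names anthropic and sentence_transformers follow the extras listed under Install.

```python
import importlib.util
import os


def capability_notes(env=None):
    """Return the notes that would apply given what is installed and configured."""
    env = os.environ if env is None else env
    notes = []
    # The LLM tier needs both the anthropic package and an API key.
    has_llm_pkg = importlib.util.find_spec("anthropic") is not None
    if not (has_llm_pkg and env.get("ANTHROPIC_API_KEY")):
        notes.append("llm_unavailable")
    # Prompt self-similarity needs sentence-transformers.
    if importlib.util.find_spec("sentence_transformers") is None:
        notes.append("embeddings_unavailable")
    return notes


# With an empty environment the LLM tier is always reported as unavailable.
print(capability_notes(env={}))
```

Using find_spec only locates a module without importing it, so the check stays cheap even when a heavy optional dependency is installed.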

Output

ConversationAnalysis → an aggregate (rolled up over all human turns, the headline) plus one SessionAnalysis per idle-gap sub-session, each with analytics, taxonomy, critical_thinking, and per-turn turns (label + rationale + preview). See the design spec §8 for the full schema.

Testing

pytest                    # fast, deterministic (LLM mocked, no network)
pytest -m slow            # includes sentence-transformers model download
pytest -m integration     # includes live LLM calls
