conversation-analyser

Critical-thinking and analytics for human–AI conversations — a member of the analyser family.

It scores a single conversation on two tiers:

  1. Analytics (always on, offline): turn/word counts, prompt/response lengths, question ratio, pushback hits, readability, sentiment trajectory, prompt self-similarity, and temporal metrics when timestamps are present.
  2. Critical thinking (opt-in, needs an LLM): classifies every human turn under a 7-label prompt taxonomy, derives engagement ratios, an engagement band, and a composite 0–100 critical-thinking score with a component breakdown.
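As a rough illustration of the kind of offline metrics the analytics tier reports, the sketch below computes turn counts, mean prompt and response lengths in words, and a question ratio from a role/content message list. The helper name `basic_analytics` and the metric definitions (for example, treating a prompt as a question when it ends with `?`) are assumptions for illustration, not the package's implementation.

```python
from statistics import mean


def basic_analytics(messages):
    """Compute a few offline metrics from a role/content message list.

    Hypothetical helper: the metric definitions are illustrative assumptions.
    """
    prompts = [m["content"] for m in messages if m["role"] == "user"]
    responses = [m["content"] for m in messages if m["role"] == "assistant"]
    # Assumption: a prompt counts as a question if it ends with a question mark.
    questions = sum(1 for p in prompts if p.strip().endswith("?"))
    return {
        "turns": len(messages),
        "human_turns": len(prompts),
        "mean_prompt_words": mean(len(p.split()) for p in prompts) if prompts else 0.0,
        "mean_response_words": mean(len(r.split()) for r in responses) if responses else 0.0,
        "question_ratio": questions / len(prompts) if prompts else 0.0,
    }


chat = [
    {"role": "user", "content": "What is a closure?"},
    {"role": "assistant", "content": "A function that captures variables from its scope."},
    {"role": "user", "content": "Show me an example."},
]
stats = basic_analytics(chat)
```

Everything here runs without a network connection or an API key, which is the property the analytics tier relies on.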

The taxonomy reuses the validated NQ/FU/CH/EX/DG/AC/MT scheme from the ISYS6020 marking pipeline (copied and forked). Design: docs/superpowers/specs/2026-05-23-conversation-analyser-design.md.

Install

pip install -e .                       # core: analytics + CLI + HTTP API
pip install -e '.[embeddings]'         # + prompt self-similarity (sentence-transformers)
pip install -e '.[llm]'                # + taxonomy/CT tier (anthropic)
pip install -e '.[embeddings,llm,dev]' # everything
export ANTHROPIC_API_KEY=...           # required for the critical-thinking tier

CLI

Pass the path to analyse as a bare positional argument; output is a human-readable summary by default, or JSON with --json. The serve subcommand starts the HTTP API. The grammar is the same as the rest of the family.

conversation-analyser transcript.txt              # human summary, analytics only
conversation-analyser chat.json --json            # full JSON to stdout
conversation-analyser chat.json --llm             # add the critical-thinking tier
conversation-analyser log.json --idle-gap 45      # split sub-sessions on 45-min gaps
conversation-analyser raw.txt --parse-mode llm-segment --llm
conversation-analyser serve --port 8009           # run the HTTP API

The critical-thinking tier is opt-in (--llm) to avoid surprise API costs; without it you get the analytics tier only.

HTTP API

conversation-analyser serve --port 8009
curl -F file=@chat.json 'http://127.0.0.1:8009/analyse'        # analytics only
curl -F file=@chat.json -F llm=true 'http://127.0.0.1:8009/analyse'
curl http://127.0.0.1:8009/health

The server exposes GET /health and POST /analyse (multipart file upload with an optional llm form field). This is the same /analyse contract that auto-analyser routes to.

Python API

from conversation_analyser import ConversationAnalyser

result = ConversationAnalyser().analyse("transcript.txt", llm=True)
print(result.model_dump_json(indent=2))

Input formats

A pluggable adapter registry tries, in order: structured adapters → heuristic speaker markers → optional LLM segmentation → unsegmented fallback.

  • role/content message list (OpenAI/Anthropic): [{"role": "user", "content": "..."}, ...]
  • AnythingLLM rows: [{"prompt": "...", "response": "...", "createdAt": ...}, ...]
  • flat text with speaker markers: User: / Assistant: / Me: / ChatGPT: / You said: / ChatGPT said: / Prompt: / Response:
  • anything else → LLM-segment (needs [llm]), else a single-blob fallback
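The ordered fallback described above can be sketched as a single detection function. This is a minimal illustration, assuming each adapter can be reduced to a cheap shape check; the function name `detect_format` and the returned labels are hypothetical and do not reflect the package's real registry API.

```python
import json
import re

# Speaker markers listed above, matched at the start of a line.
SPEAKER_RE = re.compile(
    r"^(User|Assistant|Me|ChatGPT|You said|ChatGPT said|Prompt|Response):",
    re.MULTILINE,
)


def detect_format(raw):
    """Return a label for the first adapter that recognises `raw` (hypothetical)."""
    try:
        data = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        data = None
    # 1. Structured adapters: a non-empty JSON list of objects.
    if isinstance(data, list) and data and all(isinstance(r, dict) for r in data):
        if all("role" in r and "content" in r for r in data):
            return "role_content"
        if all("prompt" in r and "response" in r for r in data):
            return "anythingllm"
    # 2. Heuristic speaker markers in flat text.
    if SPEAKER_RE.search(raw):
        return "speaker_markers"
    # 3. Nothing matched: left for LLM segmentation or the single-blob fallback.
    return "unsegmented"
```

Trying the structured checks first keeps well-formed exports off the slower and costlier LLM-segment path.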

.pdf/.docx inputs are text-extracted first (needs pdfplumber/markitdown, or pre-extract with document-analyser).

Long unstructured transcripts taking the LLM-segment path are split on paragraph boundaries into chunks (SEGMENT_CHUNK_CHARS, default 36 000) and classified chunk-by-chunk, so the whole transcript is labelled — not just its opening — and the band/ratios/score reflect all of it. The number of chunks (= LLM calls) is guarded by SEGMENT_MAX_CHUNKS (default 12); raise or lift it per run, e.g. CONVERSATION_ANALYSER_SEGMENT_MAX_CHUNKS=0 for unlimited. A capped run says so in notes rather than silently dropping the tail. Cleanly-labelled transcripts take the heuristic path and are never chunked.
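A minimal sketch of paragraph-boundary chunking with a chunk cap follows. It assumes paragraphs are separated by blank lines and that a cap of 0 means unlimited, as in the environment-variable example above; the function name `chunk_paragraphs` and its return shape are hypothetical, and the package's own splitter may differ in detail.

```python
def chunk_paragraphs(text, chunk_chars=36_000, max_chunks=12):
    """Split `text` on blank lines into chunks of roughly `chunk_chars`.

    Returns (chunks, capped). `capped` is True when chunks were dropped
    because of `max_chunks`; max_chunks=0 means no limit.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > chunk_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            # Note: a single paragraph longer than chunk_chars stays whole.
            current = candidate
    if current:
        chunks.append(current)
    capped = bool(max_chunks) and len(chunks) > max_chunks
    if capped:
        chunks = chunks[:max_chunks]
    return chunks, capped
```

Returning the `capped` flag alongside the chunks lets the caller record a note about the truncated tail instead of dropping it silently.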

The taxonomy

Code  Meaning
NQ    New Query: opens a new topic
FU    Follow-up: clarification/elaboration
CH    Challenge: pushes back, tests, asks why
EX    Extension: applies/compares/synthesises in a new direction
DG    Delegation: task hand-off, no engagement
AC    Acknowledgement: thanks/confirmation
MT    Meta: about the conversation itself

critical_thinking = (CH+EX)/turns, delegation = DG/turns, filler = (AC+MT)/turns. Bands: One-Shot · Delegator · Directed · Iterative · Critical.
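The three ratios can be worked through directly from a list of per-turn labels. The sketch below assumes one label per human turn and uses a hypothetical helper name, `engagement_ratios`. Band thresholds and the composite 0–100 score are not specified here, so they are not reproduced.

```python
from collections import Counter


def engagement_ratios(labels):
    """Derive the three engagement ratios from per-turn taxonomy labels."""
    counts = Counter(labels)
    turns = len(labels)
    if turns == 0:
        # Assumption: an empty conversation scores zero on every ratio.
        return {"critical_thinking": 0.0, "delegation": 0.0, "filler": 0.0}
    return {
        "critical_thinking": (counts["CH"] + counts["EX"]) / turns,
        "delegation": counts["DG"] / turns,
        "filler": (counts["AC"] + counts["MT"]) / turns,
    }


labels = ["NQ", "FU", "CH", "EX", "DG", "AC", "CH", "FU"]
ratios = engagement_ratios(labels)
# 8 turns: (2 CH + 1 EX) / 8 = 0.375, 1 DG / 8 = 0.125, (1 AC + 0 MT) / 8 = 0.125
```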

Graceful degradation

Missing                    Effect
ANTHROPIC_API_KEY / [llm]  taxonomy/critical_thinking null; analytics still produced; note llm_unavailable
[embeddings]               prompt_self_similarity null; note embeddings_unavailable
timestamps                 temporal metrics omitted; no sub-session split; note no timestamps
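One way such degradation can be decided up front is to probe for the optional dependencies before running a tier. The sketch below is an assumption about how that check might look, not the package's code: `capability_notes` is a hypothetical helper, and `anthropic` and `sentence_transformers` are the import names of the libraries behind the [llm] and [embeddings] extras.

```python
import importlib.util
import os


def capability_notes(environ=None, has_module=None):
    """Return the degradation notes that would apply in this environment."""
    environ = os.environ if environ is None else environ
    if has_module is None:
        # find_spec returns None when a top-level module is not installed.
        has_module = lambda name: importlib.util.find_spec(name) is not None
    notes = []
    # The critical-thinking tier needs both the API key and the client library.
    if not environ.get("ANTHROPIC_API_KEY") or not has_module("anthropic"):
        notes.append("llm_unavailable")
    # Prompt self-similarity needs the embeddings extra.
    if not has_module("sentence_transformers"):
        notes.append("embeddings_unavailable")
    return notes
```

Passing `environ` and `has_module` as arguments keeps the check testable without installing or uninstalling anything.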

Output

ConversationAnalysis → an aggregate (rolled up over all human turns, the headline) plus one SessionAnalysis per idle-gap sub-session, each with analytics, taxonomy, critical_thinking, and per-turn turns (label + rationale + preview). See the design spec §8 for the full schema.
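The idle-gap split behind the per-session results can be sketched as follows. This assumes each turn carries a timestamp and that a new sub-session starts when the gap to the previous turn exceeds the threshold; the 45-minute value mirrors the CLI example rather than a documented default, and `split_on_idle_gaps` is a hypothetical name.

```python
from datetime import datetime, timedelta


def split_on_idle_gaps(turns, idle_gap_minutes=45):
    """Group (timestamp, text) turns into sub-sessions separated by idle gaps."""
    gap = timedelta(minutes=idle_gap_minutes)
    sessions = []
    for stamp, text in sorted(turns, key=lambda t: t[0]):
        # Compare against the timestamp of the last turn in the open session.
        if sessions and stamp - sessions[-1][-1][0] <= gap:
            sessions[-1].append((stamp, text))
        else:
            sessions.append([(stamp, text)])
    return sessions


day = datetime(2026, 5, 23)
turns = [
    (day.replace(hour=9, minute=0), "first question"),
    (day.replace(hour=9, minute=10), "follow-up"),
    (day.replace(hour=10, minute=30), "new topic after a break"),
    (day.replace(hour=10, minute=40), "another follow-up"),
]
sessions = split_on_idle_gaps(turns)
```

In this example the 80-minute pause between 09:10 and 10:30 exceeds the threshold, so the four turns fall into two sub-sessions of two turns each.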

Testing

pytest                    # fast, deterministic (LLM mocked, no network)
pytest -m slow            # includes sentence-transformers model download
pytest -m integration     # includes live LLM calls
