Evidence-first discussion intelligence tooling.

Project description

ThreadSense

Discussion intelligence, not scraping. ThreadSense is a reproducible pipeline that turns public discussion threads into structured, evidence-backed product and research intelligence.

Why ThreadSense

Scrapers give you raw payloads. AI summarizers give you prose. Neither gives you a defensible basis for decisions.

ThreadSense keeps the evidence chain intact at every stage:

Acquire — source connectors fetch threads from Reddit, Hacker News, and GitHub Discussions
Normalize — source-specific payloads are mapped into a canonical thread model with provenance metadata
Analyze — deterministic extraction identifies issues, feature requests, themes, and sentiment — each linked to the comment that produced it
Synthesize — optional local-model inference adds summaries on top of the deterministic evidence layer
Report — structured outputs in Markdown, HTML, or JSON, with full traceability from finding back to source comment

Every stage produces a persisted, inspectable artifact. Rerun any stage independently. Diff results across runs. Audit exactly where a finding came from.

What This Enables

Single-thread analysis

Feed a discussion URL. Get structured findings — issues, requests, themes, severity — with every claim linked to the comment that produced it.

uv run threadsense run reddit \
  "https://www.reddit.com/r/ClaudeCode/comments/1ro0qbl/anyone_actually_built_a_second_brain_that_isnt/" \
  --format markdown \
  --with-summary \
  --summary-required

Cross-thread research

Search a topic across multiple subreddits. ThreadSense deterministically selects, ranks, and analyzes matching threads, then synthesizes a corpus-level report.

uv run threadsense --output-format human research reddit \
  --query "second brain OR agentic PKM" \
  --subreddit ClaudeCode \
  --subreddit LocalLLaMA \
  --subreddit AI_Agents \
  --limit 5 \
  --per-subreddit-limit 3 \
  --with-summary

Domain-aware analysis

The analysis layer uses a contract system with domain-specific vocabularies (developer tools, product feedback, hiring, research, financial markets, gaming). Each domain defines its own theme keywords, issue markers, and severity calibration — so analysis adapts to context rather than applying one-size-fits-all heuristics.

Architecture

fetch → normalize → analyze → [optional inference] → report
         ↓              ↓              ↓                ↓
     raw artifact   canonical     analysis         report artifact
                    artifact      artifact

Deterministic core — parsing, normalization, scoring, and selection are reproducible across runs
Inference on top — LLM synthesis is optional and layered over deterministic evidence, never a substitute for it
Stable artifacts — each stage persists a separate JSON artifact with schema_version and SHA256 provenance
Fail fast — invalid URLs, malformed payloads, and schema inconsistencies surface immediately

Sources and Discovery

Capability	Reddit	Hacker News	GitHub Discussions
Thread analysis	yes	yes	yes
Topic research	yes	—	—

Output Modes

Mode	Purpose
`json`	Machine-readable payloads for downstream tooling
`human`	Rich terminal panels and summaries for operators
`quiet`	Status-only output for scripts and CI

uv run threadsense --output-format human research reddit ...

See docs/output-modes.md for details.

Who This Is For

Product teams validating pain points and feature demand from community discussions
Founders doing market and competitor research across technical communities
DevRel teams tracking developer workflow friction and tooling sentiment
Researchers studying technical communities with reproducible methodology

Quickstart

# 1. Install
uv sync

# 2. Validate local setup
uv run threadsense preflight

# 3. Analyze a single thread
uv run threadsense run reddit \
  "https://www.reddit.com/r/ClaudeCode/comments/1ro0qbl/anyone_actually_built_a_second_brain_that_isnt/"

# 4. Research a topic across subreddits
uv run threadsense research reddit \
  --query "second brain OR agentic PKM" \
  --subreddit ClaudeCode \
  --subreddit LocalLLaMA \
  --subreddit AI_Agents

CLI Commands

Command	Purpose
`run`	End-to-end single-thread pipeline
`research reddit`	Cross-subreddit topic research and corpus synthesis
`fetch`	Acquire raw thread data
`normalize`	Map raw data to canonical model
`analyze`	Deterministic evidence extraction
`infer`	LLM-assisted synthesis
`report`	Generate output reports
`corpus`	Build and analyze cross-thread corpora
`inspect`	Examine persisted artifacts
`batch run`	Process multiple threads
`preflight`	Validate local environment
`serve`	Local API server

Full command reference: docs/usage.md

Artifact Storage

Every pipeline run produces inspectable artifacts under .threadsense/:

.threadsense/
├── raw/<source>/          # Source payloads as fetched
├── normalized/<source>/   # Canonical thread model
├── analysis/<source>/     # Evidence-linked findings
├── reports/<source>/      # Rendered reports
├── corpora/<corpus-id>/   # Manifest, analysis, and report
└── batches/               # Batch run metadata

Details: docs/artifacts.md

Local Runtime

ThreadSense runs without a local model for deterministic analysis. Summaries and synthesis require a local OpenAI-compatible endpoint (default: http://127.0.0.1:8080/v1/chat/completions).

Details: docs/local-runtime-contract.md

Documentation

Document	Content
usage.md	Command reference
research-reddit.md	Reddit topic research workflow
output-modes.md	JSON, human, and quiet output modes
artifacts.md	Artifact types and storage layout
overview.md	Product and workflow overview
system-design.md	Architecture and system boundaries
local-runtime-contract.md	Local inference contract
pitch.md	Product positioning

Validation

uv run ruff check
uv run ruff format --check .
uv run mypy --strict src tests
uv run pytest

Current Limits

Topic research is implemented for Reddit; other source discovery workflows are planned
Reddit research queries support OR/| clause unions only (intentionally narrow for deterministic alignment)
Corpus reports are Markdown only
The local API is a trusted local surface, not a hardened public service

Direction

Richer corpus presentation and operator workflows
Discovery workflows beyond Reddit
Evaluation and replay benchmarking
Source-distribution and research-quality reporting

Project details

Release history Release notifications | RSS feed

0.3.0

Apr 7, 2026

This version

0.2.1

Apr 7, 2026

0.2.0

Apr 6, 2026

0.1.0a0 pre-release

Apr 4, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

threadsense-0.2.1.tar.gz (176.4 kB view details)

Uploaded Apr 7, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

threadsense-0.2.1-py3-none-any.whl (126.9 kB view details)

Uploaded Apr 7, 2026 Python 3

File details

Details for the file threadsense-0.2.1.tar.gz.

File metadata

Download URL: threadsense-0.2.1.tar.gz
Upload date: Apr 7, 2026
Size: 176.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.14

File hashes

Hashes for threadsense-0.2.1.tar.gz
Algorithm	Hash digest
SHA256	`14497e21437d233e5103928b853c620a207c6f35c64ce280669a946f7818dc4a`
MD5	`c4b28420a90b8320a7e808cc2e957803`
BLAKE2b-256	`b97c9bf434cfda4772f43b1fa7f85d6618ff2e6216f8688e469cb84b93573ae5`

See more details on using hashes here.

File details

Details for the file threadsense-0.2.1-py3-none-any.whl.

File metadata

Download URL: threadsense-0.2.1-py3-none-any.whl
Upload date: Apr 7, 2026
Size: 126.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.14

File hashes

Hashes for threadsense-0.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`cdf931ee4c5375f29f41a91b77ea762c62f435d9a38ad1ea2dafc4c5367181e8`
MD5	`a91a55ee7fad1944a9c78ffd00672b52`
BLAKE2b-256	`fe38d1bec73b9e5403d36c91b2cf9d5f989f4ef0b71d2faf71a6f600b000c931`

See more details on using hashes here.

threadsense 0.2.1

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

ThreadSense

Why ThreadSense

What This Enables

Single-thread analysis

Cross-thread research

Domain-aware analysis

Architecture

Sources and Discovery

Output Modes

Who This Is For

Quickstart

CLI Commands

Artifact Storage

Local Runtime

Documentation

Validation

Current Limits

Direction

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes