Query Discourse and summarize threads

Project description

discuss-nutshell

Improve your understanding of long Discourse threads.

Problem statement

Discourse topics, such as those on discuss.python.org, can grow very long within a few hours or days. These long threads make it difficult to follow the conversation without spending one to three hours reading. Discourse displays an estimated reading time for each thread.

On discuss.python.org, discussion threads about an individual [Python Enhancement Proposal](https://peps.python.org) (PEP) can get very long. Understanding the pros and cons of the PEP requires reading the thread.

Motivation

I want a time-efficient way to read posts and summarize the key points. Ideally, I would like to understand the pros and cons of an individual PEP. Understanding the authors' motivations and their background is also important.

Recapping the conversation in an accurate way would be very helpful.

Initial approach

Take a Discourse topic and parse it into posts that can be queried.

  • data_loader.py: Hit an endpoint and save the response to JSON
  • preprocessor.py: Clean the data and parse it into individual post files
  • launch_app.py: Launch a Gradio app to interact with the LLM and log queries, context, and responses
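
The loading and preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function names `topic_url`, `strip_html`, and `split_posts` are hypothetical, and the sketch assumes the topic JSON layout that Discourse returns when `.json` is appended to a topic URL (`post_stream.posts`, where each post has `post_number`, `username`, `created_at`, and `cooked`). Fetching is left out so the example runs without network access.

```python
import json
import re
from html import unescape
from pathlib import Path


def topic_url(base_url: str, topic_id: int) -> str:
    """Build the JSON endpoint for a topic (assumes the /t/<id>.json form)."""
    return f"{base_url.rstrip('/')}/t/{topic_id}.json"


def strip_html(cooked: str) -> str:
    """Crude tag removal for the rendered ("cooked") HTML of a post."""
    text = re.sub(r"<[^>]+>", " ", cooked)
    return re.sub(r"\s+", " ", unescape(text)).strip()


def split_posts(topic: dict, out_dir: Path) -> list[Path]:
    """Write one JSON file per post and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for post in topic["post_stream"]["posts"]:
        record = {
            "post_number": post["post_number"],
            "author": post["username"],
            "created_at": post["created_at"],
            "text": strip_html(post["cooked"]),
        }
        # Hypothetical file naming scheme: post_0001.json, post_0002.json, ...
        path = out_dir / f"post_{post['post_number']:04d}.json"
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        paths.append(path)
    return paths
```

Discourse paginates long topics, so a real loader would likely need more than one request to collect every post; the regex-based HTML stripping is also only a rough stand-in for proper cleaning.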

To view the logged queries and responses, open the db file with Datasette: datasette data/posts_qa_logs.db
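
For illustration, logging a query, its context, and the response to a SQLite file that Datasette can open might look like the sketch below. The table name `qa_logs`, its columns, and the function `log_interaction` are assumptions made for this example; the schema the project actually uses is not shown in this description.

```python
import sqlite3
from datetime import datetime, timezone


def log_interaction(db_path: str, query: str, context: str, response: str) -> None:
    """Append one query/context/response row to a SQLite log (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS qa_logs ("
                "id INTEGER PRIMARY KEY, logged_at TEXT, "
                "query TEXT, context TEXT, response TEXT)"
            )
            conn.execute(
                "INSERT INTO qa_logs (logged_at, query, context, response) "
                "VALUES (?, ?, ?, ?)",
                (datetime.now(timezone.utc).isoformat(), query, context, response),
            )
    finally:
        conn.close()
```

Because the log is a plain SQLite file, Datasette can browse it without any extra export step.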

Summarize individual posts and aggregate the summarized posts into one posts file that can be queried.

Use a simple Gradio UI to interface with the user.

Next phases

Data to keep: author, date/time, post number, post UUID, core dev (bool), cooked message, summarized message
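
One possible shape for such a record is sketched below. The class name `PostRecord` and its field names are hypothetical, chosen only to mirror the list above; they are not taken from the package.

```python
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class PostRecord:
    """One post plus its summary (field names are illustrative assumptions)."""

    author: str
    posted_at: datetime
    post_number: int
    post_uuid: str
    is_core_dev: bool
    cooked: str          # rendered HTML of the post as returned by Discourse
    summary: str = ""    # filled in once the post has been summarized


# asdict() turns a record into a plain dict, e.g. for JSON or database storage.
example = PostRecord(
    author="example_user",
    posted_at=datetime(2024, 1, 1, 12, 0),
    post_number=1,
    post_uuid="00000000-0000-0000-0000-000000000000",
    is_core_dev=False,
    cooked="<p>Example post</p>",
)
row = asdict(example)
```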

Possible prompts

  • Does this message support or refute the proposed PEP?
  • What are the key topics found in the message?
  • How many times has a person posted?
  • You are a Python expert. Summarize this message.
  • You are an intermediate Python user. Summarize this message.
  • You are a manager not a developer. Summarize this message.
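
As a rough sketch, the persona prompts above could be built from a small template, while a question such as how often a person has posted can be answered by counting rather than by asking the LLM. The names `PERSONAS`, `build_summary_prompt`, and `posts_per_author` are hypothetical and used only for illustration.

```python
from collections import Counter

# Persona prefixes taken from the prompt list above.
PERSONAS = {
    "expert": "You are a Python expert.",
    "intermediate": "You are an intermediate Python user.",
    "manager": "You are a manager not a developer.",
}


def build_summary_prompt(persona: str, message: str) -> str:
    """Combine a persona prefix, the instruction, and the post text."""
    return f"{PERSONAS[persona]} Summarize this message.\n\n{message}"


def posts_per_author(authors: list[str]) -> Counter:
    """Count how many posts each author has written."""
    return Counter(authors)
```

Keeping the personas in one mapping makes it easy to compare summaries of the same post written for different audiences.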

Report on the pros and cons of the PEP.

Query in 10-message chunks and summarize.
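
Chunked summarization might be sketched as below. The function names are hypothetical, and `summarize` stands in for whatever LLM call is used; here it is simply any callable that maps text to a summary string.

```python
from typing import Callable, Iterator, Sequence


def chunk_posts(posts: Sequence[str], size: int = 10) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` posts, in thread order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(posts), size):
        yield list(posts[start:start + size])


def summarize_in_chunks(
    posts: Sequence[str],
    summarize: Callable[[str], str],
    size: int = 10,
) -> list[str]:
    """Summarize each chunk separately and return one summary per chunk."""
    return [summarize("\n\n".join(chunk)) for chunk in chunk_posts(posts, size)]
```

The per-chunk summaries can then be joined and summarized once more to recap the whole thread.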

  • Create a visual display of individual posts, summaries, author, and date posted.
  • Display the summaries but allow the original post text to be accessed easily.
  • Plot a sentiment of messages over time.
