Query Discourse and summarize threads
Project description
discuss-nutshell
Improve your understanding of long Discourse threads.
Problem statement
Discourse topics, such as those on discuss.python.org, can grow very long within a few hours or days. Long threads make it difficult to follow the conversation without spending one to three hours reading. Discourse shows an estimated reading time for each thread.
On discuss.python.org, discussion threads about an individual [Python Enhancement Proposal](https://peps.python.org) (PEP) can get very long. Understanding the pros and cons of the PEP requires reading the thread.
Motivation
I want a time-efficient way to read posts and summarize the key points. Ideally, I would like to understand the pros and cons of an individual PEP. Understanding the authors' motivations and background is also important.
Recapping the conversation in an accurate way would be very helpful.
Initial approach
Take a Discourse topic and parse it into posts that can be queried.
- data_loader.py: Hit an endpoint and save to json
- preprocessor.py: Do data cleaning and parsing into individual post files
- launch_app.py: Launch gradio app to interact with the LLM and log queries, context, responses
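The load-and-preprocess step can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the helper names `strip_html`, `extract_posts`, and `save_posts` are hypothetical, and the sketch assumes the saved topic JSON has a `post_stream.posts` list whose entries carry `post_number`, `username`, `created_at`, and `cooked` (HTML) fields.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(cooked: str) -> str:
    """Return the plain text of a cooked (HTML) post, whitespace-normalized."""
    parser = _TextExtractor()
    parser.feed(cooked)
    parser.close()
    return " ".join("".join(parser.parts).split())


def extract_posts(topic: dict) -> list[dict]:
    """Pull the fields of interest out of an already-loaded topic dict.

    Assumption: posts live under topic["post_stream"]["posts"].
    """
    posts = topic.get("post_stream", {}).get("posts", [])
    return [
        {
            "post_number": p["post_number"],
            "username": p["username"],
            "created_at": p["created_at"],
            "text": strip_html(p["cooked"]),
        }
        for p in posts
    ]


def save_posts(posts: list[dict], out_dir: str) -> list[Path]:
    """Write each post to its own JSON file and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for post in posts:
        path = out / f"post_{post['post_number']:04d}.json"
        path.write_text(json.dumps(post, indent=2), encoding="utf-8")
        paths.append(path)
    return paths
```

Keeping the HTML stripping separate from the field extraction makes it easy to swap in a different cleaner later without touching the file layout.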
Take the db file and use datasette to view: datasette data/posts_qa_logs.db
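For context, a log database that datasette can browse might be produced along these lines. This is only a sketch using the standard `sqlite3` module; the table name `qa_logs`, its columns, and the helper names are assumptions made for illustration, not the schema the project actually uses.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS qa_logs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            asked_at TEXT NOT NULL,
            query TEXT NOT NULL,
            context TEXT NOT NULL,
            response TEXT NOT NULL
        )
        """
    )
    conn.commit()
    return conn


def log_interaction(conn: sqlite3.Connection, query: str, context: str, response: str) -> int:
    """Store one query/context/response triple and return its row id."""
    cur = conn.execute(
        "INSERT INTO qa_logs (asked_at, query, context, response) VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), query, context, response),
    )
    conn.commit()
    return cur.lastrowid
```

Passing a file path such as data/posts_qa_logs.db instead of the in-memory default would yield a file that the datasette command above can open.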
Summarize individual posts and aggregate the summarized posts into one posts file that can be queried.
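One way to picture the summarize-then-aggregate step is shown below. The summarizer is passed in as a plain callable so that an LLM call can be plugged in later; `first_sentence` is a deliberately naive stand-in, and all function names and the `text` / `summary` keys are assumptions for this sketch.

```python
import json
from pathlib import Path
from typing import Callable


def first_sentence(text: str, limit: int = 200) -> str:
    """Placeholder summarizer: the first sentence, cut at `limit` characters."""
    head = text.split(". ", 1)[0].strip()
    return head[:limit]


def summarize_posts(
    posts: list[dict], summarize: Callable[[str], str] = first_sentence
) -> list[dict]:
    """Return copies of the posts with a 'summary' field added."""
    return [{**post, "summary": summarize(post["text"])} for post in posts]


def write_aggregate(posts: list[dict], path: str) -> Path:
    """Write all summarized posts into a single JSON file."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(posts, indent=2), encoding="utf-8")
    return out
```

Because the original post dict is copied rather than replaced, the full text stays available next to its summary in the aggregated file.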
Use a simple Gradio UI to interface with the user.
Next phases
Data to keep: author, date/time, post number, post UUID, core dev (bool), cooked message, summarized message
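The fields listed above could be held in a small record type such as the one below. This is a sketch only: the class name `PostRecord` and the attribute names are hypothetical, and generating a random UUID when none is supplied is an assumption about how the identifier might be assigned.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class PostRecord:
    author: str
    created_at: datetime
    post_number: int
    cooked: str  # original HTML body of the post
    is_core_dev: bool = False
    summary: str = ""  # filled in after summarization
    # Assumption: a random UUID is generated if the caller does not pass one.
    post_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))


record = PostRecord(
    author="alice",
    created_at=datetime(2024, 1, 2, tzinfo=timezone.utc),
    post_number=3,
    cooked="<p>Hello</p>",
)
row = asdict(record)  # plain dict, convenient for JSON or database storage
```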
Possible prompts
- Does this message support or refute the proposed PEP?
- What are the key topics found in the message?
- How many times has a person posted?
- You are a Python expert. Summarize this message.
- You are an intermediate Python user. Summarize this message.
- You are a manager, not a developer. Summarize this message.
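The prompts above lend themselves to simple templates, sketched below. The persona keys, the function names, and the `Message:` layout are assumptions for illustration. Note that counting posts per person does not need an LLM at all; a `collections.Counter` over the (assumed) `username` field is enough.

```python
from collections import Counter

# Persona sentences taken from the prompt list; the dictionary keys are hypothetical.
PERSONAS = {
    "expert": "You are a Python expert.",
    "intermediate": "You are an intermediate Python user.",
    "manager": "You are a manager, not a developer.",
}


def build_summary_prompt(persona: str, message: str) -> str:
    """Build a persona-specific summarization prompt for one message."""
    return f"{PERSONAS[persona]} Summarize this message.\n\nMessage:\n{message}"


def build_stance_prompt(message: str) -> str:
    """Build the support-or-refute prompt for one message."""
    return (
        "Does this message support or refute the proposed PEP?\n\n"
        f"Message:\n{message}"
    )


def count_posts_by_author(posts: list[dict]) -> Counter:
    """Count how many times each author posted, without calling an LLM."""
    return Counter(post["username"] for post in posts)
```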
Report on pros and cons of the PEP proposal.
Query in 10 message chunks and summarize.
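Splitting a thread into groups of ten posts can be done with a short generator. This is a minimal sketch; the name `chunked` and its default size are illustrative choices.

```python
from typing import Iterator, Sequence


def chunked(posts: Sequence, size: int = 10) -> Iterator[Sequence]:
    """Yield consecutive slices of `posts`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(posts), size):
        yield posts[start:start + size]


# A 25-post thread becomes chunks of 10, 10, and 5 posts.
sizes = [len(chunk) for chunk in chunked(list(range(25)))]
```

Each chunk can then be summarized on its own, and the chunk summaries combined in a second pass.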
- Create a visual display of individual posts, summaries, author, and date posted.
- Display the summaries but allow the original post text to be accessed easily.
- Plot a sentiment of messages over time.
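Before sentiment can be plotted over time, the per-post scores need to be grouped by date. The sketch below shows only that aggregation step. It assumes each post already has a numeric sentiment score from some scorer (not specified here) and an ISO 8601 timestamp whose first ten characters are the date; the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def daily_mean_sentiment(scored_posts: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Average sentiment per calendar day, sorted by date.

    Each input item is (created_at, score), where created_at is an
    ISO 8601 string such as "2024-01-02T10:00:00Z".
    """
    by_day = defaultdict(list)
    for created_at, score in scored_posts:
        by_day[created_at[:10]].append(score)  # keep only YYYY-MM-DD
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]
```

The resulting list of (date, mean score) pairs can be handed to any plotting library as x and y values.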
Project details
File details
Details for the file discuss_nutshell-0.2.0.tar.gz.
File metadata
- Download URL: discuss_nutshell-0.2.0.tar.gz
- Upload date:
- Size: 980.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2d6c9f6970c6615fecacdd31cc318d6ec3bb1281cd8b46bbb0fcedec35e34b8b |
| MD5 | cfa5e0debc84b80acd884b5d9b92afb4 |
| BLAKE2b-256 | b656ee06a1db2fc0a3cb4c2b7348b8d7789734b80848c7b3da029ab9faeb3418 |
Provenance
The following attestation bundles were made for discuss_nutshell-0.2.0.tar.gz:
Publisher: cd.yml on willingc/discuss-nutshell
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: discuss_nutshell-0.2.0.tar.gz
- Subject digest: 2d6c9f6970c6615fecacdd31cc318d6ec3bb1281cd8b46bbb0fcedec35e34b8b
- Sigstore transparency entry: 729710831
- Sigstore integration time:
- Permalink: willingc/discuss-nutshell@b6149797950ba48026e382f6e0e38666529c2762
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/willingc
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@b6149797950ba48026e382f6e0e38666529c2762
- Trigger Event: release
File details
Details for the file discuss_nutshell-0.2.0-py3-none-any.whl.
File metadata
- Download URL: discuss_nutshell-0.2.0-py3-none-any.whl
- Upload date:
- Size: 12.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 20f50b3a0801f5a766aa1e509975148a5d212572c4ec0ec9ee2a9fd9f6ec8924 |
| MD5 | a6fc724fab928a788d7cc938eefc2a2b |
| BLAKE2b-256 | 8bb5b9d1210f209807af9092610e0ee16a29f1174273870d01b12b2c7475413d |
Provenance
The following attestation bundles were made for discuss_nutshell-0.2.0-py3-none-any.whl:
Publisher: cd.yml on willingc/discuss-nutshell
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: discuss_nutshell-0.2.0-py3-none-any.whl
- Subject digest: 20f50b3a0801f5a766aa1e509975148a5d212572c4ec0ec9ee2a9fd9f6ec8924
- Sigstore transparency entry: 729710832
- Sigstore integration time:
- Permalink: willingc/discuss-nutshell@b6149797950ba48026e382f6e0e38666529c2762
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/willingc
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: cd.yml@b6149797950ba48026e382f6e0e38666529c2762
- Trigger Event: release