Tool to analyze an mbox mail dump
Mbox mail analysis
This is a script that analyzes an mbox mail export, such as the one Google Takeout provides for a Gmail mailbox, and produces a report on its content.
Current reports
- received mails over hour of the day and day of the week
- mail per day over time
- most active addresses
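The counts behind these three reports can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the tool's actual code: the function name `summarize` and the sample messages are hypothetical, and the hour is taken in each message's own UTC offset.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def summarize(messages):
    """Count messages per (weekday, hour), per calendar day, and per sender."""
    by_weekday_hour = Counter()  # (weekday, hour) -> count, Monday is 0
    by_day = Counter()           # ISO date string -> count
    by_sender = Counter()        # lower-cased address -> count

    for msg in messages:
        _, address = parseaddr(str(msg.get("From", "")))
        if address:
            by_sender[address.lower()] += 1
        try:
            sent = parsedate_to_datetime(str(msg.get("Date", "")))
        except (TypeError, ValueError):
            continue  # missing or malformed Date header: skip the time-based counts
        by_weekday_hour[(sent.weekday(), sent.hour)] += 1
        by_day[sent.date().isoformat()] += 1

    return by_weekday_hour, by_day, by_sender


# Hypothetical in-memory messages, used here instead of a real mbox file.
sample = [
    message_from_string("From: Alice <alice@example.com>\nDate: Mon, 02 Jan 2023 09:30:00 +0000\n\nhi"),
    message_from_string("From: ALICE@example.com\nDate: Mon, 02 Jan 2023 09:45:00 +0000\n\nhi again"),
    message_from_string("From: bob@example.com\n\nno date header"),
]
by_weekday_hour, by_day, by_sender = summarize(sample)
```

For a real export, the same function should accept `mailbox.mbox("/path/to/the/mbox/file.mbox")` from the standard library in place of `sample`, since iterating over an mbox yields message objects.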
Usage
You need an export of your mailbox in mbox format (for Gmail you can get it from Google Takeout).
Install the tool using pip:
python3 -m pip install mailbox-report-generator
Then run this command:
generate_mbox_report "/path/to/the/mbox/file.mbox"
A report is created as an HTML file and opened in your default browser.
Extending the report
The report is generated by running every message through a series of Processors.
Each Processor implements its own logic to aggregate relevant details and can output its part of the report as an HTML string. Processors are independent of each other.
These strings are simply concatenated to generate a static HTML file. Two processors output the header and footer of this file.
This structure makes it quite easy to add or remove specific analyses, run automated tests, and implement caching.
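The structure described above might look roughly like the sketch below. Every name in it (`Processor`, `process`, `render`, `StaticHtml`, `MessageCount`, `build_report`) is hypothetical and chosen for illustration; the package's real classes and method names may differ.

```python
from abc import ABC, abstractmethod
from email.message import Message


class Processor(ABC):
    """Hypothetical interface: see every message once, then emit an HTML fragment."""

    @abstractmethod
    def process(self, message: Message) -> None:
        """Update internal aggregates with one message."""

    @abstractmethod
    def render(self) -> str:
        """Return this processor's part of the report as an HTML string."""


class StaticHtml(Processor):
    """Ignores messages and emits fixed markup, e.g. the page header or footer."""

    def __init__(self, html: str) -> None:
        self.html = html

    def process(self, message: Message) -> None:
        pass

    def render(self) -> str:
        return self.html


class MessageCount(Processor):
    """Toy analysis: counts the messages it has seen."""

    def __init__(self) -> None:
        self.count = 0

    def process(self, message: Message) -> None:
        self.count += 1

    def render(self) -> str:
        return f"<p>{self.count} messages</p>"


def build_report(messages, processors) -> str:
    """Feed every message to every processor, then concatenate the fragments in order."""
    for message in messages:
        for processor in processors:
            processor.process(message)
    return "".join(processor.render() for processor in processors)
```

Because each processor only keeps its own state, one can be tested in isolation by feeding it a handful of hand-built messages and asserting on the string returned by `render()`.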
Possible future improvements
- Examine the mail length and word usage over time
  - Note that extracting text from mails is very hard: the multipart format and the unusual formats used by advertising e-mails make it an extremely unreliable operation.
- Examine the textual content of the emails with spaCy to retrieve named entities such as people and locations (see the note on the previous point)
- Find which languages are used in the e-mail body
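To give a sense of why text extraction is fragile, here is a best-effort sketch using the standard library's `email` package. It is not part of the tool; the helper name `plain_text` is hypothetical, and it assumes messages were parsed with `email.policy.default` so that `get_body()` is available.

```python
from email.message import EmailMessage


def plain_text(message: EmailMessage) -> str:
    """Return the best-effort plain-text body, or "" when none can be recovered."""
    # Search the (possibly multipart) message for a text/plain body part.
    part = message.get_body(preferencelist=("plain",))
    if part is None:
        return ""  # e.g. HTML-only newsletters have no plain-text part
    try:
        return part.get_content()
    except (LookupError, UnicodeError):
        return ""  # unknown or broken charset declaration
```

Messages that only ship an HTML part come back empty here, which is one concrete way the operation turns out unreliable. Note that `mailbox.mbox` returns legacy message objects by default; passing a `factory` that calls `email.message_from_binary_file(f, policy=email.policy.default)` is one way to obtain `EmailMessage` instances instead.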
File details
Details for the file mailbox-report-generator-0.1.3.tar.gz.
File metadata
- Download URL: mailbox-report-generator-0.1.3.tar.gz
- Upload date:
- Size: 8.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.4.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | ccaf4c65588f70d0a002bdb5233ad8fc1b71eea4f98b4a59e5ac549ef4d6fc70
MD5 | 45b8830a2be9317a5f57ba72791f9173
BLAKE2b-256 | 6dd7248e3951d8f87a98bf1ac496a09841617f2c132c4183307e1705122f977a
File details
Details for the file mailbox_report_generator-0.1.3-py3-none-any.whl.
File metadata
- Download URL: mailbox_report_generator-0.1.3-py3-none-any.whl
- Upload date:
- Size: 9.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.4.2 CPython/3.11.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 118b3984f7841db40fcbe3a86e999041846cc6c6d763de960b9085827075709a
MD5 | 64ab123b35508f1852cf61e666809d17
BLAKE2b-256 | f8e0e29d5bbac6bb990d570b4e586ce14b3062e937dc446e808f206aec92eae4