
Tool to analyze an mbox mail dump


Mbox mail analysis


This script analyzes an mbox mail export, such as the one Google Takeout provides for a Gmail mailbox, and produces a report on its content.

[Images: a heatmap of the number of mails per day and hour; an interactive chart showing mail activity per day]

Current reports

  • mails received by hour of the day and day of the week
  • mails per day over time
  • most active addresses
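
As an illustration of the first report, here is a minimal sketch of how mails could be counted by day of the week and hour of the day using only the Python standard library. It is not taken from the tool's source: the function name `count_by_weekday_and_hour` is hypothetical, and the choice to use the hour as written in the `Date` header (in the sender's UTC offset, without converting to local time) is an assumption.

```python
import mailbox
from collections import Counter
from email import message_from_string
from email.utils import parsedate_to_datetime


def count_by_weekday_and_hour(messages):
    """Count messages per (weekday, hour) pair; weekday 0 is Monday.

    Hypothetical helper, not part of mailbox-report-generator.
    """
    counts = Counter()
    for message in messages:
        raw_date = message.get("Date")
        if raw_date is None:
            continue  # some messages have no Date header
        try:
            sent = parsedate_to_datetime(str(raw_date))
        except (TypeError, ValueError):
            continue  # malformed date: skip rather than guess
        # Assumption: use the hour as stated in the header's own UTC offset.
        counts[(sent.weekday(), sent.hour)] += 1
    return counts


def count_mbox(path):
    """Run the count over every message of an mbox file."""
    return count_by_weekday_and_hour(mailbox.mbox(path))


if __name__ == "__main__":
    sample = [message_from_string("Date: Mon, 01 Jan 2024 10:30:00 +0000\n\nhello")]
    print(count_by_weekday_and_hour(sample))
```

The resulting counter maps directly onto a 7 by 24 grid, which is the shape a heatmap needs.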

Usage

You need an export of your mailbox in mbox format (for Gmail you can get it from Google Takeout).

Install the tool using pip:

python3 -m pip install mailbox-report-generator

Then run this command:

generate_mbox_report "/path/to/the/mbox/file.mbox"

A report will be created as an HTML file and opened in your default browser.

Extending the report

The report is generated by running every message through a series of Processors. Each Processor implements its own logic to aggregate relevant details and can output the report as an HTML string.

These strings are simply concatenated to generate a static HTML file. Two processors output the header and footer of this file.

This structure makes it quite easy to add or remove specific analyses, run automated tests, and implement caching.
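
The structure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the class names (`Processor`, `HeaderProcessor`, `FooterProcessor`, `TopSendersProcessor`), the method names `process` and `render`, and the function `build_report` are all hypothetical.

```python
from abc import ABC, abstractmethod
from collections import Counter
from email import message_from_string
from email.message import Message
from email.utils import parseaddr
from html import escape


class Processor(ABC):
    """Hypothetical base class: sees every message, then renders HTML."""

    @abstractmethod
    def process(self, message: Message) -> None:
        """Aggregate whatever this processor needs from one message."""

    @abstractmethod
    def render(self) -> str:
        """Return this processor's part of the report as an HTML string."""


class HeaderProcessor(Processor):
    """Ignores messages and only emits the opening of the HTML file."""

    def process(self, message: Message) -> None:
        pass

    def render(self) -> str:
        return "<html><body>"


class FooterProcessor(Processor):
    """Ignores messages and only emits the closing of the HTML file."""

    def process(self, message: Message) -> None:
        pass

    def render(self) -> str:
        return "</body></html>"


class TopSendersProcessor(Processor):
    """Counts sender addresses and lists the most frequent ones."""

    def __init__(self, top: int = 5) -> None:
        self.top = top
        self.counts = Counter()

    def process(self, message: Message) -> None:
        _name, address = parseaddr(str(message.get("From", "")))
        if address:
            self.counts[address.lower()] += 1

    def render(self) -> str:
        rows = "".join(
            f"<li>{escape(address)}: {count}</li>"
            for address, count in self.counts.most_common(self.top)
        )
        return f"<h2>Most active addresses</h2><ul>{rows}</ul>"


def build_report(messages, processors) -> str:
    """Feed every message to every processor, then concatenate the output."""
    for message in messages:
        for processor in processors:
            processor.process(message)
    return "".join(processor.render() for processor in processors)


if __name__ == "__main__":
    sample = [message_from_string("From: Ann <ann@example.com>\n\nhello")]
    print(build_report(sample, [HeaderProcessor(), TopSendersProcessor(), FooterProcessor()]))
```

Because each processor only depends on the messages it is given, it can be tested in isolation with a handful of hand-written messages, and adding or removing an analysis amounts to changing the list passed to `build_report`.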

Possible future improvements

  • Examine mail length and word usage over time
    • Note that extracting text from mails is very hard: the multipart format and the unusual formats used by advertising e-mails make it an extremely unreliable operation.
  • Examine the textual content of the emails with spaCy and retrieve named entities such as people and locations (see the note on the previous point)
  • Find which languages are used in the mail body
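
To give an idea of why text extraction is unreliable, here is a best-effort sketch using the standard `email` package. It is only an illustration: the function name `extract_plain_text` is hypothetical, and the strategy (keep every `text/plain` part that is not an attachment, ignore everything else) is an assumption, not something the tool currently does.

```python
from email import message_from_string
from email.message import Message


def extract_plain_text(message: Message) -> str:
    """Concatenate all non-attachment text/plain parts (best effort).

    Hypothetical helper, not part of mailbox-report-generator.
    """
    chunks = []
    for part in message.walk():
        # Multipart containers and HTML parts are skipped here.
        if part.get_content_type() != "text/plain":
            continue
        if part.get_content_disposition() == "attachment":
            continue
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # The declared charset is unknown to Python: fall back to UTF-8.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


if __name__ == "__main__":
    raw = (
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/alternative; boundary="XX"\n'
        "\n"
        "--XX\n"
        'Content-Type: text/plain; charset="utf-8"\n'
        "\n"
        "Hello world\n"
        "--XX\n"
        'Content-Type: text/html; charset="utf-8"\n'
        "\n"
        "<p>Hello world</p>\n"
        "--XX--\n"
    )
    text = extract_plain_text(message_from_string(raw))
    print(len(text.split()), "words")
```

Even this simple approach returns an empty string for HTML-only mails and may include quoted replies or signatures, which is part of what makes length and word statistics hard to trust.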

