Skip to main content

A Python library to parse email files

Project description

Coverage Status CLA assistant CircleCI

parse-emails

Parses an email message file and extracts the data from it.

The key features are:

  • Supports .eml and .msg files.
  • Extracts nested attachments.

Requirements

Python 3.8.5+

Installation

$ pip install parse-emails

Usage

The main class ParseEmails contains all what you need to parse an email:

import parse_emails

email = parse_emails.ParseEmails(file_path=<file_path>, max_depth=3, parse_only_headers=False)
email.parse_emails()
print(email.parsed_email['Subject'])

Inputs

Argument Name Description
file_path* the file_path of a in msg or eml format
parse_only_headers Will parse only the headers and return headers table, Default is False
max_depth How many levels deep we should parse the attached emails (e.g. email contains an emails contains an email). Default depth level is 3. Minimum level is 1, if set to 1 the script will parse only the first level email
file_info the file info

Outputs


Path Description
To This shows to whom the message was addressed, but may not contain the recipient's address.
CC Email 'cc' addresses
From This displays who the message is from, however, this can be easily forged and can be the least reliable.
Subject Email subject
HTML Email 'html' body if exists
Text Email 'text' body if exists
Depth The depth of the email. Depth=0 for the first level email. If email1 contains email2 contains email3. Then email1 depth is 0, email2 depth is 1, email3 depth is 2
HeadersMap The full email headers json
HeadersMap.From This displays who the message is from, however, this can be easily forged and can be the least reliable.
HeadersMap.To This shows to whom the message was addressed, but may not contain the recipient's address.
HeadersMap.Subject Email subject
HeadersMap.Date The date and time the email message was composed
HeadersMap.CC Email 'cc' addresses
HeadersMap.Reply-To The email address for return mail
HeadersMap.Received List of all the servers/computers through which the message traveled
HeadersMap.Message-ID A unique string assigned by the mail system when the message is first created. These can easily be forged. (e.g. 5c530c1b.1c69fb81.bd826.0eff@mx.google.com)
AttachmentNames The list of attachment names in the email
Format The format of the email if available

Contributing

Contributions are welcome and appreciated. To contribute you can submit a PR. We suggest contancting us before submitting a PR to discuss your intentions and plans.

Before merging any PRs, we need all contributors to sign a contributor license agreement. By signing a contributor license agreement, we ensure that the community is free to use your contributions.

When you open a new pull request, a bot will evaluate whether you have signed the CLA. If required, the bot will comment on the pull request, including a link to accept the agreement. The CLA document is also available for review as a PDF.

If the license/cla status check remains on Pending, even though all contributors have accepted the CLA, you can recheck the CLA status by visiting the following link (replace [PRID] with the ID of your PR): https://cla-assistant.io/check/demisto/email-parser?pullRequest=[PRID] .

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

parse-emails-0.0.1.tar.gz (7.3 MB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page