📧 Email reply parser library for Python with multi-language support

Mail Reply Parser 📧🐍

Multi-language email reply parsing for international environments 🌍

Mail clients format replies differently, which makes reliable parsing difficult. Thank god we have standards. This library splits text-based emails into separate replies, using the common headers that mail clients in different languages insert to mark where one message ends and the quoted one begins.

Each reply can be returned either as the whole mail message body or with headers, signatures and common disclaimers stripped. Currently supported languages are:

  • Danish (da) 🇩🇰
  • Dutch (nl) 🇳🇱
  • English (en) 🇬🇧
  • French (fr) 🇫🇷
  • German (de) 🇩🇪
  • Italian (it) 🇮🇹
  • Japanese (ja) 🇯🇵
  • Polish (pl) 🇵🇱
  • Swedish (sv) 🇸🇪
  • Czech (cs) 🇨🇿 (untested - contributions welcome!)
  • Spanish (es) 🇪🇸 (untested - contributions welcome!)
  • Korean (ko) 🇰🇷 (untested - contributions welcome!)
  • Chinese (zh) 🇨🇳 (untested - contributions welcome!)

🏳️‍🌈 Adding more languages is quite easy!

This is an improved Python implementation of GitHub's Ruby-based email_reply_parser and an adaptation of Zapier's email-reply-parser. Both split mails into fragments instead of distinct replies, and both support only English.

⭐ Features

⭐ Easy to implement
⭐ Multilanguage Support
⭐ Text-based mail parsing
⭐ Detect headers, signatures and disclaimers
⭐ Fully type annotated
⭐ Easy-to-read code and well-tested

Overview 🔭

This library splits an incoming mail into its individual replies and provides the text content of each one, with or without signatures, disclaimers and headers.

For example, it can turn the following email:

Awesome! I haven't had another problem with it.

Thanks,
alfonsrv

On Wed, Dec 20, 2023 at 13:37, RAUSYS <info@rausys.de> wrote:

> The good news is that I've found a much better query for lastLocation.
> It should run much faster now. Can you double-check?

Into just the replied text content:

Awesome! I haven't had another problem with it.
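
To give a feel for the idea behind header-based splitting, here is a minimal standard-library sketch. It is not the library's implementation: `HEADER_PATTERN` and `split_replies` are hypothetical names, and the pattern only covers the English "On ... wrote:" header from the example above, whereas the library's own `HEADER_REGEX` depends on the configured languages.

```python
import re
from typing import List

# Hypothetical, English-only header pattern used for illustration.
# The library's real header detection covers many clients and languages.
HEADER_PATTERN = re.compile(r"^On .+ wrote:\s*$", re.MULTILINE)


def split_replies(text: str) -> List[str]:
    """Split a plain-text mail into chunks, each beginning at a reply header."""
    # Every header match starts a new chunk; the first chunk starts at 0.
    starts = [0] + [match.start() for match in HEADER_PATTERN.finditer(text)]
    ends = starts[1:] + [len(text)]
    chunks = [text[start:end].strip() for start, end in zip(starts, ends)]
    # Drop empty chunks, e.g. when the mail itself begins with a header.
    return [chunk for chunk in chunks if chunk]


mail = (
    "Awesome! I haven't had another problem with it.\n"
    "\n"
    "Thanks,\n"
    "alfonsrv\n"
    "\n"
    "On Wed, Dec 20, 2023 at 13:37, RAUSYS <info@rausys.de> wrote:\n"
    "\n"
    "> The good news is that I've found a much better query for lastLocation.\n"
    "> It should run much faster now. Can you double-check?\n"
)

replies = split_replies(mail)
print(len(replies))
print(replies[0])
```

The first chunk still contains the "Thanks, alfonsrv" sign-off; removing signatures and disclaimers is a separate step that this sketch does not attempt.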

Get started 👾

Installation

pip install mail-parser-reply

Run tests

python -m unittest discover test

Parse Replies

from mailparser_reply import EmailReplyParser

mail_body = 'foobar'
languages = ['en', 'de']
mail_message = EmailReplyParser(languages=languages).read(text=mail_body)
print(mail_message.replies)

Or get only the latest reply using:

latest_reply = EmailReplyParser(languages=languages).parse_reply(text=mail_body)

Parser API

EmailMessage.text:              Mail body
EmailMessage.languages:         Languages to use for parsing headers
EmailMessage.replies:           List of EmailReply; single parsed replies
EmailMessage.include_english:   Always include English language for parsing
EmailMessage.keep_hyphen_lists: Don't capture hyphen marked lists as signatures
EmailMessage.default_language:  Default language to use if language dictionary
                                doesn't include any other language codes

EmailMessage.HEADER_REGEX:      RegEx for identifying headers, separating mails
EmailMessage.SIGNATURE_REGEX:   RegEx for identifying signatures
EmailMessage.DISCLAIMERS_REGEX: RegEx for identifying disclaimers

EmailMessage.read():            Parse EmailMessage.text into EmailReply objects,
                                which are stored in EmailMessage.replies
EmailReply.content:             Unprocessed mail body with headers, signatures, disclaimers
EmailReply.body:                Mail body without headers, signatures, disclaimers
EmailReply.full_body:           Mail body; just without headers

EmailReply.headers:             Identified headers
EmailReply.signatures:          Identified signatures
EmailReply.disclaimers:         Identified disclaimers
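
The difference between `content`, `full_body` and `body` can be pictured with a small standard-library sketch. Everything here is an assumption made for illustration: `ReplySketch`, `HEADER` and `SIGNATURE_START` are hypothetical stand-ins, the patterns are English-only, and disclaimers are left out. The real `EmailReply` relies on the library's own regular expressions.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns, far simpler than the library's own regexes.
HEADER = re.compile(r"^On .+ wrote:\s*$", re.MULTILINE)
SIGNATURE_START = re.compile(r"^(?:Thanks|Regards|Best regards),?[ \t]*$", re.MULTILINE)


@dataclass
class ReplySketch:
    """Illustrative stand-in for the three text views of a reply."""

    content: str  # unprocessed text, header and signature included

    @property
    def full_body(self) -> str:
        # Remove only the leading reply header.
        return HEADER.sub("", self.content, count=1).strip()

    @property
    def body(self) -> str:
        # Additionally cut everything from the first sign-off line onwards.
        text = self.full_body
        match = SIGNATURE_START.search(text)
        return text[: match.start()].rstrip() if match else text


reply = ReplySketch(
    content=(
        "On Wed, Dec 20, 2023 at 13:37, RAUSYS <info@rausys.de> wrote:\n"
        "\n"
        "Sounds good.\n"
        "\n"
        "Regards,\n"
        "RAUSYS"
    )
)

print(repr(reply.full_body))
print(repr(reply.body))
```

In this sketch `full_body` keeps the sign-off while `body` drops it, which mirrors the distinction described in the table above.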
