Library to extract message quotations and signatures.
Claw (https://github.com/tictail/claw) is a library to extract message quotations and signatures. It is a more lightweight version of the original talon library (https://github.com/mailgun/talon).
If you have ever tried to parse message quotations or signatures, you know that the absence of any formatting standards in this area can make the task a nightmare. This library aims to make it much easier.
Installation
pip install claw
Usage
Here’s how you initialize the library and extract a reply from a text message:
import claw
from claw import quotations
claw.init()
text = """Reply
-----Original Message-----
Quote"""
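# Generic entry point: pass the content type explicitly...
reply = quotations.extract_from(text, 'text/plain')
# ...or call the plain-text helper directly.
reply = quotations.extract_from_plain(text)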
# reply == "Reply"
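To give a rough idea of what quotation extraction involves, here is a minimal standalone sketch that cuts a plain-text message at an "Original Message" splitter line. It is not claw's implementation: the `ORIGINAL_MESSAGE` pattern and the `strip_quotation` helper are hypothetical names made up for this illustration, and a real extractor has to recognize many more mail-client formats.

```python
import re

# Hypothetical splitter pattern for illustration only; it recognizes just one
# of the many quotation headers that mail clients produce.
ORIGINAL_MESSAGE = re.compile(
    r"^-+\s*Original Message\s*-+\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_quotation(text: str) -> str:
    """Return the text that precedes the first 'Original Message' line."""
    match = ORIGINAL_MESSAGE.search(text)
    if match is None:
        # No splitter found: treat the whole message as the reply.
        return text.strip()
    return text[:match.start()].strip()


sample = """Reply
-----Original Message-----
Quote"""
print(strip_quotation(sample))
```

A message without a splitter line is returned unchanged apart from surrounding whitespace, which is the conservative choice when nothing recognizable is found.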
To extract a reply from html:
html = """Reply
<blockquote>
<div>
On 11-Apr-2011, at 6:54 PM, Bob &lt;bob@example.com&gt; wrote:
</div>
<div>
Quote
</div>
</blockquote>"""
# Generic entry point: pass the content type explicitly...
reply = quotations.extract_from(html, 'text/html')
# ...or call the HTML helper directly.
reply = quotations.extract_from_html(html)
# reply == "<html><body><p>Reply</p></body></html>"
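For intuition, the idea of dropping quoted HTML can be sketched with only the standard library. This is an assumption-laden simplification, not how claw works internally: the `ReplyTextExtractor` class and `reply_text` function are hypothetical, they treat every `<blockquote>` as quoted material, and they return plain text rather than cleaned-up HTML.

```python
from html.parser import HTMLParser


class ReplyTextExtractor(HTMLParser):
    """Collect text that appears outside any <blockquote> element."""

    def __init__(self):
        super().__init__()
        self.blockquote_depth = 0  # > 0 while inside quoted material
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.blockquote_depth += 1

    def handle_endtag(self, tag):
        if tag == "blockquote" and self.blockquote_depth > 0:
            self.blockquote_depth -= 1

    def handle_data(self, data):
        # Keep only text that is not nested inside a blockquote.
        if self.blockquote_depth == 0:
            self.chunks.append(data)


def reply_text(html: str) -> str:
    """Return the unquoted text of an HTML message (hypothetical helper)."""
    parser = ReplyTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


sample = "Reply<blockquote><div>Quote</div></blockquote>"
print(reply_text(sample))
```

Real messages often mark quotations with client-specific containers instead of `<blockquote>`, which is why a dedicated library is usually the better choice.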
Often the best way is the easiest one. Here’s how you can extract a signature from an email message without any fancy machine learning:
from claw.signature import extract_signature
message = """Wow. Awesome!
--
Bob Smith"""
text, signature = extract_signature(message)
# text == "Wow. Awesome!"
# signature == "--\nBob Smith"
It is quick and works like a charm 90% of the time. For the other 10%, you can use the power of machine learning algorithms; see the original talon implementation.
Development
virtualenv venv
source venv/bin/activate
make install
make test
Release a new version:
Bump the version in setup.py, update CHANGELOG.md, and then run:
make release