Library to extract message quotations and signatures.
Claw, https://github.com/tictail/claw, is a library to extract message quotations and signatures. It is a more lightweight version of the original https://github.com/mailgun/talon library.
If you have ever tried to parse message quotations or signatures, you know that the absence of any formatting standards in this area can make the task a nightmare. Hopefully this library will make your life much easier.
Installation
pip install claw
Usage
Here’s how you initialize the library and extract a reply from a text message:
    import claw
    from claw import quotations

    claw.init()

    text = """Reply
    -----Original Message-----
    Quote"""

    reply = quotations.extract_from(text, 'text/plain')
    reply = quotations.extract_from_plain(text)
    # reply == "Reply"
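To give a feel for what splitting a plain-text reply from its quotation involves, here is a minimal standard-library sketch. It is not claw's implementation: the `strip_quotation` function and the `SPLITTER_PATTERNS` list are hypothetical names introduced for illustration, and the two patterns cover only the marker styles that appear in the examples in this README.

```python
import re

# Hypothetical marker patterns for illustration only; a real library
# would need many more, covering other mail clients and languages.
SPLITTER_PATTERNS = [
    re.compile(r"^-+\s*Original Message\s*-+$", re.IGNORECASE),
    re.compile(r"^On .+ wrote:$"),
]


def strip_quotation(text):
    """Return the text that precedes the first recognized quotation marker."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Stop at a ">"-quoted line or at any known splitter line.
        if stripped.startswith(">") or any(
            pattern.match(stripped) for pattern in SPLITTER_PATTERNS
        ):
            break
        kept.append(line)
    return "\n".join(kept).strip()


message = "Reply\n-----Original Message-----\nQuote"
reply = strip_quotation(message)
```

A message with no recognized marker is returned unchanged apart from whitespace trimming, which is usually the safer failure mode for this kind of heuristic.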
To extract a reply from html:
    html = """Reply
    <blockquote>
      <div>
        On 11-Apr-2011, at 6:54 PM, Bob <bob@example.com> wrote:
      </div>
      <div>
        Quote
      </div>
    </blockquote>"""

    reply = quotations.extract_from(html, 'text/html')
    reply = quotations.extract_from_html(html)
    # reply == "<html><body><p>Reply</p></body></html>"
Often the best way is the easiest one. Here’s how you can extract a signature from an email message without any fancy machine learning:
    from claw.signature import extract_signature

    message = """Wow. Awesome!
    --
    Bob Smith"""

    text, signature = extract_signature(message)
    # text == "Wow. Awesome!"
    # signature == "--\nBob Smith"
It is quick and works like a charm 90% of the time. For the other 10% you can use the power of machine learning algorithms; see the original talon implementation.
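As an illustration of the kind of rule a non-ML signature extractor can rely on, the sketch below splits a message at a conventional `--` delimiter line. It is a simplified assumption, not claw's actual logic: `split_signature`, `SIGNATURE_DELIMITER`, and `MAX_SIGNATURE_LINES` are hypothetical names, and real signatures often lack a delimiter altogether.

```python
import re

# Assumption: a signature begins at a line containing only "--",
# optionally followed by whitespace.
SIGNATURE_DELIMITER = re.compile(r"^--\s*$")

# Hypothetical limit: only look for the delimiter near the end of the
# message, so a "--" line in the middle of a long body is ignored.
MAX_SIGNATURE_LINES = 10


def split_signature(message):
    """Return (body, signature); signature is None if no delimiter is found."""
    lines = message.splitlines()
    first_candidate = max(0, len(lines) - MAX_SIGNATURE_LINES)
    # Scan upward from the last line toward the start of the window.
    for index in range(len(lines) - 1, first_candidate - 1, -1):
        if SIGNATURE_DELIMITER.match(lines[index]):
            body = "\n".join(lines[:index]).strip()
            signature = "\n".join(lines[index:]).strip()
            return body, signature
    return message, None


body, signature = split_signature("Wow. Awesome!\n--\nBob Smith")
```

Scanning from the bottom means that if several delimiter lines fall inside the window, the one closest to the end of the message wins.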
Development
    virtualenv venv
    source venv/bin/activate
    make install
    make test
To release a new version:
make release