Pseudonymize email content in Romance languages

Project description

mailcom

A tool that parses the body of an email from its raw text (an eml file) and retains only the text, with names removed, for French or Spanish emails.
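
The body-extraction step described above can be sketched with the Python standard library alone. This is an illustration, not mailcom's own code: the function name extract_body and the sample message are hypothetical, and the sketch assumes the email carries a plain-text part.

```python
from email import message_from_string
from email.policy import default


def extract_body(eml_text: str) -> str:
    """Return the plain-text body of a raw email, or "" if there is none."""
    # policy=default yields an EmailMessage, which provides get_body().
    msg = message_from_string(eml_text, policy=default)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content().strip() if part is not None else ""


# Hypothetical sample message, standing in for the contents of an eml file.
raw = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: Bonjour\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Bonjour Bob, merci pour votre message.\n"
)

print(extract_body(raw))  # Bonjour Bob, merci pour votre message.
```

Headers such as From and Subject are dropped because only the body part is read; removing names inside the body is a separate step.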

Installation

Install using
python -m pip install mailcom

You will also need to download the French and Spanish models for spaCy and Stanza using the provided script. Run this in the terminal:

./get-models.sh

For an overview of the available languages and models, check the spaCy website.

Usage

The package uses spaCy for sentence splitting, based on the default language models, and transformers for named entity recognition (NER). Currently, you have to set the language and the eml file directory manually at the top of parse.py; the default directory is data/in. Then run python parse.py. After the run, the output can be found in data/out.
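
The overall flow (read files from an input directory, replace detected names, write the results to an output directory) might look roughly like the sketch below. It does not reproduce parse.py: the names replace_spans, process_directory and find_names, the "[name]" placeholder and the .txt output extension are all assumptions made for illustration. The NER step is represented by a caller-supplied function that returns character spans, and each file's text is assumed to be an already extracted body.

```python
from pathlib import Path


def replace_spans(text, spans, placeholder="[name]"):
    """Replace each (start, end) character span in text with a placeholder."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


def process_directory(in_dir, out_dir, find_names):
    """Pseudonymize every .eml file in in_dir and write the result to out_dir.

    find_names is a hypothetical stand-in for the NER step: it takes a string
    and returns a list of (start, end) spans that cover person names.
    """
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for eml_path in sorted(in_dir.glob("*.eml")):
        text = eml_path.read_text(encoding="utf-8")
        cleaned = replace_spans(text, find_names(text))
        out_path = out_dir / (eml_path.stem + ".txt")
        out_path.write_text(cleaned, encoding="utf-8")
        written.append(out_path)
    return written


# Spans are given by hand here; in practice they would come from an NER model.
print(replace_spans("Bonjour Marie, ici Paul.", [(8, 13), (19, 23)]))
# Bonjour [name], ici [name].
```

A call such as process_directory("data/in", "data/out", find_names) would mirror the default directories mentioned above.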

Download files

Source Distribution

mailcom-0.0.1.tar.gz (6.0 kB)

Uploaded Source

Built Distribution

mailcom-0.0.1-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file mailcom-0.0.1.tar.gz.

File metadata

  • Download URL: mailcom-0.0.1.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for mailcom-0.0.1.tar.gz
Algorithm    Hash digest
SHA256       98ee29553951d8568dec8d311f9384afc00a2a7511df6243dc3f6bb421781d88
MD5          b8cd3aeb397d7e24e3e7f46e9e3c0476
BLAKE2b-256  a7a1d1107ca1c4f44a1f6e7887cdc40a463439160cbca5d73a069f8b77ace4e2

File details

Details for the file mailcom-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: mailcom-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.9

File hashes

Hashes for mailcom-0.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       c241a16dee6006efe29dce404e708fdc970d1b4631932a11de3f9002a91107c2
MD5          678736bfa20e2f53247f66e3e5154d14
BLAKE2b-256  6c3981f6accfc420bf220e4a51b15d3fb57f79b2f27a7ae4ba4cc076085d0b72
