Tool for extracting emails from PDF and DOCX files (designed especially for resumes).

Project description

PyEmailExtractor

PyEmailExtractor is a tool for extracting email addresses from PDF and DOCX files, with a focus on resumes. This can be useful for applications such as recruitment, data analysis, or contact management.

Features

  • Extract email addresses from PDF files.
  • Extract email addresses from DOCX files.
  • Designed specifically for resumes, with the aim of accurate email extraction.
  • Simple and easy to use.
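
To illustrate the kind of matching involved once a document's text is available, here is a minimal sketch that finds email-like strings with a regular expression. It is not PyEmailExtractor's actual implementation: the pattern and the helper name `find_emails` are assumptions made for illustration, and the pattern is deliberately simpler than a full address grammar.

```python
import re

# Simplified email pattern (an assumption for illustration, not a full
# RFC 5322 matcher and not necessarily what PyEmailExtractor uses).
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return email-like strings found in text, in order, without duplicates."""
    seen = set()
    emails = []
    for match in EMAIL_PATTERN.findall(text):
        if match not in seen:
            seen.add(match)
            emails.append(match)
    return emails


# Hypothetical resume header line used as sample input.
sample = "Jane Doe | jane.doe@example.com | +1 555 0100 | backup: jd@example.org"
print(find_emails(sample))
```

Keeping the first occurrence of each address preserves the order in which addresses appear in the text, which on a resume usually puts the primary contact first.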

Installation

You can install PyEmailExtractor using pip:

pip install PyEmailExtractor

Example Usage

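from PyEmailExtractor import extract_emails

# Directory containing the resume files.
# Named resume_dir rather than dir to avoid shadowing Python's built-in dir().
resume_dir = "/home/username/Downloads/resumes"

list_emails = extract_emails(resume_dir)

print(list_emails)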
  • The example above prints the extracted email addresses as a list.
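
If the same address appears in several resumes, the returned list may contain repeats. The following sketch shows one possible way to tidy the result. It assumes the return value is a flat list of strings, which the description above suggests but does not spell out; `normalize_emails` is a hypothetical helper, not part of PyEmailExtractor, and the sample list stands in for real output.

```python
def normalize_emails(emails):
    """Strip whitespace, lowercase, and de-duplicate email strings; return them sorted."""
    cleaned = {email.strip().lower() for email in emails if email and email.strip()}
    return sorted(cleaned)


# Stand-in for the list returned by extract_emails (hypothetical sample data).
list_emails = ["Jane.Doe@example.com", "jane.doe@example.com ", "jd@example.org"]
print(normalize_emails(list_emails))
```

Lowercasing treats addresses that differ only in case as the same address, which matches how most mail providers behave. Skip that step if exact casing matters for your use case.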

Requirements

PyEmailExtractor depends on the following packages:

  • docx2txt==0.8
  • lxml==4.9.2
  • PyPDF2==3.0.1
  • python-docx==0.8.11
  • PyMuPDF==1.22.5
  • pytesseract

Contributing

Contributions are welcome! If you encounter any issues or have suggestions for improvements, please open an issue on the GitHub repository.

License

PyEmailExtractor is released under the MIT License. See the LICENSE file for more details.

Download files

Source Distribution

PyEmailExtractor-0.2.1.tar.gz (2.9 kB view details)

Uploaded Source

File details

Details for the file PyEmailExtractor-0.2.1.tar.gz.

File metadata

  • Download URL: PyEmailExtractor-0.2.1.tar.gz
  • Upload date:
  • Size: 2.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for PyEmailExtractor-0.2.1.tar.gz
Algorithm Hash digest
SHA256 836e45a0041adec62143ea33449f70ff1889abf1fb1cfdbdff4d6a022942bede
MD5 a26a94a25137db56884a9a09374874a8
BLAKE2b-256 b293d8ef97080ebd0d5620e3ef925702674f1498e5cb790fc4d0aae80695913d
