A package for parsing resumes and extracting entities.

PyResumeParser

PyResumeParser is a Python package that parses resume PDF files and extracts key entities such as names, emails, phone numbers, education details, skills, and more. It uses pdfminer.six for PDF text extraction and spaCy for natural language processing. It requires Python 3.10 or higher.

Installation

You can install PyResumeParser using pip:

pip install pyresumeparser

Usage

As a Python Module

To use PyResumeParser in your Python code, import the package and call the parse_resume function with the path to a PDF file:

import pyresumeparser

pdf_file = "resume.pdf"
parsed_resume = pyresumeparser.parse_resume(pdf_file)
print(parsed_resume)
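
If the path may come from user input, it can help to validate it before parsing. The following is a minimal sketch, not part of the package: parse_resume_safely is a hypothetical helper, and the optional parser argument exists only so the checks can be exercised without PyResumeParser installed. It assumes nothing about what parse_resume returns and simply passes the result through.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def parse_resume_safely(
    pdf_path: str,
    parser: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Check that pdf_path looks like an existing PDF, then parse it.

    Hypothetical helper: `parser` defaults to pyresumeparser.parse_resume.
    """
    path = Path(pdf_path)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got {path.name!r}")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    if parser is None:
        # Imported lazily so the path checks above run even when the
        # package is not installed.
        import pyresumeparser

        parser = pyresumeparser.parse_resume
    return parser(str(path))
```

Calling parse_resume_safely("resume.pdf") then behaves like the call above, but raises a descriptive error early when the file is missing or is not a PDF.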

From the Terminal

You can also use PyResumeParser directly from the terminal:

pyresumeparser resume.pdf

This command parses the specified PDF file and prints the extracted entities in JSON format.
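
Because the command prints JSON, its output can be consumed from another program. The sketch below is one way to do that from Python with the standard library. parse_with_cli is a hypothetical helper, and it assumes that standard output contains a single JSON document and nothing else (no progress bars or log lines), which may not hold for every version.

```python
import json
import subprocess
from typing import Any, Sequence


def parse_with_cli(
    pdf_path: str,
    command: Sequence[str] = ("pyresumeparser",),
) -> Any:
    """Run the command-line tool on one PDF and decode its output as JSON."""
    completed = subprocess.run(
        [*command, pdf_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    # Assumption: stdout holds exactly one JSON document.
    return json.loads(completed.stdout)
```

If the tool writes anything else to standard output, json.loads raises json.JSONDecodeError, so callers may want to catch that error and inspect the raw text.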

Example Output

Here is an example of the JSON output you might get from parsing a resume:

{
  "first_name": ["John"],
  "last_name": ["Doe"],
  "email": ["johndoe@example.com"],
  "phone": ["+1 234 567 890"],
  "country": ["USA"],
  "state": ["California"],
  "city": ["San Francisco"],
  "pincode": ["94107"],
  "college_name": ["University of Example"],
  "education": ["BSc Computer Science"],
  "designation": ["Software Engineer"],
  "position_held": ["Lead Developer"],
  "companies_worked": ["Tech Company Inc."],
  "projects_worked": ["Project A", "Project B"],
  "skills": ["Python", "Machine Learning", "Data Analysis"],
  "total_experience": ["5 years"],
  "language": ["English"],
  "linkedin": ["https://linkedin.com/in/johndoe"],
  "github": ["https://github.com/johndoe"]
}
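
In the example above, every field holds a list of strings, even single-valued ones such as email. A minimal sketch of reading such a result follows. It assumes the parsed data is available as a dictionary shaped like the example, and first_value and contact_summary are hypothetical helpers, not functions provided by the package.

```python
from typing import Dict, List, Mapping, Optional, Sequence, Union


def first_value(parsed: Mapping[str, Sequence[str]], key: str) -> Optional[str]:
    """Return the first entry under `key`, or None if it is missing or empty."""
    values = parsed.get(key) or []
    return values[0] if values else None


def contact_summary(
    parsed: Mapping[str, Sequence[str]],
) -> Dict[str, Union[str, List[str], None]]:
    """Collapse the list-valued fields into a small contact record."""
    name_parts = [first_value(parsed, "first_name"), first_value(parsed, "last_name")]
    return {
        "name": " ".join(part for part in name_parts if part) or None,
        "email": first_value(parsed, "email"),
        "phone": first_value(parsed, "phone"),
        "skills": list(parsed.get("skills") or []),
    }


# A subset of the example output shown above.
example = {
    "first_name": ["John"],
    "last_name": ["Doe"],
    "email": ["johndoe@example.com"],
    "phone": ["+1 234 567 890"],
    "skills": ["Python", "Machine Learning", "Data Analysis"],
}
print(contact_summary(example))
```

Using a lookup that tolerates missing or empty fields avoids an IndexError when a resume does not contain a given entity.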

Requirements

Python version required: 3.10 or higher.

PyResumeParser depends on the following packages, which pip installs automatically along with pyresumeparser:

  • spacy==3.7.4
  • pdfminer.six==20231228
  • spacy-transformers==1.3.5
  • tqdm==4.66.4

To install them manually, run pip against the project's requirements.txt file:

pip install -r requirements.txt
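
To confirm that an environment matches the requirements above, the installed versions can be compared with the pinned ones. This is a minimal sketch using only the standard library. find_mismatches and python_is_supported are hypothetical helper names, and the pins are copied from the list above.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Mapping, Optional, Tuple

# Pins copied from the requirements list above.
PINNED = {
    "spacy": "3.7.4",
    "pdfminer.six": "20231228",
    "spacy-transformers": "1.3.5",
    "tqdm": "4.66.4",
}


def python_is_supported() -> bool:
    """True when the running interpreter is Python 3.10 or higher."""
    return sys.version_info >= (3, 10)


def find_mismatches(
    pins: Mapping[str, str] = PINNED,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs from its pin
    to a (wanted, installed) pair; installed is None when not installed."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        try:
            installed: Optional[str] = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("Mismatched packages:", find_mismatches())
```

An empty dictionary from find_mismatches means every pinned package is installed at exactly the listed version.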

Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an issue on GitHub.

License

This project is licensed under the MIT License.

Author

Developed by Palash Khan. Feel free to reach out with any questions or feedback.

Happy parsing!

Download files

Source Distribution

pyresumeparser-0.0.9.tar.gz (4.1 kB)

Uploaded Source

Built Distribution

pyresumeparser-0.0.9-py3-none-any.whl (4.6 kB)

Uploaded Python 3

File details

Details for the file pyresumeparser-0.0.9.tar.gz.

File metadata

  • Download URL: pyresumeparser-0.0.9.tar.gz
  • Upload date:
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.11

File hashes

Hashes for pyresumeparser-0.0.9.tar.gz
Algorithm    Hash digest
SHA256       66807f2048858be2441d353eec2eada860c21cc10cc58a24d19b037a286c5ba4
MD5          e5dc5156943b48bea3dba148a33f1030
BLAKE2b-256  b97b520196cc3c7c1a4e6dd8929e20a19489a83fb469234a55a127dd511877b9

File details

Details for the file pyresumeparser-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: pyresumeparser-0.0.9-py3-none-any.whl
  • Upload date:
  • Size: 4.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.11

File hashes

Hashes for pyresumeparser-0.0.9-py3-none-any.whl
Algorithm    Hash digest
SHA256       616c1e867448ca301f842c04f32eebff3a7d1847769c846c34eed1dd929acecb
MD5          76f7caa533c3af5f51e6c88e356fe21f
BLAKE2b-256  a1cf7133d056ed966793c4b9d4ebbed1b8602cfc26da2a2858986e752ef8fbcf
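
The published digests can be used to check a downloaded file before installing it. The sketch below computes a file's SHA256 with hashlib and compares it with the values listed above. sha256_of and verify_download are hypothetical helper names, and the script assumes the downloaded file keeps its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "pyresumeparser-0.0.9.tar.gz":
        "66807f2048858be2441d353eec2eada860c21cc10cc58a24d19b037a286c5ba4",
    "pyresumeparser-0.0.9-py3-none-any.whl":
        "616c1e867448ca301f842c04f32eebff3a7d1847769c846c34eed1dd929acecb",
}


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str) -> bool:
    """True when the file's SHA256 matches the digest listed for its name.

    Raises KeyError if the file name is not one of the two listed files.
    """
    expected = EXPECTED_SHA256[Path(path).name]
    return sha256_of(path) == expected
```

A False result means the file differs from the one described here and should not be installed.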
