A package for parsing resumes and extracting entities.
PyResumeParser
PyResumeParser is a Python package that parses resume PDF files and extracts key entities such as names, email addresses, phone numbers, education details, and skills. It uses pdfminer.six for PDF text extraction and spaCy for natural language processing. It requires Python 3.10 or higher.
Installation
You can install PyResumeParser using pip:
pip install pyresumeparser
Usage
As a Python Module
To use PyResumeParser in your Python code, you can import the package and call the parse_resume function:
import pyresumeparser
pdf_file = "resume.pdf"
parsed_resume = pyresumeparser.parse_resume(pdf_file)
print(parsed_resume)
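Before handing a path to parse_resume, it can help to check that the file exists and looks like a PDF. The sketch below is one way to do that. safe_parse is a hypothetical helper name, not part of PyResumeParser, and the only package call it relies on is parse_resume(path) as shown above.

```python
from pathlib import Path


def safe_parse(pdf_path):
    """Validate the path, then delegate to pyresumeparser.parse_resume.

    safe_parse is a hypothetical helper, not part of the package.
    """
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {path.name}")

    # Imported lazily so the checks above run even if the package is missing.
    import pyresumeparser

    return pyresumeparser.parse_resume(str(path))
```

Failing early with a clear exception keeps path mistakes separate from errors raised during parsing itself.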
From the Terminal
You can also use PyResumeParser directly from the terminal:
pyresumeparser resume.pdf
This command will parse the specified PDF file and print the extracted entities in JSON format.
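If you want to call the command-line tool from a script, one option is to capture its output and decode the JSON. This is a sketch under two assumptions: that the pyresumeparser command is on your PATH, and that it writes only the JSON document to standard output. run_cli and decode_cli_output are hypothetical helper names, not part of the package.

```python
import json
import subprocess


def decode_cli_output(stdout_text):
    """Decode the JSON text printed by the command into a dict."""
    data = json.loads(stdout_text)
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data


def run_cli(pdf_path):
    """Run `pyresumeparser <pdf_path>` and return the decoded entities.

    Assumes the command prints nothing but JSON on standard output.
    """
    result = subprocess.run(
        ["pyresumeparser", pdf_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return decode_cli_output(result.stdout)
```

If the command prints anything else on standard output, json.loads will raise json.JSONDecodeError, so the second assumption is worth verifying against your installed version.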
Example Output
Here is an example of the JSON output you might get from parsing a resume:
{
"first_name": ["John"],
"last_name": ["Doe"],
"email": ["johndoe@example.com"],
"phone": ["+1 234 567 890"],
"country": ["USA"],
"state": ["California"],
"city": ["San Francisco"],
"pincode": ["94107"],
"college_name": ["University of Example"],
"education": ["BSc Computer Science"],
"designation": ["Software Engineer"],
"position_held": ["Lead Developer"],
"companies_worked": ["Tech Company Inc."],
"projects_worked": ["Project A", "Project B"],
"skills": ["Python", "Machine Learning", "Data Analysis"],
"total_experience": ["5 years"],
"language": ["English"],
"linkedin": ["https://linkedin.com/in/johndoe"],
"github": ["https://github.com/johndoe"]
}
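Every field in the example above holds a list, even when there is only one value. If downstream code prefers scalars for single-valued fields, a small helper can collapse them. This sketch assumes the parsed result is a dictionary shaped like the example; first_value and summarize are hypothetical names introduced for illustration.

```python
def first_value(parsed, key, default=None):
    """Return the first item of a list-valued field, or a default."""
    values = parsed.get(key) or []
    return values[0] if values else default


def summarize(parsed):
    """Build a compact summary from a result shaped like the example output."""
    first = first_value(parsed, "first_name", "")
    last = first_value(parsed, "last_name", "")
    return {
        "name": f"{first} {last}".strip(),
        "email": first_value(parsed, "email"),
        "skills": list(parsed.get("skills") or []),
    }


# Sample data copied from the example output above.
sample = {
    "first_name": ["John"],
    "last_name": ["Doe"],
    "email": ["johndoe@example.com"],
    "skills": ["Python", "Machine Learning", "Data Analysis"],
}
print(summarize(sample))
```

Falling back to a default for missing or empty fields means a resume with no detected email, for instance, does not raise an IndexError.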
Requirements
Python version required: 3.10 or higher.
PyResumeParser depends on the following packages, which pip installs automatically along with pyresumeparser:
- spacy==3.7.4
- pdfminer.six==20231228
- spacy-transformers==1.3.5
- tqdm==4.66.4
To install these packages manually instead, run pip against the project's requirements.txt file:
pip install -r requirements.txt
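To confirm that an environment matches the requirements listed above, you can compare the installed versions with the pins. The following is a minimal sketch using the standard library's importlib.metadata. The names PINNED, python_ok, installed_version, and find_mismatches are hypothetical and not part of PyResumeParser.

```python
import sys
from importlib import metadata

# Pins copied from the Requirements list above.
PINNED = {
    "spacy": "3.7.4",
    "pdfminer.six": "20231228",
    "spacy-transformers": "1.3.5",
    "tqdm": "4.66.4",
}


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins=PINNED, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    print("Python OK:", python_ok())
    for name, (expected, found) in find_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

The lookup parameter is there so the comparison can be exercised without the packages installed, for example by passing a function that returns fixed version strings.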
Contributing
Contributions are welcome! Please feel free to submit a Pull Request or open an issue on GitHub.
License
This project is licensed under the MIT License.
Author
Developed by Palash Khan. Feel free to reach out with any questions or feedback.
Happy parsing!
File details
Details for the file pyresumeparser-0.0.9.tar.gz.
File metadata
- Download URL: pyresumeparser-0.0.9.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66807f2048858be2441d353eec2eada860c21cc10cc58a24d19b037a286c5ba4 |
| MD5 | e5dc5156943b48bea3dba148a33f1030 |
| BLAKE2b-256 | b97b520196cc3c7c1a4e6dd8929e20a19489a83fb469234a55a127dd511877b9 |
File details
Details for the file pyresumeparser-0.0.9-py3-none-any.whl.
File metadata
- Download URL: pyresumeparser-0.0.9-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 616c1e867448ca301f842c04f32eebff3a7d1847769c846c34eed1dd929acecb |
| MD5 | 76f7caa533c3af5f51e6c88e356fe21f |
| BLAKE2b-256 | a1cf7133d056ed966793c4b9d4ebbed1b8602cfc26da2a2858986e752ef8fbcf |