A simple package for training on and classifying resumes.
Resume Classification
Objective
The aim of this project is to train a machine learning model on a set of resumes from specific domains and use it to predict the domain of unseen resumes.
Currently, the model is trained using logistic regression on these four domains:
- Java
- Cloud
- Big Data
- Machine Learning
Resumes are read using the tika package, which supports many file formats, including these popular ones:
- doc
- docx
- pdf
NOTE: To learn more about tika, visit https://pypi.org/project/tika/
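As a rough illustration of the reading step, the sketch below filters files by the extensions listed above and extracts their text with tika's `parser.from_file`. The helper names `is_supported` and `extract_text` and the constant `SUPPORTED_EXTENSIONS` are hypothetical and not part of resume_classification. How the package itself calls tika is not documented here, so treat this as one possible approach.

```python
from pathlib import Path

# Extensions named in the list above; tika itself handles many more formats.
SUPPORTED_EXTENSIONS = {".doc", ".docx", ".pdf"}


def is_supported(path):
    """Return True if the file extension is one of the formats listed above."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_text(path):
    """Extract plain text from one resume using tika (requires Java at runtime)."""
    # Imported lazily so that is_supported() works even without tika installed.
    from tika import parser

    parsed = parser.from_file(str(path))
    # "content" can be None when tika finds no extractable text.
    return (parsed.get("content") or "").strip()
```

A caller would typically check `is_supported(path)` first and only then call `extract_text(path)`, since the first call to tika starts a local Java server and can be slow.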
Installation
pip install resume_classification
Dependencies
- numpy==1.17.3
- pandas==0.25.1
- tika==1.24
- nltk==3.4.5
Python version
`Python 3.7.4`
Project Guidelines
Train
To train on a new set of resumes, the project must follow the defined folder structure given below:
Then run the following commands:
from resume_classification import train
train("path/to/resumes-folder")  # placeholder: replace with the path to your resumes folder
Output
The output will consist of the following metrics:
- Model Accuracy
- F1 Score
- Confusion Matrix
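To make these three metrics concrete, here is a minimal pure-Python sketch that computes them for a handful of made-up labels. The sample data is invented for illustration. The package does not state which F1 averaging it reports, so the macro average used here is an assumption.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true domains, columns are predicted domains."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of resumes on the diagonal (predicted domain == true domain)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of per-domain F1 scores (assumed averaging scheme)."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fp = sum(row[i] for row in matrix) - tp  # predicted i, actually another domain
        fn = sum(matrix[i]) - tp                 # actually i, predicted another domain
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


labels = ["Java", "Cloud", "Big Data", "Machine Learning"]
# Invented example: one Java resume is misclassified as Cloud.
y_true = ["Java", "Java", "Cloud", "Big Data", "Machine Learning", "Cloud"]
y_pred = ["Java", "Cloud", "Cloud", "Big Data", "Machine Learning", "Cloud"]

matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)
print(accuracy(matrix), macro_f1(matrix))
```

Off-diagonal entries in the matrix show which domains are being confused with each other, which accuracy alone does not reveal.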
Predict
Use the following command:
from resume_classification import predict
predict("path/to/resumes")  # placeholder: replace with the path to the folder of unseen resumes
NOTE: For predict, the path must point to a single folder that contains all the unseen resumes.
Output
The output will be a dataframe consisting of the file name and the predicted domain.
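One common follow-up is to summarise how many resumes fall into each predicted domain. The sketch below works on plain (file name, predicted domain) pairs. The helper `count_by_domain` and the sample rows are hypothetical, and the dataframe's actual column names are not documented here.

```python
from collections import Counter


def count_by_domain(rows):
    """Count predictions per domain from (file name, predicted domain) pairs."""
    return Counter(domain for _file_name, domain in rows)


# Stand-in rows for illustration. With a real result, one could pass
# result.itertuples(index=False, name=None), assuming the dataframe holds
# exactly these two columns in this order.
rows = [
    ("alice.pdf", "Java"),
    ("bob.docx", "Cloud"),
    ("carol.doc", "Java"),
]

summary = count_by_domain(rows)
print(summary.most_common())
```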