A simple package for training a resume classifier and classifying resumes by domain.

Project description

Resume Classification

Objective

The aim of this project is to train a machine learning model on a set of resumes from specific domains and use it to predict the domain of unseen resumes.

Currently, the model is trained using logistic regression on these four domains:

  • Java
  • Cloud
  • Big Data
  • Machine Learning
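
As a rough illustration of what "logistic regression over four domains" means, here is a minimal pure-Python sketch of multinomial (softmax) logistic regression on bag-of-words counts. The package does not document its feature extraction or solver, so the whitespace tokenizer, the plain stochastic gradient descent, the toy training texts, and the names `tokenize`, `fit`, and `predict_domain` are all assumptions made for this example, not the package's implementation.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def build_vocab(docs):
    """Map every distinct word in the training texts to a feature index."""
    words = sorted({w for d in docs for w in tokenize(d)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Bag-of-words counts with a constant 1.0 appended as the bias feature."""
    vec = [0.0] * (len(vocab) + 1)
    for word, count in Counter(tokenize(text)).items():
        if word in vocab:
            vec[vocab[word]] = float(count)
    vec[-1] = 1.0
    return vec


def softmax(scores):
    """Turn raw class scores into probabilities (shifted for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit(docs, targets, labels, epochs=200, lr=0.1):
    """Train one weight row per label with stochastic gradient descent."""
    vocab = build_vocab(docs)
    X = [vectorize(d, vocab) for d in docs]
    W = [[0.0] * (len(vocab) + 1) for _ in labels]
    for _ in range(epochs):
        for x, target in zip(X, targets):
            probs = softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])
            for k, label in enumerate(labels):
                # Gradient of the cross-entropy loss for class k.
                err = probs[k] - (1.0 if label == target else 0.0)
                W[k] = [w - lr * err * xi for w, xi in zip(W[k], x)]
    return vocab, W


def predict_domain(text, vocab, W, labels):
    """Return the label whose weight row scores the text highest."""
    x = vectorize(text, vocab)
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return labels[scores.index(max(scores))]


# Toy data invented for illustration; real resumes are far longer.
labels = ["Java", "Cloud", "Big Data", "Machine Learning"]
docs = [
    "spring hibernate jvm servlet",
    "aws azure kubernetes terraform",
    "hadoop spark hive kafka",
    "regression neural networks classification",
]
vocab, W = fit(docs, labels, labels)
print(predict_domain("spark and hive pipelines", vocab, W, labels))
```

Words that never appeared in training (such as "pipelines" here) are simply ignored, so the prediction rests only on vocabulary the model has seen.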

Resumes are read using the tika package, which supports many file formats, including these popular ones:

  • doc
  • docx
  • pdf

NOTE: To learn more about tika, see https://pypi.org/project/tika/
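
For readers unfamiliar with tika, the sketch below shows one way text can be pulled out of a resume file with its `parser.from_file` call. How resume_classification calls tika internally is not documented here, so the helper names `is_supported` and `extract_text` and the extension filter are assumptions for illustration only. Note that tika needs a Java runtime, because it starts a local Tika server on first use.

```python
from pathlib import Path

# Extensions named in the list above; tika itself supports many more.
SUPPORTED_EXTENSIONS = {".doc", ".docx", ".pdf"}


def is_supported(path):
    """Return True if the file extension is one of the formats listed above."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def extract_text(path):
    """Extract plain text from a resume file with tika.

    tika is imported lazily so that the extension check above can be
    used without a Java runtime installed.
    """
    from tika import parser  # requires `pip install tika` and Java

    parsed = parser.from_file(str(path))
    # "content" can be None when no text could be extracted.
    return (parsed.get("content") or "").strip()
```

A call such as `extract_text("resumes/sample.pdf")` (a hypothetical path) would then return the resume body as a single string, ready for tokenizing.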

Installation

pip install resume_classification

Dependencies

  • numpy==1.17.3
  • pandas==0.25.1
  • tika==1.24
  • nltk==3.4.5

Python version

`Python 3.7.4`

Project Guidelines

  • Train

    To train on a new set of resumes, arrange them in the folder structure the package expects.

    Then run the following commands:

    from resume_classification import train

    train("path/to/resumes-folder")

    Output

    The output consists of the following metrics:

    • Model Accuracy
    • F1 Score
    • Confusion Matrix
  • Predict

    Use the following command:

    from resume_classification import predict

    predict("path/to/resumes")

    NOTE: For the predict module, the path must point to a single folder that contains all the unseen resumes.

    Output

    The output is a dataframe containing each file name and its predicted domain.
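
To make the three metrics reported by the Train step concrete, here is a small pure-Python sketch that computes model accuracy, an F1 score, and a confusion matrix from true and predicted domains. The package does not say which F1 averaging it reports, so the macro average used here is an assumption, and the label lists are invented sample data rather than real results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(y_true, y_pred):
    """Fraction of resumes whose predicted domain matches the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented sample data: one Java resume is misclassified as Cloud.
labels = ["Java", "Cloud", "Big Data", "Machine Learning"]
y_true = ["Java", "Java", "Cloud", "Big Data", "Machine Learning", "Cloud"]
y_pred = ["Java", "Cloud", "Cloud", "Big Data", "Machine Learning", "Cloud"]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred, labels))
for row in confusion_matrix(y_true, y_pred, labels):
    print(row)
```

In the confusion matrix, the off-diagonal entry in the Java row shows which domain the misclassified resume was assigned to, which accuracy alone cannot reveal.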

This version

1.0
