Fast extraction of job titles from strings

find_job_titles

Find Job Titles in Strings

  • Free software: MIT license

  • Python versions: 2.7, 3.4+

Features

  • Find any of 77k job titles in a given string

  • Text processing is fast, using an Aho-Corasick library: “pyahocorasick” (the default since 0.6.0) or “acora”

  • Dictionary generation takes about 20 seconds upfront

Quickstart

Instantiate “Finder” and start extracting job titles:

>>> from find_job_titles import Finder
>>> finder = Finder()
>>> finder.findall('I am the Senior Vice President')
[('Senior Vice President', 9),
 ('Vice President', 16),
 ('President', 21)]

All possible matches are returned, including overlapping ones. Each match carries the character offset at which it starts in the input string.
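If only the longest title at each position is wanted, overlapping results can be filtered after the fact. The History section mentions parameters for enabling or disabling longest matches, but their names are not shown here, so the following is a standalone sketch rather than the library's own mechanism. The helper name `keep_longest` is hypothetical, and the input is assumed to be `(title, start)` tuples as in the output above.

```python
def keep_longest(matches):
    """Drop any match whose span lies entirely inside another match's span.

    `matches` is an iterable of (title, start) tuples (assumed format).
    """
    # Convert each match to a half-open span [start, end).
    spans = [(start, start + len(title), title) for title, start in matches]
    kept = []
    for start, end, title in spans:
        # A match is redundant if a different, wider span fully covers it.
        contained = any(
            o_start <= start and end <= o_end and (o_start, o_end) != (start, end)
            for o_start, o_end, _ in spans
        )
        if not contained:
            kept.append((title, start))
    return kept


matches = [
    ("Senior Vice President", 9),
    ("Vice President", 16),
    ("President", 21),
]
print(keep_longest(matches))  # [('Senior Vice President', 9)]
```

The comparison is quadratic in the number of matches, which should be acceptable for the handful of matches found in a typical string.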

Alternatively use “finditer” for lazy consumption of matches:

>>> finder.finditer('I am the Senior Vice President')
<generator object ...>
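Lazy consumption means matches are produced only as they are requested, so a caller can stop early. The sketch below illustrates this with a hypothetical stand-in generator, `stand_in_finditer`, in place of `finder.finditer(...)`; it assumes the generator yields the same `(title, start)` tuples that `findall` returns, which is not confirmed here.

```python
from itertools import islice


def first_match(match_iter):
    """Return the first match from an iterable, or None if there is none."""
    return next(iter(match_iter), None)


def stand_in_finditer():
    # Hypothetical stand-in for finder.finditer(...); yields the tuples
    # shown in the findall example (assumed format and order).
    yield ("Senior Vice President", 9)
    yield ("Vice President", 16)
    yield ("President", 21)


# Only the first match is produced; the rest of the generator never runs.
print(first_match(stand_in_finditer()))

# At most two matches are produced.
print(list(islice(stand_in_finditer(), 2)))
```

With the real library, `stand_in_finditer()` would be replaced by a call such as `finder.finditer(text)`.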

Credits

This package was created with Cookiecutter and the fluquid/cookiecutter-pypackage project template.

History

0.7.0 (2017-08-22)

  • fixed tox tests for py27 re: different unicode treatment by acora and pyahocorasick

  • only testing default Finder using pyahocorasick now.

0.6.0 (2017-08-22)

  • rewrote and fixed longest match code

  • added pyahocorasick implementation and made default

  • added params to enable/disable longest matches

0.5.0 (2017-08-22)

0.4.0 (2017-08-21)

  • updated title list with marketing execs

  • set non-dev version

0.3.0-dev (2017-08-18)

  • updated title list (- surnames, - blacklist, + added_roles)

0.2.0-dev (2017-08-18)

  • proper tracking of code with releases

0.1.0 (unreleased)

  • First release on PyPI.

