Fast extraction of job titles from strings
Find Job Titles in Strings
Free software: MIT license
Python versions: 2.7, 3.4+
Find any of 77k job titles in a given string
Text processing is extremely fast using the “acora” library
Dictionary generation takes about 20 seconds upfront
Instantiate “Finder” and start extracting job titles:
>>> from find_job_titles import Finder
>>> finder = Finder()
>>> finder.findall('I am the Senior Vice President')
[('Senior Vice President', 9), ('Vice President', 16), ('President', 21)]
All possible, overlapping matches are returned. Matches contain positional information of where the match was found.
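To make "all possible, overlapping matches" with positions concrete, here is a minimal pure-Python sketch. It does not use this library or its internals: `TITLES` and `naive_findall` are hypothetical names, the three-entry list stands in for the real title dictionary, and the scan is a naive substring search rather than an Aho-Corasick automaton.

```python
# Hypothetical stand-in for the real title dictionary (assumption).
TITLES = ["Senior Vice President", "Vice President", "President"]


def naive_findall(text, titles=TITLES):
    """Return (title, start) for every occurrence, including overlaps."""
    matches = []
    for title in titles:
        start = text.find(title)
        while start != -1:
            matches.append((title, start))
            # Advance by one character so overlapping hits are not skipped.
            start = text.find(title, start + 1)
    # Order by start position, as in the example output above.
    return sorted(matches, key=lambda m: m[1])


print(naive_findall("I am the Senior Vice President"))
# [('Senior Vice President', 9), ('Vice President', 16), ('President', 21)]
```

Each shorter title is reported even though it sits inside a longer one, which is why a single phrase can produce several matches.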
Alternatively, use “finditer” for lazy consumption of matches:
>>> finder.finditer('I am the Senior Vice President')
<generator object ...>
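Lazy consumption means matches are produced one at a time, so a caller can stop early without scanning for the rest. The sketch below illustrates the idea with a plain generator. It is not the library's implementation: `TITLES` and `naive_finditer` are hypothetical names and the title list is a three-entry stand-in.

```python
from itertools import islice

# Hypothetical stand-in for the real title dictionary (assumption).
TITLES = ["Senior Vice President", "Vice President", "President"]


def naive_finditer(text, titles=TITLES):
    """Yield (title, start) lazily, scanning the text left to right."""
    for start in range(len(text)):
        for title in titles:
            if text.startswith(title, start):
                yield (title, start)


matches = naive_finditer("I am the Senior Vice President")

# Pull only the first two matches; the generator pauses after them.
first_two = list(islice(matches, 2))
print(first_two)
# [('Senior Vice President', 9), ('Vice President', 16)]
```

Calling `next(matches)` afterwards resumes the scan where it stopped and yields the remaining match.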
fixed tox tests for py27 re: different unicode treatment by acora and pyahocorasick
only testing default Finder using pyahocorasick now.
rewrote and fixed longest match code
added pyahocorasick implementation and made default
added params to enable/disable longest matches
updated title list with marketing execs
set non-dev version
updated title list (- surnames, - blacklist, + added_roles)
proper tracking of code with releases
First release on PyPI.