Fast extraction of job titles from strings
Find Job Titles in Strings
- Free software: MIT license
- Python versions: 2.7, 3.4+
- Find any of 77k job titles in a given string
- Text processing is extremely fast using the “acora” library
- Dictionary generation takes about 20 seconds upfront
Instantiate “Finder” and start extracting job titles:
>>> from find_job_titles import Finder
>>> finder = Finder()
>>> finder.findall('I am the Senior Vice President')
[('Senior Vice President', 9), ('Vice President', 16), ('President', 21)]
All possible matches are returned, including overlapping ones. Each match carries the position in the string at which it was found.
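Because overlapping matches come back together, a caller may want to turn the start offsets into full spans or keep only the outermost title. The following is a minimal sketch that works on plain `(title, start)` tuples like the ones shown above. The helpers `to_spans` and `longest_only` are hypothetical names introduced for illustration and are not part of the library.

```python
def to_spans(matches):
    """Turn (title, start) pairs into (start, end, title), with end exclusive."""
    return [(start, start + len(title), title) for title, start in matches]


def longest_only(matches):
    """Keep only matches whose span is not fully inside another match's span."""
    spans = to_spans(matches)
    kept = []
    for start, end, title in spans:
        contained = any(
            other_start <= start and end <= other_end
            and (other_start, other_end) != (start, end)
            for other_start, other_end, _ in spans
        )
        if not contained:
            kept.append((title, start))
    return kept


# Sample data copied from the findall() example above.
text = 'I am the Senior Vice President'
matches = [('Senior Vice President', 9), ('Vice President', 16), ('President', 21)]

# Each span slices the matched title back out of the original string.
for start, end, title in to_spans(matches):
    assert text[start:end] == title

print(longest_only(matches))
```

With the sample data, only `('Senior Vice President', 9)` survives, since the other two matches lie inside its span.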
Alternatively use “finditer” for lazy consumption of matches:
>>> finder.finditer('I am the Senior Vice President')
<generator object ...>
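Lazy consumption means matches are produced one at a time, so a caller can stop early without scanning the rest of the input. The sketch below illustrates the pattern with a stand-in generator, `fake_finditer`, which is a hypothetical helper using naive string scanning. It is not the library's implementation and only mimics the `(title, start)` shape shown above.

```python
from itertools import islice


def fake_finditer(text, titles):
    """Hypothetical stand-in: lazily yield (title, start) for each title found in text."""
    for start in range(len(text)):
        for title in titles:
            if text.startswith(title, start):
                yield (title, start)


titles = ['Senior Vice President', 'Vice President', 'President']
found = fake_finditer('I am the Senior Vice President', titles)

first = next(found)                 # pull a single match, nothing more is computed yet
next_one = list(islice(found, 1))   # take at most one further match
remaining = list(found)             # exhaust whatever is left

print(first, next_one, remaining)
```

The same `next()` and `itertools.islice` calls apply to any generator, which is why they should also work on the object returned by `finditer`.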
- fixed tox tests for py27 re: different unicode treatment by acora and pyahocorasick
- only testing default Finder using pyahocorasick now.
- rewrote and fixed longest match code
- added pyahocorasick implementation and made default
- added params to enable/disable longest matches
- updated title list with marketing execs
- set non-dev version
- updated title list (- surnames, - blacklist, + added_roles)
- proper tracking of code with releases
- First release on PyPI.