Python package to parse job bulletins and generate a structured file
Project description
PyCoLA: Python module to parse job bulletins and generate a structured file
Kaggle : Data Science for Good Challenge
https://www.kaggle.com/shivamb/1-bulletin-structuring-engine-cola
Installation
pip install pycola
Usage
The Extractor class generates the structured CSV file. It accepts a single config dictionary with two keys:
"input_path" : path of the bulletin text files
"output_filename" : name of the output file
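The two keys above can be assembled and sanity-checked before they are handed to the Extractor. The following is a minimal sketch, not part of pycola: the helper name build_config is hypothetical, and the check for a .txt extension is an assumption about how the bulletin text files are stored.

```python
from pathlib import Path


def build_config(input_path, output_filename="structured_file.csv"):
    """Return a config dict with the "input_path" and "output_filename" keys.

    Hypothetical helper, not part of pycola. It only verifies that the
    input directory exists and holds at least one .txt file.
    """
    folder = Path(input_path)
    if not folder.is_dir():
        raise FileNotFoundError(f"input_path is not a directory: {input_path}")
    # Assumption: the bulletin text files use the .txt extension.
    if not any(folder.glob("*.txt")):
        raise ValueError(f"no .txt bulletin files found in: {input_path}")
    # The path is passed through unchanged, including any trailing slash.
    return {"input_path": input_path, "output_filename": output_filename}
```

Failing early here gives a clearer error than a run that finds nothing to parse.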
from pycola.bulletin_parser import Extractor
## define the input path and the output file name
config = {
    "input_path": "Bulletins/",
    "output_filename": "structured_file.csv"
}
## create the Extractor Class object
extr = Extractor(config)
## call the extraction function
extr.extraction()
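Once extraction has finished, the output file can be inspected with the standard library. This is a minimal sketch, assuming the file named in "output_filename" is a comma-separated file with a header row, encoded as UTF-8. The helper summarize_csv is hypothetical and not part of pycola; the actual column names depend on what the package writes.

```python
import csv


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # The first row is treated as the header; an empty file gives [].
        header = next(reader, [])
        # Every remaining row counts as one parsed bulletin record.
        row_count = sum(1 for _ in reader)
    return header, row_count
```

Calling summarize_csv("structured_file.csv") after the run above would return the column names and the number of data rows, which is a quick way to confirm that the bulletins were picked up.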
Documentation
http://www.shivambansal.com/blog/network/cola/BulletinStructuringEngine.html
By Shivam Bansal
Source Distribution: pycola-0.1.60.tar.gz (16.1 kB)