Python package to parse job bulletins and generate a structured file
Project description
PyCoLA: Python module to parse job bulletins and generate a structured file
Kaggle : Data Science for Good Challenge
https://www.kaggle.com/shivamb/1-bulletin-structuring-engine-cola
Installation
pip install pycola
Usage
The Extractor class generates the structured CSV file. It accepts a single config dictionary with two keys:
"input_path" : path of the bulletin text files
"output_filename" : name of the output file
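Before constructing the Extractor, it can help to check that the config points at a real folder of bulletins. The following is a minimal sketch, not part of pycola: `build_config` is a hypothetical helper, and the `.txt` extension is an assumption based on the bulletins being described as text files.

```python
from pathlib import Path


def build_config(input_path, output_filename):
    """Return a config dict with the two keys described above.

    Hypothetical helper, not part of pycola. It only checks that the
    bulletin folder exists and holds at least one .txt file (the .txt
    extension is an assumption).
    """
    folder = Path(input_path)
    if not folder.is_dir():
        raise FileNotFoundError(f"bulletin folder not found: {folder}")
    if not any(folder.glob("*.txt")):
        raise ValueError(f"no .txt bulletins found in: {folder}")
    # Keep a trailing slash, mirroring the "Bulletins/" form of input_path.
    return {
        "input_path": folder.as_posix() + "/",
        "output_filename": output_filename,
    }
```

The returned dictionary has the same shape as a hand-written config, so it can be passed to `Extractor` unchanged.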
from pycola.bulletin_parser import Extractor
## define the input path and the output filename
config = {
"input_path" : "Bulletins/",
"output_filename" : "structured_file.csv"
}
## create the Extractor Class object
extr = Extractor(config)
## call the extraction function
extr.extraction()
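Once the extraction has run, the generated file can be inspected with the standard library alone. This is a rough sketch under two assumptions that the package description does not confirm: that the CSV lands at the path given in `output_filename`, and that it is UTF-8 encoded. `preview_csv` is a hypothetical helper name, and no particular column names are assumed.

```python
import csv


def preview_csv(path, n=3):
    """Return the header row and the first n data rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])  # empty list if the file has no rows
        rows = [row for _, row in zip(range(n), reader)]
    return header, rows
```

For example, `header, rows = preview_csv("structured_file.csv")` gives a quick look at which fields were extracted before loading the whole file into another tool.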
Documentation
http://www.shivambansal.com/blog/network/cola/BulletinStructuringEngine.html
By Shivam Bansal
Download files
Source Distribution
pycola-0.1.60.tar.gz (16.1 kB)
File details
Details for the file pycola-0.1.60.tar.gz.
File metadata
- Download URL: pycola-0.1.60.tar.gz
- Upload date:
- Size: 16.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.22.0 setuptools/40.0.0 requests-toolbelt/0.8.0 tqdm/4.24.0 CPython/3.7.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d1e5727429a4f65136105b601083e1b95dec9a3c1ad7fb6989de90438ce38c39 |
| MD5 | 009f19ae9c30ba0fb8b7a033693bde52 |
| BLAKE2b-256 | 2e39e5f09168f67345068c4d7c2224d9b90de8bdbf12bfc98d6317d0e634e499 |