Python package for keyphrase labeling.
Kleis: Python package for keyphrase extraction
Kleis is a Python package to label keyphrases in scientific text. It is named after the ancient Greek word κλείς.
Install
Pip (Easy and quick)
$ pip install kleis-keyphrase-extraction
Make your own wheel
$ git clone https://github.com/sdhdez/kleis-keyphrase-extraction.git
$ cd kleis-keyphrase-extraction/
$ python setup.py sdist bdist_wheel
$ pip install dist/kleis_keyphrase_extraction-0.1.X.devX-py3-none-any.whl
Replace X with the corresponding values.
Note: A wheel built this way doesn't include the pre-trained models, so you should download the corpus to let the package train its own.
Usage
Example here
Datasets
The package already includes some pre-trained models, but if you want to test on your own you should download the datasets.
Download the data from SemEval 2017 Task 10 and decompress it in "~/kleis_data/corpus/semeval2017-task10" or "./kleis_data/corpus/semeval2017-task10"
$ ls ~/kleis_data/corpus/semeval2017-task10
brat_config eval.py __MACOSX README_data.md scienceie2017_test_unlabelled train2 xml_utils.py
dev eval_py27.py README_data_dev.md README.md semeval_articles_test util.py zips
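Before training, it can help to confirm that the corpus sits in one of the two locations mentioned above. The following is a minimal sketch using only the standard library; the helper name `find_corpus_dir` and the search order (home directory first, then the working directory) are assumptions for illustration, not part of the Kleis API.

```python
from pathlib import Path

# The two locations named in this section. The order in which they are
# checked is an assumption made for this sketch.
DEFAULT_CANDIDATES = [
    Path("~/kleis_data/corpus/semeval2017-task10"),
    Path("./kleis_data/corpus/semeval2017-task10"),
]


def find_corpus_dir(candidates=None):
    """Return the first candidate that is an existing directory, else None."""
    if candidates is None:
        candidates = DEFAULT_CANDIDATES
    for candidate in candidates:
        path = Path(candidate).expanduser()  # resolve a leading "~"
        if path.is_dir():
            return path
    return None


corpus_dir = find_corpus_dir()
if corpus_dir is None:
    print("SemEval 2017 Task 10 corpus not found in the expected locations.")
else:
    print("Corpus found at", corpus_dir)
```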
Test
You can test your installation with keyphrase-extraction-example.py
$ python keyphrase-extraction-example.py
Also, see here for another example.
Requirements
- Python 3 (Tested: 3.6.5)
- nltk (with corpus) (Tested: 3.2.5)
- python-crfsuite (Tested: 0.9.5)
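To compare an environment against the versions listed above, a short check along these lines may be useful. It is a sketch that relies on `importlib.metadata`, which requires Python 3.8 or later (newer than the 3.6.5 the package was tested with); the helper name `installed_versions` is hypothetical, and the check does not verify that the NLTK corpus data is present.

```python
import sys
from importlib import metadata

# Versions reported as tested in the requirements list above.
TESTED = {"nltk": "3.2.5", "python-crfsuite": "0.9.5"}


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python", sys.version.split()[0], "(tested: 3.6.5)")
for name, version in installed_versions(TESTED).items():
    status = version if version is not None else "not installed"
    print(f"{name}: {status} (tested: {TESTED[name]})")
```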
Optional
Notebooks
To run the notebooks in this repository, install JupyterLab.
$ pip install jupyterlab
Then run the following command.
$ jupyter lab
Further information
This method uses a CRF (Conditional Random Fields) model to label keyphrases in text. The model is trained with keyphrase candidates filtered by Part-of-Speech tag sequences. It is based on the method described here, but with better performance. Please feel free to send us comments or questions.
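To illustrate what filtering candidates by Part-of-Speech tag sequences can look like, here is a minimal sketch in plain Python. The tag pattern (optional adjectives followed by one or more nouns), the function name `find_candidates`, and the hand-tagged sentence are all assumptions for illustration; the patterns Kleis actually uses are not listed in this description.

```python
import re

# Hypothetical pattern over Penn Treebank tags: zero or more adjectives (JJ)
# followed by one or more nouns (NN, NNS, NNP, NNPS). The lookbehind makes
# sure a match starts at a tag boundary.
CANDIDATE_PATTERN = re.compile(r"(?<![^ ])(?:JJ )*(?:NN[SP]* )+")


def find_candidates(tagged_tokens):
    """Return (start, end) token spans whose tag sequence matches the pattern."""
    # Encode the tags as one string, with exactly one trailing space per tag.
    tags = "".join(tag + " " for _, tag in tagged_tokens)
    spans = []
    for match in CANDIDATE_PATTERN.finditer(tags):
        # Each tag contributes one space, so counting spaces before an
        # offset converts a character position into a token index.
        start = tags[:match.start()].count(" ")
        end = tags[:match.end()].count(" ")
        spans.append((start, end))
    return spans


# Hand-tagged example sentence (in practice a tagger such as NLTK's would
# supply the tags).
tagged = [
    ("We", "PRP"), ("train", "VBP"), ("conditional", "JJ"), ("random", "JJ"),
    ("fields", "NNS"), ("on", "IN"), ("scientific", "JJ"), ("text", "NN"),
    (".", "."),
]

spans = find_candidates(tagged)
phrases = [" ".join(word for word, _ in tagged[s:e]) for s, e in spans]
print(phrases)
```

Only the spans that pass such a filter would be offered to the model as keyphrase candidates, which keeps verbs, prepositions, and punctuation out of the training examples.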
In this version we use python-crfsuite.
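python-crfsuite can take each token of a training sequence as a list of feature strings, paired with one label per token. The sketch below shows one plausible way to build such feature lists from POS-tagged tokens; the feature names and the helpers `token_features` and `sentence_features` are hypothetical and are not taken from the Kleis source.

```python
def token_features(tagged_tokens, i):
    """Build a list of feature strings for the token at position i."""
    word, tag = tagged_tokens[i]
    features = [
        "bias",
        "word.lower=" + word.lower(),
        "pos=" + tag,
        "word.istitle=" + str(word.istitle()),
    ]
    # Context features: POS tag of the previous and next token, or
    # sentence-boundary markers at the edges.
    if i > 0:
        features.append("prev.pos=" + tagged_tokens[i - 1][1])
    else:
        features.append("BOS")
    if i < len(tagged_tokens) - 1:
        features.append("next.pos=" + tagged_tokens[i + 1][1])
    else:
        features.append("EOS")
    return features


def sentence_features(tagged_tokens):
    """Build the feature sequence for a whole sentence."""
    return [token_features(tagged_tokens, i) for i in range(len(tagged_tokens))]


example = [("Keyphrase", "NN"), ("extraction", "NN")]
for features in sentence_features(example):
    print(features)
```

A sequence built this way, together with a matching list of labels, is the kind of input that `pycrfsuite.Trainer.append(xseq, yseq)` expects.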
Hashes for kleis-keyphrase-extraction-0.1.2.dev0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 7514e2183513d21c232849035eba16557516ff34ee40cc403a328152ae0f6421
MD5 | bd1cdda53548ec5672f31f49b9cc3ee3
BLAKE2b-256 | b83686596b18dff33a6bf865cba35d17e50e92724595bfa7b45b78cbf843eb69
Hashes for kleis_keyphrase_extraction-0.1.2.dev0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 6ff964099ed19ec779e7fbf4f1d5490cc3864a4e39e6dac50ce25eb2f64a3e42
MD5 | 7f3cbefea98a82d382dae870e05870bc
BLAKE2b-256 | cdc1dc8fd3bce4f62c4d9bc36f041725058356d39673fd8dbd8c3ba6b2e39569