Python package for keyphrase labeling.
Project description
Kleis: Python package for keyphrase extraction
Kleis is a Python package to label keyphrases in scientific text. It is named after the ancient Greek word κλείς.
Install
Pip
$ pip install kleis-keyphrase-extraction
Make your own wheel
$ git clone https://github.com/sdhdez/kleis-keyphrase-extraction.git
$ cd kleis-keyphrase-extraction/
$ python setup.py sdist bdist_wheel
$ pip install dist/kleis_keyphrase_extraction-0.1.X.devX-py3-none-any.whl
Replace X with the corresponding values.
Note: This method doesn't include pre-trained models, you should
Usage
Example here
Datasets
The package already includes some pre-trained models, but if you want to run your own tests you should download the datasets.
Download the data from SemEval 2017 Task 10 and decompress it into "~/kleis_data/corpus/semeval2017-task10" or "./kleis_data/corpus/semeval2017-task10"
$ ls ~/kleis_data/corpus/semeval2017-task10
brat_config eval.py __MACOSX README_data.md scienceie2017_test_unlabelled train2 xml_utils.py
dev eval_py27.py README_data_dev.md README.md semeval_articles_test util.py zips
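To confirm that the corpus was unpacked into one of the two locations listed above, a small helper can probe both. This is a minimal sketch: the name `find_semeval_corpus` is hypothetical and not part of kleis, and the lookup order (home directory first, then the current directory) is an assumption.

```python
from pathlib import Path

# Relative location taken from the two paths listed above.
CORPUS_SUBDIR = Path("kleis_data") / "corpus" / "semeval2017-task10"


def find_semeval_corpus(home=None, cwd=None):
    """Return the first existing corpus directory, or None if neither exists.

    Hypothetical helper, not part of kleis. Checking the home directory
    before the current directory is an assumption.
    """
    home = Path(home) if home is not None else Path.home()
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    for base in (home, cwd):
        candidate = base / CORPUS_SUBDIR
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    corpus = find_semeval_corpus()
    if corpus is None:
        print("SemEval 2017 Task 10 corpus not found")
    else:
        print("Corpus found at", corpus)
```

The `home` and `cwd` parameters exist only so the lookup can be exercised against temporary directories.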
Test
You can test your installation with keyphrase-extraction-example.py
$ python keyphrase-extraction-example.py
Also, see here for another example.
Requirements
- Python 3 (Tested: 3.6.5)
- nltk (with corpus) (Tested: 3.2.5)
- python-crfsuite (Tested: 0.9.5)
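The installed versions of these requirements can be compared with the tested ones using the standard library. This is a sketch, not part of kleis: `installed_versions` is a hypothetical helper, it relies on `importlib.metadata` (available from Python 3.8, newer than the 3.6.5 tested above), and it does not check whether the nltk corpus data is present.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Distribution names as listed in the requirements above.
REQUIRED = ("nltk", "python-crfsuite")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(name, version or "not installed")
```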
Optional
Notebooks
To run the notebooks in this repository, install JupyterLab.
$ pip install jupyterlab
Then run the following command.
$ jupyter lab
Further information
This method uses a CRF (Conditional Random Fields) model to label keyphrases in text. The model is trained with keyphrase candidates filtered by Part-of-Speech tag sequences. It is based on the method described here, but with better performance. Please feel free to send us comments or questions.
In this version we use python-crfsuite.
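The candidate filtering step described above can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the kleis implementation: the allowed tag sequences are hypothetical, the function names are invented for illustration, and the B/I/O label encoding is an assumption about how candidates might be turned into CRF training targets.

```python
# Hypothetical Part-of-Speech tag sequences, for illustration only.
ALLOWED_TAG_SEQUENCES = {
    ("NN",),
    ("JJ", "NN"),
    ("NN", "NN"),
    ("JJ", "NN", "NN"),
}


def candidate_spans(tagged_tokens, allowed=ALLOWED_TAG_SEQUENCES):
    """Return (start, end) spans whose tag sequence is in `allowed`.

    `tagged_tokens` is a list of (token, tag) pairs; `end` is exclusive.
    """
    tags = [tag for _, tag in tagged_tokens]
    max_len = max(len(seq) for seq in allowed)
    spans = []
    for start in range(len(tags)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tags):
                break
            if tuple(tags[start:end]) in allowed:
                spans.append((start, end))
    return spans


def bio_labels(n_tokens, spans):
    """Encode non-overlapping spans as B/I/O labels (assumed encoding)."""
    labels = ["O"] * n_tokens
    for start, end in spans:
        labels[start] = "B"
        for i in range(start + 1, end):
            labels[i] = "I"
    return labels


sentence = [
    ("the", "DT"),
    ("neural", "JJ"),
    ("network", "NN"),
    ("learns", "VBZ"),
    ("features", "NNS"),
]

if __name__ == "__main__":
    print(candidate_spans(sentence))     # candidate spans, may overlap
    print(bio_labels(len(sentence), [(1, 3)]))
```

Candidate spans may overlap, so only one of them (here "neural network") is passed to `bio_labels`. A real pipeline would feed such labels, together with per-token features, to python-crfsuite.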
Hashes for kleis-keyphrase-extraction-0.1.1.dev0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 852996ee55467e9c57e855cd938f72a1c392e965110b5273ceb957f62ccf676a
MD5 | da02516edeb49ee08535d201241b315b
BLAKE2b-256 | 314bc02a36aaa6caffbeb363c4da1cf035d26f71f469855be048f5fc8620af5b
Hashes for kleis_keyphrase_extraction-0.1.1.dev0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 91299cc0608aeac6ec9136f376840ac2482c0e3cf653e30eab8a221238d4fb4c
MD5 | 1da343711eba1f591750757f78415c10
BLAKE2b-256 | 68c7bbfc914bf580d83bc88b9a49cd757c1e3b78d4e27723cd028da26081afc1