Python package for keyphrase labeling.
Kleis: Python package for keyphrase extraction
Kleis is a Python package for labeling keyphrases in scientific text. It is named after the ancient Greek word κλείς.
Install
Pip (Easy and quick)
$ pip install kleis-keyphrase-extraction
Make your own wheel
$ git clone https://github.com/sdhdez/kleis-keyphrase-extraction.git
$ cd kleis-keyphrase-extraction/
$ python setup.py sdist bdist_wheel
$ pip install dist/kleis_keyphrase_extraction-0.1.X.devX-py3-none-any.whl
Replace X with the corresponding values.
Note: This method doesn't include pre-trained models. Download the corpus (see Datasets) so the package can train them.
Usage
Example here
Datasets
The package already includes some pre-trained models, but if you want to test on your own you should download the datasets.
Download the data from SemEval 2017 Task 10 and decompress it into "~/kleis_data/corpus/semeval2017-task10" or "./kleis_data/corpus/semeval2017-task10".
$ ls ~/kleis_data/corpus/semeval2017-task10
brat_config eval.py __MACOSX README_data.md scienceie2017_test_unlabelled train2 xml_utils.py
dev eval_py27.py README_data_dev.md README.md semeval_articles_test util.py zips
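Before training, it can help to confirm that the corpus sits in one of the two locations above. The following is a minimal sketch, not part of the Kleis API: `find_corpus_dir` is a hypothetical helper, and checking the home directory before the current directory is an assumption about lookup order.

```python
from pathlib import Path

# Relative corpus path, taken from the two locations named above.
CORPUS_SUBDIR = Path("kleis_data") / "corpus" / "semeval2017-task10"


def find_corpus_dir(bases=None):
    """Return the first existing corpus directory, or None if there is none.

    Hypothetical helper (not Kleis API). `bases` defaults to the home
    directory followed by the current directory; that order is an assumption.
    """
    if bases is None:
        bases = [Path.home(), Path.cwd()]
    for base in bases:
        candidate = Path(base) / CORPUS_SUBDIR
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    corpus = find_corpus_dir()
    print(corpus if corpus else "SemEval 2017 Task 10 corpus not found")
```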
Test
You can test your installation with keyphrase-extraction-example.py
$ python keyphrase-extraction-example.py
Also, see here for another example.
Requirements
- Python 3 (Tested: 3.6.5)
- nltk (with corpus) (Tested: 3.2.5)
- python-crfsuite (Tested: 0.9.5)
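To compare an environment against the tested versions listed above, a short check like the one below may be useful. It is only a sketch: it relies on `importlib.metadata`, which is in the standard library from Python 3.8 onward (newer than the tested 3.6.5), and `installed_version` and `report` are illustrative names, not part of Kleis.

```python
from importlib import metadata  # standard library since Python 3.8

# Versions reported as tested above, keyed by PyPI distribution name.
TESTED = {"nltk": "3.2.5", "python-crfsuite": "0.9.5"}


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(tested=TESTED):
    """Build one status line per requirement."""
    lines = []
    for name, tested_version in tested.items():
        found = installed_version(name)
        status = found if found is not None else "not installed"
        lines.append(f"{name}: {status} (tested with {tested_version})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A version different from the tested one is not necessarily a problem; the check only makes the difference visible.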
Optional
Notebooks
To run the notebooks in this repository, install JupyterLab.
$ pip install jupyterlab
Then run the following command.
$ jupyter lab
Further information
This method uses a Conditional Random Fields (CRF) model to label keyphrases in text. The model is trained with keyphrase candidates filtered by Part-of-Speech tag sequences. It is based on the method described here, but with better performance. Please feel free to send us comments or questions.
In this version we use python-crfsuite.
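The candidate-filtering idea can be illustrated with a small, self-contained sketch. It is not the Kleis implementation: the tag patterns in `ALLOWED_TAG_SEQUENCES` are hypothetical, the function name `candidate_spans` is made up for illustration, and the sentence is hand-tagged with Penn Treebank tags so that no tagger is needed. In the real pipeline such candidates would then feed a CRF model; that step is omitted here.

```python
# Hypothetical Part-of-Speech patterns; the patterns Kleis actually uses
# are not listed in this document.
ALLOWED_TAG_SEQUENCES = {
    ("NN",),
    ("NNS",),
    ("JJ", "NN"),
    ("JJ", "NNS"),
    ("NN", "NN"),
    ("NN", "NNS"),
    ("JJ", "NN", "NN"),
}


def candidate_spans(tagged_tokens, allowed=ALLOWED_TAG_SEQUENCES):
    """Return (start, end, text) for each span whose tag sequence is allowed."""
    if not allowed:
        return []
    max_len = max(len(seq) for seq in allowed)
    spans = []
    for start in range(len(tagged_tokens)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tagged_tokens):
                break
            window = tagged_tokens[start:end]
            tags = tuple(tag for _, tag in window)
            if tags in allowed:
                text = " ".join(token for token, _ in window)
                spans.append((start, end, text))
    return spans


# Hand-tagged example sentence, so the sketch runs without a tagger.
sentence = [
    ("Conditional", "JJ"), ("random", "JJ"), ("fields", "NNS"),
    ("label", "VBP"), ("keyphrases", "NNS"),
]

if __name__ == "__main__":
    for start, end, text in candidate_spans(sentence):
        print(start, end, text)
```

Overlapping candidates (for example "random fields" and "fields") are kept on purpose in this sketch; deciding which of them are keyphrases is left to the trained model.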
Hashes for kleis-keyphrase-extraction-0.1.1.dev1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 5bff31df7a83fe5ae75055396ec74db8318c6a010e40a11f6d067641997cfef8
MD5 | 7831fca7914a49aa8f8a0895c125b1a8
BLAKE2b-256 | 5b6686124ea493fd222bde59ede8a49118ec75e9404b9233e9155fb2a387167a
Hashes for kleis_keyphrase_extraction-0.1.1.dev1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a693ae9e32eaf17ba8e49045d7fd628ffef0c35f3c619766e7ce98552f8c7187
MD5 | f0834a97e6dda022b677ef7713875243
BLAKE2b-256 | 5c0ca24f149cde09d9ed9c6a15a8ecc8a310d9e54c3e7f6169e71252d1863726
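A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard-library `hashlib` module; `sha256_of` and `matches_published_hash` are illustrative helper names, and the file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (the path is an assumption about where the file was saved):
# matches_published_hash(
#     "kleis-keyphrase-extraction-0.1.1.dev1.tar.gz",
#     "5bff31df7a83fe5ae75055396ec74db8318c6a010e40a11f6d067641997cfef8",
# )
```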