Python package for keyphrase labeling.
Kleis: Python package for keyphrase extraction
Kleis is a Python package to label keyphrases in scientific text. It is named after the Ancient Greek word κλείς ("key").
Install
Pip (Easy and quick)
$ pip install kleis-keyphrase-extraction
Make your own wheel
$ git clone https://github.com/sdhdez/kleis-keyphrase-extraction.git
$ cd kleis-keyphrase-extraction/
$ python setup.py sdist bdist_wheel
$ pip install dist/kleis_keyphrase_extraction-0.1.X.devX-py3-none-any.whl
Replace X with the corresponding values.
Note: This method doesn't include pre-trained models. Download the corpus (see Datasets) so that the package can train its own models.
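Rather than typing the version by hand, the wheel that was just built can be located programmatically. The following is a minimal sketch: `find_built_wheel` is a hypothetical helper, not part of kleis, and it assumes the wheel was written to `dist/` with the `kleis_keyphrase_extraction-` prefix shown above.

```python
from pathlib import Path


def find_built_wheel(dist_dir="dist", prefix="kleis_keyphrase_extraction-"):
    """Return the most recently modified matching wheel in dist_dir, or None."""
    wheels = sorted(
        Path(dist_dir).glob(prefix + "*.whl"),
        key=lambda path: path.stat().st_mtime,  # oldest first, newest last
    )
    return wheels[-1] if wheels else None


if __name__ == "__main__":
    wheel = find_built_wheel()
    if wheel is None:
        print("No wheel found; run 'python setup.py sdist bdist_wheel' first.")
    else:
        # Print the command instead of running it, so nothing is installed implicitly.
        print("pip install", wheel)
```

Run it from the root of the cloned repository after building; it only prints the install command.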
Usage
Example here
Datasets
The package already includes some pre-trained models, but if you want to run your own tests you should download the datasets.
Download the dataset from SemEval 2017 Task 10 and decompress it into "~/kleis_data/corpus/semeval2017-task10" or "./kleis_data/corpus/semeval2017-task10".
$ ls ~/kleis_data/corpus/semeval2017-task10
brat_config eval.py __MACOSX README_data.md scienceie2017_test_unlabelled train2 xml_utils.py
dev eval_py27.py README_data_dev.md README.md semeval_articles_test util.py zips
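To confirm that the corpus was decompressed into one of the two locations above, a small check such as the following may help. It is only a sketch: `find_corpus_dir` is a hypothetical helper, and the lookup order (home directory first, then the current directory) is an assumption, not necessarily the order kleis itself uses.

```python
from pathlib import Path

# The two locations named above; the order here is an assumption.
CANDIDATE_DIRS = (
    Path.home() / "kleis_data" / "corpus" / "semeval2017-task10",
    Path.cwd() / "kleis_data" / "corpus" / "semeval2017-task10",
)


def find_corpus_dir(candidates=CANDIDATE_DIRS):
    """Return the first candidate that is an existing directory, or None."""
    for candidate in candidates:
        candidate = Path(candidate)
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    corpus_dir = find_corpus_dir()
    if corpus_dir is None:
        print("SemEval 2017 Task 10 corpus not found in the expected locations.")
    else:
        print("Corpus found at", corpus_dir)
```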
Test
You can test your installation with keyphrase-extraction-example.py
$ python keyphrase-extraction-example.py
Also, see here for another example.
Requirements
- Python 3 (Tested: 3.6.5)
- nltk (with corpus) (Tested: 3.2.5)
- python-crfsuite (Tested: 0.9.5)
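The installed versions can be compared with the tested ones using the standard library. This is a sketch under two assumptions: it needs Python 3.8 or later (for `importlib.metadata`), and `installed_versions` is a hypothetical helper rather than part of kleis.

```python
from importlib import metadata

# Versions listed as tested in the requirements above.
TESTED_VERSIONS = {"nltk": "3.2.5", "python-crfsuite": "0.9.5"}


def installed_versions(packages=TESTED_VERSIONS):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        tested = TESTED_VERSIONS[name]
        print(f"{name}: installed={version}, tested={tested}")
```

A version that differs from the tested one is not necessarily a problem; the check only reports what is installed.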
Optional
Notebooks
To run the notebooks in this repository, install JupyterLab.
$ pip install jupyterlab
Then run the following command.
$ jupyter lab
Further information
This method uses a CRF (Conditional Random Fields) model to label keyphrases in text. The model is trained on keyphrase candidates filtered by Part-of-Speech tag sequences. It is based on the method described here, but performs better. Please feel free to send us comments or questions.
In this version we use python-crfsuite.
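The candidate-filtering idea can be illustrated without any dependencies. The sketch below is not the kleis implementation: the allowed tag sequences are hypothetical placeholders, and the BIO labeling scheme is an assumption about how filtered candidates might be encoded for a sequence labeler such as a CRF.

```python
# Hypothetical Part-of-Speech tag sequences; the patterns kleis uses are not listed here.
ALLOWED_POS_SEQUENCES = {
    ("NN",),
    ("JJ", "NN"),
    ("NN", "NN"),
    ("JJ", "NN", "NN"),
}


def candidate_spans(tagged_tokens, allowed=ALLOWED_POS_SEQUENCES):
    """Return (start, end) token spans whose tag sequence is in `allowed`."""
    tags = [tag for _, tag in tagged_tokens]
    max_len = max(len(sequence) for sequence in allowed)
    spans = []
    for start in range(len(tags)):
        for length in range(1, max_len + 1):
            end = start + length
            if end > len(tags):
                break
            if tuple(tags[start:end]) in allowed:
                spans.append((start, end))
    return spans


def bio_labels(num_tokens, keyphrase_spans):
    """Encode non-overlapping spans as B-/I-/O labels, one label per token."""
    labels = ["O"] * num_tokens
    for start, end in keyphrase_spans:
        labels[start] = "B-KEYPHRASE"
        for index in range(start + 1, end):
            labels[index] = "I-KEYPHRASE"
    return labels


tagged = [("We", "PRP"), ("train", "VBP"), ("a", "DT"), ("linear", "JJ"), ("model", "NN")]
print(candidate_spans(tagged))   # [(3, 5), (4, 5)]
print(bio_labels(5, [(3, 5)]))   # ['O', 'O', 'O', 'B-KEYPHRASE', 'I-KEYPHRASE']
```

In a full pipeline, the tagged tokens would come from a tagger such as the one in nltk, and the labeled sequences would be passed to python-crfsuite for training. Both steps are omitted here.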
Hashes for kleis-keyphrase-extraction-0.2a1.dev3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 20b29e7797372f980beec574f60b6dc8f64310e40bc016644ac9501012f2d970
MD5 | 55b29578a620416e72777ecdaa8f21b8
BLAKE2b-256 | be0548e6cad4857a51a8a2573637408d48c42f0ac77901679ef1727793445f52
Hashes for kleis_keyphrase_extraction-0.2a1.dev3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2f53814cf421ab715357011447ed69aec3ede112e3e5b7cad7b672fc301b0d1d
MD5 | 370f9edfe23bfa61cd19acbf84d60249
BLAKE2b-256 | 3b2b06feeeddb2f142189a89161314962078d68e268ba27c138c1f510e486390
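A downloaded file can be checked against the digests listed above with the standard `hashlib` module. This is a minimal sketch: `file_digest` and `matches_digest` are hypothetical helper names, and the file path in the usage note assumes the wheel was saved to the current directory.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex, algorithm="sha256"):
    """Compare the file's digest with an expected hex string, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches_digest("kleis_keyphrase_extraction-0.2a1.dev3-py3-none-any.whl", "2f53814cf421ab715357011447ed69aec3ede112e3e5b7cad7b672fc301b0d1d")` should return `True` for an intact copy of the wheel. Pass `algorithm="md5"` to check the MD5 digest instead.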