Python package for keyphrase labeling.
Kleis: Python package for keyphrase extraction
Kleis is a Python package to label keyphrases in scientific text. It is named after the ancient Greek word κλείς.
Install
Pip (Easy and quick)
$ pip install kleis-keyphrase-extraction
Make your own wheel
$ git clone https://github.com/sdhdez/kleis-keyphrase-extraction.git
$ cd kleis-keyphrase-extraction/
$ python setup.py sdist bdist_wheel
$ pip install dist/kleis_keyphrase_extraction-0.1.X.devX-py3-none-any.whl
Replace X with the corresponding values.
Note: A wheel built this way doesn't include pre-trained models. Download the corpus (see Datasets) so the package can train its own.
Usage
Example here
Datasets
The package already includes some pre-trained models, but if you want to train and test on your own you should download the datasets.
Download the dataset from SemEval 2017 Task 10 and decompress it into "~/kleis_data/corpus/semeval2017-task10" or "./kleis_data/corpus/semeval2017-task10"
$ ls ~/kleis_data/corpus/semeval2017-task10
brat_config eval.py __MACOSX README_data.md scienceie2017_test_unlabelled train2 xml_utils.py
dev eval_py27.py README_data_dev.md README.md semeval_articles_test util.py zips
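Before training, it can help to confirm that the corpus sits in one of the two locations named above. The following is a minimal sketch, not part of Kleis: the function name `find_corpus_dir` is hypothetical, and checking the home directory before the current directory is an assumption (the order the package itself uses is not stated here).

```python
from pathlib import Path

# Relative location of the corpus, as given in the Datasets section.
CORPUS_SUBDIR = Path("kleis_data") / "corpus" / "semeval2017-task10"


def find_corpus_dir(home=None, cwd=None):
    """Return the first existing corpus directory, or None if neither exists.

    Hypothetical helper: checks "~/kleis_data/..." first, then "./kleis_data/...".
    The search order is an assumption, not documented Kleis behavior.
    """
    home = Path(home) if home is not None else Path.home()
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    for base in (home, cwd):
        candidate = base / CORPUS_SUBDIR
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    found = find_corpus_dir()
    print(found if found else "SemEval 2017 Task 10 corpus not found")
```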
Test
You can test your installation with keyphrase-extraction-example.py
$ python keyphrase-extraction-example.py
Also, see here for another example.
Requirements
- Python 3 (Tested: 3.6.5)
- nltk (with corpus) (Tested: 3.2.5)
- python-crfsuite (Tested: 0.9.5)
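To compare an environment against the tested versions listed above, a short script along these lines may be useful. It is only a sketch: `installed_versions` is a hypothetical helper, and it relies on `importlib.metadata`, which needs Python 3.8 or newer (newer than the 3.6.5 the package was tested with).

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onwards

# Versions the README reports as tested.
TESTED = {"nltk": "3.2.5", "python-crfsuite": "0.9.5"}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in installed_versions(TESTED).items():
        status = version if version is not None else "not installed"
        print(f"{name}: {status} (tested: {TESTED[name]})")
```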
Optional
Notebooks
To run the notebooks in this repository, install JupyterLab.
$ pip install jupyterlab
Then run the following command.
$ jupyter lab
Further information
This method uses a CRF (Conditional Random Fields) model to label keyphrases in text. The model is trained with keyphrase candidates filtered by Part-of-Speech tag sequences. It is based on the method described here, but with better performance. Please feel free to send us comments or questions.
In this version we use python-crfsuite.
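The candidate-filtering idea described above can be illustrated with a small, self-contained sketch. It is not the Kleis implementation: the tag patterns in `ALLOWED_PATTERNS`, the function names, and the BIO label scheme are all assumptions chosen for illustration, and the input is already POS-tagged so that no tagger or CRF library is needed.

```python
# Hypothetical POS-tag sequences; the patterns Kleis actually uses are not listed here.
ALLOWED_PATTERNS = {
    ("NN",),
    ("JJ", "NN"),
    ("NN", "NN"),
    ("JJ", "NN", "NN"),
}


def candidate_spans(tagged_tokens, patterns=ALLOWED_PATTERNS):
    """Return sorted (start, end) token spans whose tag sequence matches a pattern."""
    tags = [tag for _, tag in tagged_tokens]
    spans = []
    for start in range(len(tags)):
        for pattern in patterns:
            end = start + len(pattern)
            # A slice running past the end is shorter than the pattern, so it cannot match.
            if tuple(tags[start:end]) == pattern:
                spans.append((start, end))
    return sorted(spans)


def bio_labels(n_tokens, keyphrase_spans):
    """Turn non-overlapping keyphrase spans into per-token BIO labels (assumed scheme)."""
    labels = ["O"] * n_tokens
    for start, end in keyphrase_spans:
        labels[start] = "B-KEYPHRASE"
        for i in range(start + 1, end):
            labels[i] = "I-KEYPHRASE"
    return labels


if __name__ == "__main__":
    sentence = [("We", "PRP"), ("train", "VBP"), ("a", "DT"),
                ("linear", "JJ"), ("model", "NN")]
    print(candidate_spans(sentence))           # candidate spans by POS pattern
    print(bio_labels(len(sentence), [(3, 5)]))  # labels if "linear model" is a keyphrase
```

In a sequence-labeling setup of this kind, per-token labels such as these would serve as the training targets for the CRF, with only the filtered candidates considered as possible keyphrases.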