Keyword extraction Python package
Project description
Yet Another Keyword Extractor (Yake)
Unsupervised Approach for Automatic Keyword Extraction using Text Features.
YAKE! is a light-weight unsupervised automatic keyword extraction method which rests on text statistical features extracted from single documents to select the most important keywords of a text. Our system does not need to be trained on a particular set of documents, neither it depends on dictionaries, external-corpus, size of the text, language or domain. To demonstrate the merits and the significance of our proposal, we compare it against ten state-of-the-art unsupervised approaches (TF.IDF, KP-Miner, RAKE, TextRank, SingleRank, ExpandRank, TopicRank, TopicalPageRank, PositionRank and MultipartiteRank), and one supervised method (KEA). Experimental results carried out on top of twenty datasets (see Benchmark section below) show that our methods significantly outperform state-of-the-art methods under a number of collections of different sizes, languages or domains. In addition to the python package here described, we also make available a demo, an API and a mobile app.
Main Features
- Unsupervised approach
- Corpus-Independent
- Domain and Language Independent
- Single-Document
Where can I find YAKE!?
YAKE! is available online [http://yake.inesctec.pt], as an open source Python package [https://github.com/LIAAD/yake] and on Google Play.
References
Please cite the following works when using YAKE
In-depth journal paper at Information Sciences Journal
Campos, R., Mangaravite, V., Pasquali, A., Jatowt, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). YAKE! Keyword Extraction from Single Documents using Multiple Local Features. In Information Sciences Journal. Elsevier, Vol 509, pp 257-289. pdf
ECIR'18 Best Short Paper
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 684 - 691. pdf
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). YAKE! Collection-independent Automatic Keyword Extractor. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 806 - 810. pdf
Awards
ECIR'18 Best Short Paper
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file yake-0.4.3.tar.gz
.
File metadata
- Download URL: yake-0.4.3.tar.gz
- Upload date:
- Size: 401.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | fc8ecda6902ba2703b6f80c448fc1c79884510ff45e4f7c89c5a88521c0a789d |
|
MD5 | 2e8548dad277b36f9744466ca2689ba2 |
|
BLAKE2b-256 | b8976f7725f96ec88703b0b4b97ee875bc40e6cb2d2559e20824c1af0537303e |
File details
Details for the file yake-0.4.3-py2.py3-none-any.whl
.
File metadata
- Download URL: yake-0.4.3-py2.py3-none-any.whl
- Upload date:
- Size: 59.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 9d64edd26269b38c3499dd9db6d931bd9b252cff6a332eb273c8541781769e6f |
|
MD5 | e5e48dc78e2259e18e43afbfc4bb0ebc |
|
BLAKE2b-256 | c3c9998809d67c7a6f1faba5e97b93f7deec5483a1e510443ca2892959041bb4 |