Keyword extraction Python package

Project description

Yet Another Keyword Extractor (Yake)

Unsupervised Approach for Automatic Keyword Extraction using Text Features.

YAKE! is a lightweight, unsupervised automatic keyword extraction method that relies on statistical text features extracted from a single document to select the most important keywords of a text. The system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language or domain. To demonstrate the merits and significance of the proposal, we compare it against ten state-of-the-art unsupervised approaches (TF.IDF, KP-Miner, RAKE, TextRank, SingleRank, ExpandRank, TopicRank, TopicalPageRank, PositionRank and MultipartiteRank) and one supervised method (KEA). Experimental results on twenty datasets (see Benchmark section below) show that our method significantly outperforms these state-of-the-art methods across collections of different sizes, languages and domains. In addition to the Python package described here, we also make available a demo, an API and a mobile app.

Main Features

  • Unsupervised approach
  • Corpus-Independent
  • Domain and Language Independent
  • Single-Document

Where can I find YAKE!?

YAKE! is available online [http://yake.inesctec.pt], as an open source Python package [https://github.com/LIAAD/yake] and on Google Play.
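As a minimal sketch of how the Python package is typically used (not an official example), the snippet below wraps `yake.KeywordExtractor` in a small helper. It assumes the package is installed (`pip install yake`) and that `extract_keywords` returns `(keyword, score)` pairs in which a lower score means a more relevant keyword, as in recent releases. The helper names `extract_keywords` and `most_relevant_first` and the default parameter values are illustrative choices, not part of the package.

```python
def extract_keywords(text, language="en", max_ngram_size=3, top=10):
    """Return up to `top` candidate keywords for `text` as (keyword, score) pairs."""
    # Imported lazily so that the sorting helper below can be used
    # even when the third-party package is not installed.
    import yake  # third-party; install with: pip install yake

    # Assumption: lan, n and top are the constructor keywords for language,
    # maximum n-gram size and number of keywords to return.
    extractor = yake.KeywordExtractor(lan=language, n=max_ngram_size, top=top)
    return extractor.extract_keywords(text)


def most_relevant_first(pairs):
    """Sort (keyword, score) pairs so the lowest (most relevant) score comes first."""
    return sorted(pairs, key=lambda pair: pair[1])
```

Calling `most_relevant_first(extract_keywords(document_text))` on the text of a single document should then yield its candidate keywords ordered from most to least relevant; no training corpus or dictionary is required.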

References

Please cite the following works when using YAKE

In-depth journal paper at Information Sciences Journal

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). YAKE! Keyword Extraction from Single Documents using Multiple Local Features. In Information Sciences Journal. Elsevier, Vol. 509, pp. 257-289.

ECIR'18 Best Short Paper

Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26-29). Lecture Notes in Computer Science, vol. 10772, pp. 684-691.

Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). YAKE! Collection-independent Automatic Keyword Extractor. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26-29). Lecture Notes in Computer Science, vol. 10772, pp. 806-810.

Awards

ECIR'18 Best Short Paper

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yake-0.4.6.tar.gz (402.0 kB view details)

Uploaded Source

Built Distribution

yake-0.4.6-py2.py3-none-any.whl (60.0 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file yake-0.4.6.tar.gz.

File metadata

  • Download URL: yake-0.4.6.tar.gz
  • Size: 402.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3

File hashes

Hashes for yake-0.4.6.tar.gz
Algorithm    Hash digest
SHA256       9f5e02f6f541a2e3047060b2d0a47b216c239ec93aab353c444ba2272d18c9ab
MD5          5edb96ff0eaecf66e91cdad41803b014
BLAKE2b-256  ae618bb2a8e40ed97287f9e2840ccb6f0e7668329f70e4e5b0ff4afa1da7a15b

File details

Details for the file yake-0.4.6-py2.py3-none-any.whl.

File metadata

  • Download URL: yake-0.4.6-py2.py3-none-any.whl
  • Size: 60.0 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3

File hashes

Hashes for yake-0.4.6-py2.py3-none-any.whl
Algorithm    Hash digest
SHA256       5e7a841b4cb2e3da5606c4ea3fc5004c1e255749ab14b23ae51d77188118849f
MD5          96a360ca0830f3cb458c44279d539ced
BLAKE2b-256  e8afd6de3c1a3808e6158a27a0034ecd3bfe6212ae2a4e4df70fe6f26ba796e9

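The SHA256 digests listed above can be checked against a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are illustrative, and the local path passed in is whatever location the file was saved to.

```python
import hashlib

# SHA256 digests copied from the "File hashes" listings on this page.
EXPECTED_SHA256 = {
    "yake-0.4.6.tar.gz":
        "9f5e02f6f541a2e3047060b2d0a47b216c239ec93aab353c444ba2272d18c9ab",
    "yake-0.4.6-py2.py3-none-any.whl":
        "5e7a841b4cb2e3da5606c4ea3fc5004c1e255749ab14b23ae51d77188118849f",
}


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, filename):
    """Return True if the file at `path` has the digest published for `filename`."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

For example, `matches_published_hash("downloads/yake-0.4.6.tar.gz", "yake-0.4.6.tar.gz")` returns `True` only when the local copy is byte-for-byte identical to the published source distribution.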
