A set of functions that uses scikit-learn to conduct a TF-IDF analysis and generate keywords from an event-based or custom-grouped textual corpus.

Project description

evekeys: Isolate keywords from an event-based and custom-grouped textual corpus

By Chris Lindgren

Distributed under the BSD 3-clause license. See LICENSE.txt for details.



A set of functions that uses scikit-learn to conduct a TF-IDF analysis to isolate keywords from event-based documents. It answers the following questions:

  1. What keywords represent a particular period of content?
  2. What keywords represent a particular group of content from a particular period?

It assumes you have:

  • imported your corpus as a pandas DataFrame,
  • included metadata information, such as a list of dates and list of groups to reorganize your corpus, and
  • pre-processed your documents.
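As a rough sketch of what such an input might look like, the snippet below builds a small DataFrame and reorganizes it by a list of dates and a list of groups. The column names `date`, `group`, and `text` are hypothetical; evekeys may expect different ones.

```python
import pandas as pd

# Hypothetical column names; evekeys may expect different ones.
corpus = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "group": ["A", "B", "A"],
    "text": [
        "storm coast warning",
        "storm relief donation",
        "election city turnout",
    ],
})

dates = ["2020-01-01"]   # periods of interest
groups = ["A", "B"]      # groups of interest

# Keep only rows that fall in the requested periods and groups.
subset = corpus[corpus["date"].isin(dates) & corpus["group"].isin(groups)]

# Build one merged document per (period, group) pair.
grouped_docs = (
    subset.groupby(["date", "group"])["text"]
    .apply(" ".join)
    .reset_index()
)
```

Each row of `grouped_docs` is then a single merged document for one group in one period, which is the unit a TF-IDF analysis would compare.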

It requires Python 3.x and is not backwards-compatible with Python 2.

Warning: evekeys performs little to no custom error-handling, so make sure your inputs are formatted properly. If you have questions, please let me know via email.
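Because of this, it may help to validate your corpus yourself before passing it along. The check below is only a sketch: `check_corpus` is a hypothetical helper, not something evekeys provides, and the column names are assumptions.

```python
import pandas as pd


def check_corpus(df, text_col="text", meta_cols=("date", "group")):
    """Raise ValueError early if the corpus is not shaped as expected.

    Hypothetical helper; the column names are assumptions.
    """
    if not isinstance(df, pd.DataFrame):
        raise ValueError("corpus must be a pandas DataFrame")

    required = [text_col, *meta_cols]
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    # Missing dates, groups, or texts would silently distort the grouping.
    if df[required].isna().any().any():
        raise ValueError("corpus contains missing values")

    # Every document should already be a pre-processed string.
    if not df[text_col].map(lambda value: isinstance(value, str)).all():
        raise ValueError("every document must be a string")
    return True
```

Calling `check_corpus(corpus)` on a well-formed DataFrame returns `True`; anything else raises a `ValueError` that names the problem.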

System requirements

  • pandas
  • scikit-learn (imported as sklearn)
  • tqdm


Installation

pip install evekeys

Known Issues or Limitations

  • Please contact me if you discover any issues.

Example notebooks

  • Coming soon.

Distribution update terminal commands

# Create new distribution of code for archiving
python setup.py sdist bdist_wheel

# Distribute to Python Package Index
python -m twine upload dist/*

Files for evekeys, version 0.0.2

  • evekeys-0.0.2-py3-none-any.whl (5.3 kB), wheel, Python version py3
  • evekeys-0.0.2.tar.gz (3.9 kB), source distribution
