Small (magickal) Tweet Processor

Project description

Magick Tweet Preprocessor Vihaus Ljovan

Magick Tweet Processor is a small program that does some NLP magick on tweet strings. Its command-line interface lets you choose the language (English or Spanish) and which modifications the program should apply to the original string (tokenisation, hiding URLs, hiding @-mentions, etc.).

We chose the MIT License because "I want it simple and permissive" sounded perfect for our use case. We also read through LICENSE.txt and were happy with its terms.

Example: process the tweets in the file 'tweets.txt' without emoji removal, but with stopword, hashtag and URL removal as well as anonymization of mentions. Every step runs unless its --no_* flag is given, so only emoji removal has to be switched off:

tpp --file tweets.txt --no_emoji_removal
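To make the configuration in this example concrete, here is a minimal sketch of what such a pipeline could look like. It is not the package's actual code: the regular expressions, the `@USER` placeholder, the tiny stopword list and the `preprocess` function are all assumptions made for illustration, and tokenisation is reduced to a whitespace split.

```python
import re

# Hypothetical patterns and stopword list; the package's own rules may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
STOPWORDS_EN = {"a", "an", "the", "is", "at", "of", "and", "to", "in"}


def preprocess(tweet, remove_urls=True, remove_hashtags=True,
               anonymize=True, remove_stopwords=True):
    """Apply the enabled steps and return the remaining tokens."""
    if remove_urls:
        tweet = URL_RE.sub("", tweet)
    if remove_hashtags:
        tweet = HASHTAG_RE.sub("", tweet)
    if anonymize:
        # Replace every @-mention with a fixed placeholder (assumed name).
        tweet = MENTION_RE.sub("@USER", tweet)
    tokens = tweet.split()  # naive whitespace tokenisation
    if remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS_EN]
    return tokens


if __name__ == "__main__":
    # Emoji are left untouched, mirroring --no_emoji_removal.
    print(preprocess("Look at the moon 🌙 with @alice #night https://example.com/x"))
```

Each step is an independent switch, which mirrors the one-flag-per-step design of the command line; emoji removal is simply left out here because the example disables it.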

All possible flags:

  -h, --help                 Show this help message and exit
  -f, --file                 Use file(s) instead of string.
  -u, --no_url_removal       Process without url-removal
  -E, --no_emoji_removal     Process without emoji-removal
  -H, --no_hashtag_removal   Process without hashtag-removal
  -a, --no_anonymize         Process without anonymization
  -S, --no_stopword_removal  Process without stopword-removal
  -e, --english              Set language to English (already the default)
  -s, --spanish              Set language to Spanish
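For readers who want to see how a flag set like this can be declared, the following sketch rebuilds it with Python's standard `argparse` module. Whether the program itself uses `argparse` is an assumption, as are the positional `inputs` argument, the treatment of `--file` as a boolean switch, and the `language` attribute name; only the flag names and help texts come from the list above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="tpp")
    # Assumption: inputs are positional and --file changes how they are read.
    parser.add_argument("inputs", nargs="+",
                        help="Tweet string(s), or file path(s) with --file")
    parser.add_argument("-f", "--file", action="store_true",
                        help="Use file(s) instead of string.")
    parser.add_argument("-u", "--no_url_removal", action="store_true",
                        help="Process without url-removal")
    parser.add_argument("-E", "--no_emoji_removal", action="store_true",
                        help="Process without emoji-removal")
    parser.add_argument("-H", "--no_hashtag_removal", action="store_true",
                        help="Process without hashtag-removal")
    parser.add_argument("-a", "--no_anonymize", action="store_true",
                        help="Process without anonymization")
    parser.add_argument("-S", "--no_stopword_removal", action="store_true",
                        help="Process without stopword-removal")
    # Only one language can be active at a time.
    lang = parser.add_mutually_exclusive_group()
    lang.add_argument("-e", "--english", dest="language",
                      action="store_const", const="english",
                      help="Set language to English (already the default)")
    lang.add_argument("-s", "--spanish", dest="language",
                      action="store_const", const="spanish",
                      help="Set language to Spanish")
    parser.set_defaults(language="english")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--file", "tweets.txt", "--no_emoji_removal"])
    print(args.file, args.inputs, args.no_emoji_removal, args.language)
```

Because every processing flag is a `store_true` opt-out, an invocation with no flags runs all steps, which matches the example command above.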

Download files

Source Distribution

magick_tweet_preprocessor-0.0.1.tar.gz (5.3 kB)

Built Distribution

magick_tweet_preprocessor-0.0.1-py3-none-any.whl (7.2 kB)

File details

Details for the file magick_tweet_preprocessor-0.0.1.tar.gz.

File metadata

  • Download URL: magick_tweet_preprocessor-0.0.1.tar.gz
  • Upload date:
  • Size: 5.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for magick_tweet_preprocessor-0.0.1.tar.gz:

  Algorithm    Hash digest
  SHA256       04ee7690532c026bee9d9d57b58b7b1736d8d9f41e988fc581b3a9cb227ce90a
  MD5          0206c268e369c3c6c2a07548b7ba4628
  BLAKE2b-256  e40c2e184883a3c7ce9c667f07e4bacc36a003a22802eec219d472f107937d10
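
The digests above can be used to check a downloaded file before installing it. The sketch below shows one way to do that with the standard `hashlib` module; the helper names `sha256_of` and `matches_expected` are made up for this example, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = "04ee7690532c026bee9d9d57b58b7b1736d8d9f41e988fc581b3a9cb227ce90a"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected
```

For example, `matches_expected("magick_tweet_preprocessor-0.0.1.tar.gz")` should return `True` for an unmodified copy of the source distribution.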

File details

Details for the file magick_tweet_preprocessor-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: magick_tweet_preprocessor-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for magick_tweet_preprocessor-0.0.1-py3-none-any.whl:

  Algorithm    Hash digest
  SHA256       8a9c98a4c91d6aab08302985c77a9a252d547676c16fce2fec260efe893c010f
  MD5          2434fa95f5f72dca2047538ce56a893f
  BLAKE2b-256  b44dd18b86a0cb0fc4b5f91db0a701ff3d32c6895694b2900c1607a1ac4d9cdf
