Project description

Twittenizer: Convert stupid text into smart features.

What does this lib do?

A tokenizer created specifically for messages posted on Twitter (known as tweets). The constraints Twitter imposes on messages lead users to ignore typographical standards. This tokenizer aims to remove as much of the resulting noise as possible while preserving as much of the tweet's information as possible.

It is built on top of the NLTK Twitter tokenizer.

Installation

pip install twittenizer

Example

>>> from twittenizer import Tokenizer
>>> tokenizer = Tokenizer()
>>> tokenizer.tokenize("Here is my website: https://t.co/EZWeDhjl, check it out! ")
['Here', 'is', 'my', 'website', 'check', 'it', 'out']
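
To give a rough idea of the kind of noise reduction shown in the example, here is a minimal sketch that uses only the Python standard library. It is not Twittenizer's actual implementation (which builds on the NLTK tokenizer). The function name `simple_tweet_tokens` and both regular expressions are assumptions made purely for illustration.

```python
import re

# Hypothetical patterns, chosen for illustration only.
URL_RE = re.compile(r"https?://\S+")
WORD_RE = re.compile(r"[A-Za-z0-9_#@']+")


def simple_tweet_tokens(text):
    """Drop URLs, then keep word-like runs (letters, digits, _, #, @, ')."""
    without_urls = URL_RE.sub(" ", text)
    return WORD_RE.findall(without_urls)


print(simple_tweet_tokens("Here is my website: https://t.co/EZWeDhjl, check it out! "))
```

On this particular input the sketch yields the same tokens as the example, but a real tweet tokenizer has to handle many more cases (emoticons, elongated words, mentions) than two regular expressions can cover.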

Licence

Documentation

Download files

Source Distribution

twittenizer-0.0.5.tar.gz (10.4 kB)

Uploaded Source

Built Distribution

twittenizer-0.0.5-py3-none-any.whl (20.7 kB)

Uploaded Python 3

File details

Details for the file twittenizer-0.0.5.tar.gz.

File metadata

  • Download URL: twittenizer-0.0.5.tar.gz
  • Size: 10.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.4.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.3

File hashes

Hashes for twittenizer-0.0.5.tar.gz
Algorithm Hash digest
SHA256 5ddb129e574592af690d3799577c1859592ad8cbab022b49040e94c1e9d33732
MD5 a5fdb46aa2e30ab87e5805d26cbd845a
BLAKE2b-256 2a08693bfe8bb82c61af748418e615252b3ca0d263b0b74076b001dd2694aac5
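
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the sketch assumes the file has already been downloaded to a local path.

```python
import hashlib

# SHA256 digest listed above for twittenizer-0.0.5.tar.gz.
EXPECTED_SHA256 = "5ddb129e574592af690d3799577c1859592ad8cbab022b49040e94c1e9d33732"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` hashes to `expected`."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("twittenizer-0.0.5.tar.gz")` should return `True` for an intact copy of the source distribution, assuming it was saved under that name in the current directory.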

File details

Details for the file twittenizer-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: twittenizer-0.0.5-py3-none-any.whl
  • Size: 20.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/46.4.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.3

File hashes

Hashes for twittenizer-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 b582cb39f2b8ae29c66b2b435f78e4cd170532760c89610f92d61decd612dbe9
MD5 d33a327f0959e774aa7f76d1eebfb649
BLAKE2b-256 fa31eaa75874fcae53884c2c62d3698b1c9c96c42ca7dd394b50590b66e5d180