Twittenizer: Convert stupid text into smart features.
What does this lib do?
A tokenizer created specifically for messages posted on Twitter (known as tweets). The constraints Twitter imposes while a message is being written push users away from typographical standards. The purpose of this tokenizer is to reduce as much as possible the noise induced by those constraints, while keeping as much of the information available in the tweet as possible.
It is built on top of the NLTK Twitter tokenizer.
pip install twittenizer
>>> from twittenizer import Tokenizer
>>> tokenizer = Tokenizer()
>>> tokenizer.tokenize("Here is my website: https://t.co/EZWeDhjl, check it out! ")
['Here', 'is', 'my', 'website', 'check', 'it', 'out']
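To give a feel for the kind of noise reduction involved, here is a minimal, illustrative sketch of the idea (this is not twittenizer's actual implementation, and `naive_tweet_tokenize` is a hypothetical name): drop shortened t.co links, then keep only word tokens.

```python
import re

# Sketch only: strip URLs, then extract word tokens, discarding punctuation noise.
URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")

def naive_tweet_tokenize(text):
    """Remove URLs from a tweet, then split the remainder into word tokens."""
    text = URL_RE.sub(" ", text)
    return TOKEN_RE.findall(text)

print(naive_tweet_tokenize("Here is my website: https://t.co/EZWeDhjl, check it out! "))
# → ['Here', 'is', 'my', 'website', 'check', 'it', 'out']
```

The real library goes further than this sketch, since it builds on NLTK's Twitter tokenizer, but the input/output above matches the example session.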