A tokenizer for Twitter
Project description
Twittenizer: Convert stupid text into smart features.
The goal of this tool is to provide an advanced tokenizer for tweets. It is built on top of the NLTK tokenizer.
Inputs:
- Tweets in .json format
- Raw tweets in .txt format
Output:
- Tokens ready to feed into a neural network
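The description does not show Twittenizer's own API, so the following is only a minimal sketch of the input-to-tokens flow using NLTK's `TweetTokenizer` directly. The function names, the one-JSON-object-per-line layout, and the `"text"` field are assumptions made for illustration, not part of the package.

```python
import json


def load_tweets_json(lines):
    """Yield tweet text from JSON lines.

    Assumption: one tweet object per line, with the body in a "text" field.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)["text"]


def load_tweets_txt(lines):
    """Yield raw tweet text, assuming one tweet per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def tokenize_tweets(texts):
    """Return a list of token lists, one per tweet, using NLTK's TweetTokenizer."""
    # Imported here so the loaders above can be used without NLTK installed.
    from nltk.tokenize import TweetTokenizer

    tokenizer = TweetTokenizer(
        preserve_case=False,  # lowercase tokens (emoticons are kept as-is)
        reduce_len=True,      # shorten long runs of a repeated character
        strip_handles=True,   # drop @username mentions
    )
    return [tokenizer.tokenize(text) for text in texts]
```

With NLTK installed, something like `tokenize_tweets(load_tweets_txt(open("tweets.txt", encoding="utf-8")))` would produce the token lists (the file name here is hypothetical). Mapping tokens to integer ids for a neural network is a separate step that this sketch leaves out.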
Download files
Source Distribution
twittenizer-0.0.1.tar.gz (5.5 kB)
Built Distribution
Hashes for twittenizer-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b8b215429a2ffb15cb8831f8e71c34c757b726709487266dd4ec1b0a8689f49f
MD5 | cb31e2a79719c909e7310be197c8f09c
BLAKE2b-256 | 6db41188e02ef7d913c6863330922179f2a0c4633f48d6119f89dc918f73869f
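A downloaded wheel can be checked against the SHA256 digest listed above with Python's standard `hashlib`. This is a minimal sketch; the helper names and the assumption that the file sits in the current directory are illustrative.

```python
import hashlib

# SHA256 digest listed above for twittenizer-0.0.1-py3-none-any.whl
EXPECTED_SHA256 = "b8b215429a2ffb15cb8831f8e71c34c757b726709487266dd4ec1b0a8689f49f"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("twittenizer-0.0.1-py3-none-any.whl")` should return `True` for an intact download of that file.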