Fast text processing
Project description
Fast text processing for Python.
Run
    cd /[project_path]
    docker build -t retexto .
    docker run -v $(pwd):/retexto:rw -it retexto bash
Basic Use
    # Import path assumed from the package name; adjust if your install differs.
    from retexto import ReTexto

    if __name__ == '__main__':
        # Sample input mixing mentions, URLs, HTML, hashtags, emoji and repeated letters.
        s = (
            '@Edux87, i need this www.google.com | https://github.com <br> '
            '<strong>UserName: çarlos </strong> '
            "i'm from Perú 😛 "
            '#Friends #Text jajajajaja so fffunny '
            'loooveee thiiis 😌😎 '
            '@florenciaflor19 Si!!! sé vo… 🐷JUANA🐷 '
            'smile! haha jejeje jojojo jujuju jijijijajaja 😂'
        )
        text = ReTexto(s)
        # Each step returns the text object, so the calls can be chained;
        # split_words() ends the chain and returns a list.
        s = (
            text.remove_html()
            .remove_mentions()
            .remove_tags()
            .remove_smiles(by='SMILING')
            .convert_specials()
            .convert_emoji()
            .remove_nochars(preserve_tilde=True)
            .remove_url()
            .remove_duplicate(r='a-jp-z')
            .remove_duplicate_vowels()
            .remove_duplicate_consonants()
            .remove_punctuation()
            .remove_multispaces()
            .lower()
            .remove_stopwords()
            .split_words(uniques=True)
        )
        print(s)

Output:

    ['username', 'from', 'love', 'i', 'ned', 'funy', 'juana', 'vo', 'this', 'si', 'im', 'se', 'peru', 'smile', 'so', 'smiling', 'carlos']
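To make the effect of a few of the chained steps more concrete, here is a rough standard-library sketch. It is not retexto's implementation: the function name `rough_clean` and every regular expression below are assumptions chosen for illustration, and steps such as emoji conversion, accent handling and stopword removal are left out.

```python
import re

# Hypothetical patterns for illustration only; retexto's own rules may differ.
HTML_TAG_RE = re.compile(r'<[^>]+>')
URL_RE = re.compile(r'(?:https?://|www\.)\S+')
MENTION_RE = re.compile(r'@\w+')
HASHTAG_RE = re.compile(r'#\w+')
REPEATED_LETTER_RE = re.compile(r'([a-z])\1+')
PUNCTUATION_RE = re.compile(r'[^\w\s]')
WHITESPACE_RE = re.compile(r'\s+')


def rough_clean(text):
    """Approximate a handful of cleaning steps with plain regexes."""
    text = HTML_TAG_RE.sub(' ', text)      # drop HTML tags such as <br>
    text = URL_RE.sub(' ', text)           # drop URLs
    text = MENTION_RE.sub(' ', text)       # drop @mentions
    text = HASHTAG_RE.sub(' ', text)       # drop #hashtags
    text = text.lower()
    text = REPEATED_LETTER_RE.sub(r'\1', text)  # 'loooveee' -> 'love'
    text = PUNCTUATION_RE.sub(' ', text)   # drop punctuation and symbols
    return WHITESPACE_RE.sub(' ', text).strip()


if __name__ == '__main__':
    sample = '@Edux87 loooveee thiiis <br> www.google.com #Friends so fffunny!'
    print(rough_clean(sample))  # prints: love this so funy
```

Collapsing every run of repeated letters also shortens legitimate doubles ('need' becomes 'ned', 'funny' becomes 'funy'), which is consistent with the sample output shown above.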
Download files
Source Distribution
retexto-1.6.1.tar.gz (26.0 kB)
Built Distribution
Hashes for retexto-1.6.1-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 6e5860ece62f264476f244f0fab4eefd88041d186ffebd7cb74cb79179855b1e
MD5 | fb97e4e20b0b3c4757a32f6001ef0e5f
BLAKE2b-256 | 49507ab2978004df996e373416b141c036c7ade565a027748050c2589e6ef873
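
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the usage note is an assumption about where the wheel was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest('retexto-1.6.1-py2.py3-none-any.whl', '6e5860ece62f264476f244f0fab4eefd88041d186ffebd7cb74cb79179855b1e')` would be expected to return `True` for an intact copy of the wheel saved in the current directory.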