Fast text processing
Fast text processing for Python.
Run
    cd /[project_path]
    docker build -t retexto .
    docker run -v $(pwd):/retexto:rw -it retexto bash
Basic Use
    from retexto import ReTexto  # assumed import path; the original snippet omits the import

    if __name__ == '__main__':
        # Sample input mixing mentions, URLs, HTML, hashtags, emoji and repeated letters.
        s = (
            '@Edux87, i need this www.google.com | https://github.com <br> '
            '<strong>UserName: çarlos </strong> '
            "i'm from Perú 😛 "
            '#Friends #Text jajajajaja so fffunny '
            'loooveee thiiis 😌😎 '
            '@florenciaflor19 Si!!! sé vo… 🐷JUANA🐷 '
            'smile! haha jejeje jojojo jujuju jijijijajaja 😂'
        )
        text = ReTexto(s)
        # Each method returns the object, so the cleaning steps can be chained.
        s = (
            text.remove_html()
            .remove_mentions()
            .remove_tags()
            .remove_smiles(by='SMILING')
            .convert_specials()
            .convert_emoji()
            .remove_nochars(preserve_tilde=True)
            .remove_url()
            .remove_duplicate(r='a-jp-z')
            .remove_duplicate_vowels()
            .remove_duplicate_consonants()
            .remove_punctuation()
            .remove_multispaces()
            .lower()
            .remove_stopwords()
            .split_words(uniques=True)
        )
        print(s)
        # Output:
        # ['username', 'from', 'love', 'i', 'ned', 'funy', 'juana', 'vo', 'this',
        #  'si', 'im', 'se', 'peru', 'smile', 'so', 'smiling', 'carlos']
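To give a rough idea of what a few of these steps do, here is a minimal standard-library sketch. It is not retexto's implementation: the regular expressions and the helper names `strip_accents` and `clean` are assumptions made for illustration, and it covers only HTML tags, URLs, mentions, hashtags, accents, and repeated characters.

```python
import re
import unicodedata
from typing import List

# Hypothetical patterns for illustration; these are not retexto's own rules.
HTML_TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
REPEAT_RE = re.compile(r"(.)\1+")


def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop combining marks, e.g. 'Perú' -> 'Peru'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def clean(text: str) -> List[str]:
    """Apply a fixed sequence of cleaning steps and return lowercase words."""
    for pattern in (HTML_TAG_RE, URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    text = strip_accents(text)
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # keep ASCII letters and whitespace only
    text = REPEAT_RE.sub(r"\1", text)         # collapse runs of a repeated character
    return text.lower().split()


print(clean("@Edux87 loooveee thiiis <br> https://github.com from Perú #Text"))
# ['love', 'this', 'from', 'peru']
```

Collapsing every run of repeated characters is a blunt rule: it also turns "need" into "ned", the same kind of result visible in the example output above.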
Download files
Source Distribution
retexto-1.6.1.tar.gz (26.0 kB)
Built Distribution
retexto-1.6.1-py2.py3-none-any.whl (50.7 kB)
File details
Details for the file retexto-1.6.1.tar.gz.
File metadata
- Download URL: retexto-1.6.1.tar.gz
- Upload date:
- Size: 26.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | daa0e34c80de90b64404f10e00f851f24c63186f6e62af3c90b3d0ee9ba3de05
MD5 | 2f94287ff4d88368ce4a8c6ab15d20ff
BLAKE2b-256 | 36eb5d05d6c65cc715f0f0c03d809ef7972b2e05b9a712d84811569cd82b815f
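A manually downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_digest` and the local file location are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for retexto-1.6.1.tar.gz.
EXPECTED_SHA256 = "daa0e34c80de90b64404f10e00f851f24c63186f6e62af3c90b3d0ee9ba3de05"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("retexto-1.6.1.tar.gz")  # assumed download location
    if archive.exists():
        print("digest OK" if matches_published_digest(archive) else "digest MISMATCH")
```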
File details
Details for the file retexto-1.6.1-py2.py3-none-any.whl.
File metadata
- Download URL: retexto-1.6.1-py2.py3-none-any.whl
- Upload date:
- Size: 50.7 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6e5860ece62f264476f244f0fab4eefd88041d186ffebd7cb74cb79179855b1e
MD5 | fb97e4e20b0b3c4757a32f6001ef0e5f
BLAKE2b-256 | 49507ab2978004df996e373416b141c036c7ade565a027748050c2589e6ef873