Lemmatizer for the Spanish language

# reTexto
Fast text processing for Python

### Run

    cd /[project_path]
    docker build -t retext .
    docker run -v $(pwd):/retext:rw -it retext bash

### Test

    invoke test

### Work in

    docker run -v $(pwd):/retext:rw -it retext bash

## Basic Use

    # Assumed import path; the original snippet does not show it.
    from retexto import ReTexto

    if __name__ == '__main__':
        # Noisy sample: mention, URLs, HTML, emoji, hashtags, stretched words.
        s = (
            '@Edux87, i need this www.google.com | https://github.com <br> '
            '<strong>UserName: çarlos </strong> '
            "i'm from Perú 😛 "
            '#Friends #Text jajajajaja so fffunny '
            'loooveee thiiis 😌😎 '
            '@florenciaflor19 Si!!! sé vo… 🐷JUANA🐷 '
            'smile! haha jejeje jojojo jujuju jijijijajaja 😂'
        )

        text = ReTexto(s)
        # Each method returns the object, so the cleaning steps can be chained.
        words = (
            text.remove_html()
            .remove_mentions()
            .remove_tags()
            .remove_smiles(by='SMILING')
            .convert_specials()
            .convert_emoji()
            .remove_nochars(preserve_tilde=True)
            .remove_url()
            .remove_duplicate(r='a-jp-z')
            .remove_duplicate_vowels()
            .remove_duplicate_consonants()
            .remove_punctuation()
            .remove_multispaces()
            .lower()
            .remove_stopwords()
            .split_words(uniques=True)
        )
        print(words)
        # Output shown in the original README:
        # ['username', 'from', 'love', 'i', 'ned', 'funy', 'juana', 'vo', 'this', 'si', 'im', 'se', 'peru', 'smile', 'so', 'smiling', 'carlos']
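
To give a feel for what a few of the steps above do, here is a minimal standalone sketch using only the standard library. It is not reTexto's implementation: the helper names (`strip_mentions`, `strip_urls`, `fold_accents`, `collapse_repeats`, `squeeze_spaces`) and the regular expressions are assumptions made for illustration, and the real methods may behave differently at the edges.

```python
import re
import unicodedata


def strip_mentions(text: str) -> str:
    # Hypothetical stand-in for remove_mentions(): drop @handles.
    return re.sub(r"@\w+", "", text)


def strip_urls(text: str) -> str:
    # Hypothetical stand-in for remove_url(): drop http(s) links and bare www hosts.
    return re.sub(r"(?:https?://|www\.)\S+", "", text)


def fold_accents(text: str) -> str:
    # Decompose accented letters, then drop the combining marks ("ú" -> "u").
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def collapse_repeats(text: str) -> str:
    # Reduce every run of one repeated character to a single occurrence.
    return re.sub(r"(.)\1+", r"\1", text)


def squeeze_spaces(text: str) -> str:
    # Collapse whitespace runs and trim both ends.
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "@Edux87, i need this www.google.com | https://github.com "
    "çarlos Perú loooveee thiiis"
)
cleaned = sample
for step in (strip_mentions, strip_urls, fold_accents, collapse_repeats, squeeze_spaces):
    cleaned = step(cleaned)
cleaned = cleaned.lower()
print(cleaned)
```

Collapsing every repeated character is what turns `need` into `ned` and `fffunny` into `funy`, which matches the tokens in the README output. Note that `fold_accents` as sketched also turns `ñ` into `n`, so a Spanish pipeline may want to exempt that letter.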
