Text preprocessing package
DataAnalysis
DataAnalysis is a library for preprocessing the contents of a CSV file.
Parameters:
input_file: name of the input file, including the .csv extension
api_small_talks: URL of the small talks API
content_column: name or index of the content column in the CSV file
encoding: encoding of the file
sep: separator used in the file
batch: number of batches to use with the small talks API
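The sketch below illustrates how parameters such as content_column, sep, and batch could be interpreted, using only the Python standard library. It is not the DataAnalysis implementation: the helper names read_content_column and split_into_batches are hypothetical, the column index is assumed to be zero-based, and batch is assumed to mean the number of groups the rows are split into before being sent to the API.

```python
import csv
import io


def read_content_column(stream, content_column, sep=","):
    """Return the values of one column from CSV text (hypothetical helper).

    content_column may be a header name (str) or a zero-based index (int).
    """
    reader = csv.reader(stream, delimiter=sep)
    header = next(reader)
    # Resolve a column name to its position; an int is used as-is.
    if isinstance(content_column, int):
        index = content_column
    else:
        index = header.index(content_column)
    # Skip completely empty lines.
    return [row[index] for row in reader if row]


def split_into_batches(items, batch):
    """Split items into at most `batch` consecutive groups (assumed meaning)."""
    size = max(1, -(-len(items) // batch))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


# In-memory sample; a real file would be opened with
# open(input_file, encoding=encoding, newline="").
sample = io.StringIO("id;content\n1;Oi, tudo bem?\n2;Quero meu boleto\n3;Obrigado!\n")
messages = read_content_column(sample, "content", sep=";")
batches = split_into_batches(messages, 2)
print(batches)
```

Accepting either a name or an index keeps the helper usable for files with and without meaningful header labels.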
Installation
Use the pip package manager to install DataAnalysis:
pip install DataAnalysis
Usage
import DataAnalysis as da
p = da.PreProcessing(input_file, api_small_talks, content_column, encoding, sep, batch)
p.process(output_file, lower=True, punctuation=True, abbreviation=True, typo=True, small_talk=True,
          emoji=True, wa_emoji=True, accentuation=True, number=True, relevant=False, cpf=True,
          url=True, email=True, money=True, code=True, time=True, date=True, tagging=True)
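To give a feel for what flag-driven preprocessing can look like, here is a minimal standard-library sketch covering four of the flags above (lower, punctuation, url, email). It is an assumption-laden illustration, not the behavior of p.process: the function clean_text, the regular expressions, and the URL and EMAIL placeholder tokens are all hypothetical, and the library may remove or tag these items differently.

```python
import re
import string

# Simplified patterns chosen for illustration only.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def clean_text(text, lower=True, punctuation=True, url=True, email=True):
    """Apply a few illustrative normalization steps, each toggled by a flag."""
    if url:
        # Replace URLs with a placeholder token (assumed convention).
        text = URL_RE.sub("URL", text)
    if email:
        # Replace e-mail addresses with a placeholder token (assumed convention).
        text = EMAIL_RE.sub("EMAIL", text)
    if lower:
        text = text.lower()
    if punctuation:
        # Drop ASCII punctuation characters.
        text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse whitespace left behind by the removals.
    return " ".join(text.split())


cleaned = clean_text("Visit https://example.com or mail Me@Example.org!")
print(cleaned)
```

URLs and e-mail addresses are replaced before punctuation is stripped, since removing punctuation first would break the patterns that detect them.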
License
File details
Details for the file DataAnalysis-0.0.6.tar.gz.
File metadata
- Download URL: DataAnalysis-0.0.6.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.7.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 79750cac81217399a8f214c7f228023afa13f8a5774d016e3cb59b193d5d3c64
MD5 | 27c381974785dc66120e35bcbe0c6500
BLAKE2b-256 | 320f8b4483d0637e2a0f580ca1b81595b0761d2c767cc729db5ae78e289c50c2