Utility functions for data processing
dataprocess
dataprocess is a Python package that provides simple, efficient utilities for processing and cleaning data.
Features
- Data processing: transform data using dedicated functions.
- Data cleaning: remove null values and prepare data for analysis.
- Modular structure that is easy to extend.
Installation
Install the package from PyPI:
pip install etl-dataprocess
or clone the GitHub repository and install it from source:
git clone https://github.com/botlorien/dataprocess.git
cd dataprocess
pip install .
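To confirm that the installation worked, the installed version can be read back with the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper name, and it assumes the distribution is registered under the name used in the pip command, `etl-dataprocess`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "etl-dataprocess" is the distribution name from the pip command.
    print(installed_version("etl-dataprocess"))
```

If this prints `None`, pip most likely installed the package into a different Python environment than the one running the script.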
Usage example
from dataprocess import dataprocessing as hd

# Placeholder (not part of the package): path to an input file or to its folder
PATH_DOWNLOADS = 'downloads'


def process_something_here():
    """A single example of how to use dataprocess."""
    # import_file reads .xlsx, .csv, .xls, .json and .txt files and returns
    # a DataFrame for .xlsx/.csv/.xls, a dict for .json and a str for .txt.
    # If only a folder is passed, it takes the first file in that folder.
    table = hd.import_file(PATH_DOWNLOADS)

    # clear_table strips white space and other junk from the whole table
    # and returns a DataFrame with every column cast with astype('str').
    table = hd.clear_table(table)

    # After cleaning, convert the columns to the appropriate types.
    # The "dtypes" mapping lists the columns to cast to 'datetime' and 'time';
    # other common types ('int', 'float', 'str') are inferred from the values.
    dtype = {
        'datetime': [
            'date_name_column'  # replace with the name of the column to cast to 'datetime'
        ],
        'time': [
            'hour_and_minute_name_column'  # replace with the name of the column to cast to 'time'
        ]
    }
    table = hd.convert_table_types(
        table,
        dtypes=dtype
    )
    print(table)
    table.info()  # info() prints its summary itself and returns None
    return table


if __name__ == '__main__':
    process_something_here()
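The comments in the example say that values are first cleaned to strings and that common types such as 'int', 'float' and 'str' are then inferred by analysing the values. The sketch below illustrates that idea with the standard library only. It is not the package's implementation: `clean_value` and `infer_column_type` are hypothetical helpers, and the rule used (try `int`, then `float`, otherwise `str`, ignoring empty strings) is an assumption.

```python
def clean_value(value):
    """Return the value as a string with surrounding white space removed."""
    return str(value).strip()


def infer_column_type(values):
    """Guess 'int', 'float' or 'str' for a list of cleaned strings.

    Empty strings are ignored; a column with no usable values is 'str'.
    """
    candidates = [v for v in values if v != ""]
    if not candidates:
        return "str"
    # Try the strictest type first and fall back to the next one on failure.
    for type_name, caster in (("int", int), ("float", float)):
        try:
            for v in candidates:
                caster(v)
        except ValueError:
            continue
        return type_name
    return "str"


if __name__ == "__main__":
    raw = {
        "qty": [" 1", "2 ", "3"],
        "price": ["1.5", " 2", ""],
        "name": [" a ", "b", "c"],
    }
    cleaned = {col: [clean_value(v) for v in vals] for col, vals in raw.items()}
    inferred = {col: infer_column_type(vals) for col, vals in cleaned.items()}
    print(inferred)  # {'qty': 'int', 'price': 'float', 'name': 'str'}
```

Dates and times are ambiguous as plain strings, which is presumably why the example passes those columns explicitly through the `dtypes` mapping rather than relying on inference.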
Download files
Source Distribution
etl_dataprocess-0.2.3.tar.gz (17.1 kB)
File details
Details for the file etl_dataprocess-0.2.3.tar.gz.
File metadata
- Download URL: etl_dataprocess-0.2.3.tar.gz
- Upload date:
- Size: 17.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d41510452c68ca83900731fe7f7cad68234296ba520a1887b017a56a37c63f68 |
| MD5 | 4c99c3182c00363aa76f8c168998b10b |
| BLAKE2b-256 | 8d348efcb642ef34118ca55f66dcd51651170e235d7cd74f9d3b9002c4fb27ce |
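A downloaded archive can be checked against the SHA256 digest listed for the file. The following is a minimal sketch using `hashlib`; `sha256_of_file` and `matches_expected` are hypothetical helper names, and the local path in the usage lines assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hexadecimal string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, with the archive saved locally (assumed location):

```python
expected = "d41510452c68ca83900731fe7f7cad68234296ba520a1887b017a56a37c63f68"
print(matches_expected("etl_dataprocess-0.2.3.tar.gz", expected))
```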