Utility functions for data processing
dataprocess
dataprocess is a Python package that provides simple, efficient utilities for processing and cleaning data.
Features
- Data processing: transform data with dedicated functions.
- Data cleaning: remove null values and prepare data for analysis.
- Modular structure that is easy to extend.
Installation
Install the package from PyPI:
pip install etl-dataprocess
or clone the GitHub repository and install it from source:
git clone https://github.com/botlorien/dataprocess.git
cd dataprocess
pip install .
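To confirm that the installation worked, a minimal check along these lines may help. It uses only the standard library's importlib.metadata and the distribution name etl-dataprocess from the pip command above; the helper name installed_version is a hypothetical one chosen for this sketch and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of dataprocess.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("etl-dataprocess")
    print(found if found else "etl-dataprocess is not installed")
```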
Usage example
from dataprocess import dataprocessing as hd

# Placeholder: replace with the path to your input file or folder
PATH_DOWNLOADS = 'path/to/downloads'


def process_something_here():
    """A single example of how to use dataprocess."""
    # Import a file, checking whether it is .xlsx, .csv, .xls, .json or .txt.
    # The content is returned as a 'DataFrame' for .xlsx, .csv and .xls,
    # a 'dict' for .json and a 'str' for .txt.
    # If only a folder is passed, the first file in that folder is used.
    table = hd.import_file(PATH_DOWNLOADS)

    # Clean the whole table, removing white space and other junk,
    # and return a 'DataFrame' with all columns astype('str').
    table = hd.clear_table(table)

    # After cleaning, convert the columns to the appropriate types.
    # The "dtypes" mapping lists the columns to be cast to 'datetime' and 'time'.
    # Other common types such as 'int', 'float' and 'str' are
    # inferred automatically from the values.
    dtype = {
        'datetime': [
            'date_name_column'  # replace with the name of the column to be cast to 'datetime'
        ],
        'time': [
            'hour_and_minute_name_column'  # replace with the name of the column to be cast to 'time'
        ]
    }
    table = hd.convert_table_types(
        table,
        dtypes=dtype
    )
    print(table)
    table.info()  # info() prints its own summary and returns None
    return table


if __name__ == '__main__':
    process_something_here()
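For readers who want to see what the three steps above amount to, here is a rough sketch in plain pandas. It is not the implementation used by dataprocess: the function names load_any, clean_table and cast_columns are hypothetical, and details such as "first file" meaning first in sorted name order, or the 'time' format being HH:MM, are assumptions made for the example.

```python
import json
from pathlib import Path

import pandas as pd


def load_any(path):
    """Hypothetical loader: choose a reader from the file extension."""
    path = Path(path)
    if path.is_dir():
        # Assumption: "first file" means first in sorted name order.
        files = sorted(p for p in path.iterdir() if p.is_file())
        if not files:
            raise FileNotFoundError(f"no files in {path}")
        path = files[0]
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xlsx", ".xls"):
        return pd.read_excel(path)  # needs an Excel engine such as openpyxl
    if suffix == ".json":
        with path.open(encoding="utf-8") as fh:
            return json.load(fh)
    if suffix == ".txt":
        return path.read_text(encoding="utf-8")
    raise ValueError(f"unsupported file type: {suffix}")


def clean_table(df):
    """Hypothetical cleaner: all columns as str, surrounding whitespace stripped."""
    out = df.astype(str)
    out.columns = [str(c).strip() for c in out.columns]
    return out.apply(lambda col: col.str.strip())


def cast_columns(df, dtypes):
    """Hypothetical caster: explicit 'datetime'/'time' lists, numbers inferred."""
    out = df.copy()
    special = set()
    for name in dtypes.get("datetime", []):
        out[name] = pd.to_datetime(out[name], errors="coerce")
        special.add(name)
    for name in dtypes.get("time", []):
        # Assumption: times are written as HH:MM.
        parsed = pd.to_datetime(out[name], format="%H:%M", errors="coerce")
        out[name] = parsed.dt.time
        special.add(name)
    for name in out.columns:
        if name in special:
            continue
        # Keep the numeric version only if every value parsed cleanly.
        numeric = pd.to_numeric(out[name], errors="coerce")
        if numeric.notna().all():
            out[name] = numeric
    return out
```

The explicit 'datetime' and 'time' lists mirror the "dtypes" mapping in the example above: date and time strings are hard to infer reliably, whereas numeric columns can be detected by checking that every value parses.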