
Normalize CSV files so the output CSV always has the same columns

Project description


Info


This is a simple script that ensures all your .csv files always have the same columns, in the same order. It addresses two of the most common issues with .csv files:

  • Some systems don't respect the column order
  • Some systems don't add a column when there is no data for it

The script resolves both cases by normalizing each file: it ensures the order of the columns is always the same and adds any missing columns with empty data.

For example, suppose you have a meteorological station that should always generate a .csv with the following columns:

Temperature, Humidity, Radiation, Wind, Wind gust

But sometimes one of the sensors has no data, and instead of writing all the columns the station generates a partial .csv:

Temperature, Humidity, Wind, Wind gust

In this case the software that processes the .csv could fail, so you can use csv_normalizer to ensure the .csv file always has these columns:

Temperature, Humidity, Radiation, Wind, Wind gust

Here csv_normalizer adds the missing column (Radiation) with empty data. It also ensures the order of the columns is always the same.
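The normalization idea can be sketched with Python's standard `csv` module. This is only an illustration, not the package's actual implementation: the function name `normalize_csv_text` is hypothetical, and dropping columns that are not in the expected list is an assumption of this sketch.

```python
import csv
import io

# Expected header order, taken from the weather station example above.
EXPECTED_COLUMNS = ["Temperature", "Humidity", "Radiation", "Wind", "Wind gust"]


def normalize_csv_text(text, columns=EXPECTED_COLUMNS, delimiter=","):
    """Return CSV text that has exactly `columns`, in that order.

    Missing columns are filled with empty strings. Columns that are not
    listed in `columns` are dropped (an assumption of this sketch).
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    output = io.StringIO()
    writer = csv.DictWriter(
        output,
        fieldnames=columns,
        delimiter=delimiter,
        restval="",             # value written for missing columns
        extrasaction="ignore",  # skip columns that are not expected
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return output.getvalue()


# A partial file: no Radiation column, and the columns are out of order.
partial = "Wind,Temperature,Humidity,Wind gust\n12,21.5,40,18\n"
print(normalize_csv_text(partial))
```

The output has all five columns in the expected order, with an empty Radiation field in the data row.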

It always returns a dict (JSON-like) with the 'ok' and 'failed' lists of processed files. Examples:

{'failed': [],
 'ok': [
     {'export_path': 'C:\\temp\\csv_export\\business-financial-data-jun-2021-quarter.csv',
      'import_path': 'C:\\temp\\csv_import\\business-financial-data-jun-2021-quarter.csv'}
 ]
}

# Example when nothing was processed:
{'failed': [],
'ok': []}
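A caller could inspect a result of this shape roughly as follows. This is a minimal sketch: the helper names `summarize` and `exported_paths` are hypothetical, and because the structure of the entries in 'failed' is not shown above, the sketch only counts them.

```python
def summarize(result):
    """Return (ok_count, failed_count) for a result dict shaped like the examples above."""
    return len(result.get("ok", [])), len(result.get("failed", []))


def exported_paths(result):
    """Return the export path of every successfully processed file."""
    return [entry["export_path"] for entry in result.get("ok", [])]


# Sample data copied from the example output above.
result = {
    "failed": [],
    "ok": [
        {
            "export_path": "C:\\temp\\csv_export\\business-financial-data-jun-2021-quarter.csv",
            "import_path": "C:\\temp\\csv_import\\business-financial-data-jun-2021-quarter.csv",
        }
    ],
}

ok_count, failed_count = summarize(result)
print(f"{ok_count} file(s) normalized, {failed_count} failed")
for path in exported_paths(result):
    print(path)
```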

Example config:


[common]
csv_import_folder = C:/temp/csv_import
csv_export_folder = C:/temp/csv_export
csv_export_headers = 'Series_reference', 'Period', 'ELEE'
csv_delimiter = ;
csv_encoding = utf-8
# You can specify column types, such as int64 or np.float64
# Or use type object if you don't want conversion or want to avoid NaN errors
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
# example: {'column name': 'object'}
'dtype' = {}
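To check a config file before running the tool, the [common] section can be read with Python's standard `configparser`. This is a sketch under assumptions: how csv_normalizer itself parses these values is not documented here, the function name `load_common_settings` is hypothetical, and parsing the quoted header list with `ast.literal_eval` is a choice made for this example.

```python
import ast
import configparser

# Sample config text, copied from the example above (comments and dtype omitted).
SAMPLE_INI = """
[common]
csv_import_folder = C:/temp/csv_import
csv_export_folder = C:/temp/csv_export
csv_export_headers = 'Series_reference', 'Period', 'ELEE'
csv_delimiter = ;
csv_encoding = utf-8
"""


def load_common_settings(ini_text):
    """Parse the [common] section into a plain dict (illustrative only)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    common = parser["common"]

    # The header list is written as quoted, comma-separated names, which
    # ast.literal_eval reads as a tuple of strings (assumption of this sketch).
    headers = ast.literal_eval(common["csv_export_headers"])
    if isinstance(headers, str):  # a single quoted header evaluates to a str
        headers = (headers,)

    return {
        "import_folder": common["csv_import_folder"],
        "export_folder": common["csv_export_folder"],
        "headers": list(headers),
        "delimiter": common["csv_delimiter"],
        "encoding": common["csv_encoding"],
    }


settings = load_common_settings(SAMPLE_INI)
print(settings["headers"], settings["delimiter"])
```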

Usage


usage: csv_normalizer [-h] [-c [CONFIG_INI]] [--version [VERSION]] [--no_rename [NO_RENAME_OLD]] [--write_config]

optional arguments:
-h, --help            show this help message and exit
-c [CONFIG_INI], --config_ini [CONFIG_INI]
                        csv_normalizer ini configuration file
--version [VERSION]   Print version and exit
--no_rename [NO_RENAME_OLD]
                        Do not rename to .old the original file
--write_config        Write configuration with default values, useful to get a config file to modify

Example usage on Linux:

csv_normalizer -c ./csv_normalizer.ini

On Windows:

csv_normalizer.exe -c .\csv_normalizer.ini

Adding option to not rename the original files:

csv_normalizer -c .\csv_normalizer.ini --no_rename

By default, csv_normalizer renames the original files to .old, so if you run the program again it will not process the same files again.
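The effect of the rename can be sketched with `pathlib`. This is not the package's code: the helper names `pending_csv_files` and `mark_as_processed` are hypothetical, and the sketch assumes that ".old" is appended to the full file name (for example data.csv becomes data.csv.old), which the text above does not specify.

```python
from pathlib import Path


def pending_csv_files(folder):
    """Return the .csv files in `folder` that have not been renamed yet."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


def mark_as_processed(path):
    """Rename a processed file by appending '.old' (assumed naming scheme)."""
    path = Path(path)
    target = path.with_name(path.name + ".old")
    path.rename(target)
    return target
```

Because a renamed file no longer ends in .csv, a second call to `pending_csv_files` skips it, which mirrors why a second run does not process the same files again.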


Install


pip install --user csv_normalizer

# or for root account

pip install csv_normalizer

Author


Author: Pablo Estigarribia

Project site: https://github.com/CoffeeITWorks/csv_normalizer
