Tool for quick random and systematic changes (jumbling) to .csv files.

Project description

Tool for jumbling CSV data files.
Currently the CSV file is expected to use comma separation and full-stop
decimals.

Features:
See the program help message below.

Author:
Rasmus Klitgaard
Feel free to use the program. It is licensed under GPL-v3.0, so
please keep any use within the terms of the licence.
If you use the program for professional work or scientific publications,
please provide a reference to the GitLab page.

The source code is available at https://gitlab.com/RasmusKlitgaard/jumblecsv.



usage: jumblecsv [-h] [-p JUMBLING_PERCENT] [-c CATEGORICAL_SWITCH_PROBABILITY]
                 [-d [DROP_COLUMNS ...]] [-l [CATEGORICAL_COLUMNS ...]]
                 [-o OUTPUT_FILE] [--not-all-categorical-parameters-present]
                 [--block-negative] [-n NUMBER_OF_HEADER_ROWS]
                 [--significant-figures SIGNIFICANT_FIGURES]
                 [--float-formatting FLOAT_FORMATTING]
                 [--int-formatting INT_FORMATTING]
                 csv_path

Tool for jumbling data, removing data and reformatting data in CSV format.

positional arguments:
  csv_path              Path to the .csv file. Note the file has to be comma
                        separated with full-stop decimals, not semicolon
                        separated with comma decimals.

options:
  -h, --help            show this help message and exit
  -p JUMBLING_PERCENT, --jumbling-percent JUMBLING_PERCENT
                        Percentage by which to jumble non-categorical values,
                        in %
  -c CATEGORICAL_SWITCH_PROBABILITY, --categorical-switch-probability CATEGORICAL_SWITCH_PROBABILITY
                        Probability of changing a categorical parameter, in %
  -d [DROP_COLUMNS ...], --drop-columns [DROP_COLUMNS ...]
                        List of column indices to drop in the new table
  -l [CATEGORICAL_COLUMNS ...], --categorical-columns [CATEGORICAL_COLUMNS ...]
                        List of column indices containing a categorical
                        parameter
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        Write the resulting CSV file to this path
  --not-all-categorical-parameters-present
                        Set this if not all possible values of the categorical
                        parameters are present in the data. If set, the data
                        will be interpolated, so the outermost values are
                        assumed to be present.
  --block-negative      Caps values at a minimum of 0
  -n NUMBER_OF_HEADER_ROWS, --number-of-header-rows NUMBER_OF_HEADER_ROWS
                        Number of header rows in the input CSV file
  --significant-figures SIGNIFICANT_FIGURES
                        Number of significant figures to use when printing
                        floats. If neither '--significant-figures' nor
                        '--float-formatting' is set, the values will be
                        centred and padded to the width of the column header
  --float-formatting FLOAT_FORMATTING
                        Float formatting to use when printing. E.g. '4.2f',
                        '^8.2f'. Whatever is accepted by your python
                        interpreter 'print' function should work.
  --int-formatting INT_FORMATTING
                        Integer formatting to use when printing. E.g. '4d',
                        '^8d'. Whatever is accepted by your python interpreter
                        'print' function should work. If not set, the values
                        will be centred and padded to the width of the column
                        header
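
The formatting examples in the help message ('4.2f', '^8.2f', '4d', '^8d') look like strings from Python's format specification mini-language, the same syntax used after the colon in an f-string. The sketch below shows what those specifications produce for a sample value, which can help when choosing one. The helper names `format_cell` and `centre_to_header` are hypothetical and only illustrate the described behaviour; how jumblecsv applies the specifications internally is not shown on this page, and the centring fallback in particular is an assumption based on the wording of the help text.

```python
def format_cell(value, spec):
    """Apply a Python format specification such as '4.2f' or '^8d' to one value."""
    return format(value, spec)


def centre_to_header(value, header):
    """Centre the text of a value in a field as wide as the column header.

    Assumed approximation of the default described in the help message.
    """
    return format(str(value), f"^{len(header)}")


# Float specifications from the help message.
for spec in ("4.2f", "^8.2f"):
    print(spec, repr(format_cell(3.14159, spec)))

# Integer specifications from the help message.
for spec in ("4d", "^8d"):
    print(spec, repr(format_cell(42, spec)))

# Fallback when no formatting option is given: centred to the header width.
print(repr(centre_to_header(3.5, "temperature")))
```

A leading `^` centres the value in the given width, the number before the full stop is the minimum field width, and the number after it is the count of decimals.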

Download files

Source Distribution

jumblecsv-0.1.5.tar.gz (5.0 kB)

Uploaded Source

Built Distribution

jumblecsv-0.1.5-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file jumblecsv-0.1.5.tar.gz.

File metadata

  • Download URL: jumblecsv-0.1.5.tar.gz
  • Size: 5.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for jumblecsv-0.1.5.tar.gz
Algorithm Hash digest
SHA256 0414ebff039f66105ca13f07d73b553a6b62a6ba0a0f1492d5f35417001cc52b
MD5 06dc4e71fe8ad6d08955b20ecfa541d8
BLAKE2b-256 18422eee8c34ee10b03184db2c9515221c8be04bca51462d01c3f9f0b7c21e3f

File details

Details for the file jumblecsv-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: jumblecsv-0.1.5-py3-none-any.whl
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for jumblecsv-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 cc08ca717a185c7256f5becf64aedff038e790619d901f47306a9ef52c0a4148
MD5 5bcf28164af815c79935d51b652182c3
BLAKE2b-256 cada0d86dfee925f42e107a51fda46ce73620674b1587082d1a5ef2c9524781f
