csv-cleaner

A Python utility for cleaning and normalizing CSV files. It is designed to prepare messy raw data (especially OutSystems exports) for analysis or database import.

Features

  • Removes OutSystems N'...' prefix patterns (for example, N'Jane' becomes Jane)
  • Strips stray quotes and trailing semicolons
  • Handles malformed rows with trailing delimiters
  • Pads incomplete rows to maintain column alignment
  • Supports batch processing of multiple CSV files
  • Configurable input/output delimiters
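The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in clean_csvs.py: the names clean_field and clean_line and the exact rules are assumptions inferred from the feature list and the example in this README. The naive split also assumes the delimiter never appears inside a quoted field.

```python
import re

# Hypothetical helpers; names and rules are assumptions for illustration,
# not the functions defined in clean_csvs.py.
N_PREFIX = re.compile(r"^N'(.*)'$")


def clean_field(field: str) -> str:
    """Clean one raw field: trim it, unwrap N'...', drop stray quotes."""
    field = field.strip()
    match = N_PREFIX.match(field)
    if match:
        field = match.group(1)
    # Remove any remaining single or double quotes, then trim again.
    return field.replace('"', "").replace("'", "").strip()


def clean_line(line: str, width: int, delimiter: str = ";") -> list[str]:
    """Split a raw line, drop trailing delimiters, and pad to `width` columns."""
    # Trailing delimiters (one or several) are removed before splitting.
    fields = line.rstrip("\r\n").rstrip(delimiter).split(delimiter)
    fields = [clean_field(f) for f in fields]
    # Pad short rows with empty strings so columns stay aligned.
    fields += [""] * (width - len(fields))
    return fields
```

Calling clean_line on each row with the header's column count as width, then joining the result with the output delimiter, gives one cleaned output line per input line.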

Installation

git clone https://github.com/ShawnaRStaff/csv-cleaner.git
cd csv-cleaner
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt

Usage

# Basic usage
python clean_csvs.py -i ./input_folder -o ./output_folder

# With custom delimiters
python clean_csvs.py -i ./input -o ./output --input-delimiter ";" --output-delimiter ","

# Verbose output
python clean_csvs.py -i ./input -o ./output -v

# Quiet mode (errors only)
python clean_csvs.py -i ./input -o ./output -q
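
For readers curious how a folder-to-folder run like the commands above might work, here is a rough sketch of batch processing with pathlib. The function process_folder and its transform parameter are hypothetical and only illustrate the pattern; the real script may be organized differently.

```python
from pathlib import Path
from typing import Callable


def process_folder(
    input_dir: str,
    output_dir: str,
    transform: Callable[[str], str],
    encoding: str = "utf-8",
) -> list[Path]:
    """Apply `transform` to every line of each *.csv file in `input_dir`.

    Hypothetical helper: each result is written to `output_dir` under the
    same file name, and the list of written paths is returned.
    """
    out_path = Path(output_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(input_dir).glob("*.csv")):
        dst = out_path / src.name
        # newline="" keeps the original line endings untouched.
        with src.open(encoding=encoding, newline="") as fin, \
                dst.open("w", encoding=encoding, newline="") as fout:
            for line in fin:
                fout.write(transform(line))
        written.append(dst)
    return written
```

Passing the per-line cleaning logic in as transform keeps the file handling separate from the cleaning rules.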

Options

Option               Description                          Default
-i, --input          Input folder containing CSV files    Required
-o, --output         Output folder for cleaned files      Required
--input-delimiter    Delimiter in input files             ;
--output-delimiter   Delimiter in output files            ,
-e, --encoding       File encoding                        utf-8
-v, --verbose        Enable verbose output                Off
-q, --quiet          Suppress all output except errors    Off
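
As an illustration, the options above map naturally onto an argparse parser. The sketch below mirrors the table's flags and defaults; build_parser is a hypothetical name, and the actual parser in clean_csvs.py may be set up differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options table (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Clean and normalize CSV files.")
    parser.add_argument("-i", "--input", required=True,
                        help="Input folder containing CSV files")
    parser.add_argument("-o", "--output", required=True,
                        help="Output folder for cleaned files")
    parser.add_argument("--input-delimiter", default=";",
                        help="Delimiter in input files")
    parser.add_argument("--output-delimiter", default=",",
                        help="Delimiter in output files")
    parser.add_argument("-e", "--encoding", default="utf-8",
                        help="File encoding")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose output")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="Suppress all output except errors")
    return parser
```

With this sketch, build_parser().parse_args(["-i", "./input", "-o", "./output"]) yields the documented defaults for every other option.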

Example

Input (data.csv):

"Name";"Age";"City";
"John";30;"New York";
N'Jane';25;  Los Angeles  ;
"Bob's Store";40;"Chicago";;

Output (data.csv):

Name,Age,City
John,30,New York
Jane,25,Los Angeles
Bobs Store,40,Chicago
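
Once cleaned, output like the above can be read with Python's standard csv module. The short sketch below embeds the example output as a string so it runs on its own; with a real file you would open the cleaned data.csv instead.

```python
import csv
import io

# The cleaned example output from this README, embedded for illustration.
cleaned = (
    "Name,Age,City\n"
    "John,30,New York\n"
    "Jane,25,Los Angeles\n"
    "Bobs Store,40,Chicago\n"
)

# DictReader uses the header row as keys for each record.
rows = list(csv.DictReader(io.StringIO(cleaned)))

# Values are read as strings, so numeric columns need explicit conversion.
ages = [int(row["Age"]) for row in rows]
```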

Running Tests

pip install pytest
pytest tests/ -v

License

MIT
