A utility for cleaning and normalizing CSV files
csv-cleaner
A Python utility for cleaning and normalizing CSV files. Designed for preparing messy raw data (especially OutSystems exports) for analysis or database imports.
Features
- Removes OutSystems N'...' prefix patterns (for example, N'Jane' becomes Jane)
- Strips stray quotes and trailing semicolons
- Handles malformed rows with trailing delimiters
- Pads incomplete rows to maintain column alignment
- Supports batch processing of multiple CSV files
- Configurable input/output delimiters
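The field-level cleanup listed above can be sketched roughly as follows. This is a minimal illustration, not the code in clean_csvs.py: the function name `clean_field` is hypothetical, and the rule that every remaining quote character is dropped is an assumption inferred from the example output further down (where Bob's Store becomes Bobs Store).

```python
import re


def clean_field(value: str) -> str:
    """Normalize one raw field (hypothetical helper, not the tool's API)."""
    # Trim surrounding whitespace and any trailing semicolons.
    value = value.strip().rstrip(";").strip()

    # Unwrap the OutSystems N'...' pattern, keeping only the inner text.
    match = re.fullmatch(r"N'(.*)'", value, flags=re.DOTALL)
    if match:
        value = match.group(1)

    # Assumption: all remaining single and double quotes count as stray.
    value = value.replace('"', "").replace("'", "")
    return value.strip()


if __name__ == "__main__":
    for raw in ["N'Jane'", '"Chicago";', " Los Angeles "]:
        print(repr(raw), "->", repr(clean_field(raw)))
```

Unwrapping the N' prefix before removing quotes matters: doing it the other way round would leave a stray leading N on the value.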
Installation
git clone https://github.com/ShawnaRStaff/csv-cleaner.git
cd csv-cleaner
python -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate
pip install -r requirements.txt
Usage
# Basic usage
python clean_csvs.py -i ./input_folder -o ./output_folder
# With custom delimiters
python clean_csvs.py -i ./input -o ./output --input-delimiter ";" --output-delimiter ","
# Verbose output
python clean_csvs.py -i ./input -o ./output -v
# Quiet mode (errors only)
python clean_csvs.py -i ./input -o ./output -q
Options
| Option | Description | Default |
|---|---|---|
| -i, --input | Input folder containing CSV files | Required |
| -o, --output | Output folder for cleaned files | Required |
| --input-delimiter | Delimiter in input files | ; |
| --output-delimiter | Delimiter in output files | , |
| -e, --encoding | File encoding | utf-8 |
| -v, --verbose | Enable verbose output | Off |
| -q, --quiet | Suppress all output except errors | Off |
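For readers who want to see how these options map onto Python's standard `argparse` module, here is a minimal sketch. The flag names and defaults come from the table above; the function name `build_parser`, the help strings, and the choice to make -v and -q mutually exclusive are assumptions and may not match the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options table (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="clean_csvs.py",
        description="Clean and normalize CSV files.",
    )
    parser.add_argument("-i", "--input", required=True,
                        help="Input folder containing CSV files")
    parser.add_argument("-o", "--output", required=True,
                        help="Output folder for cleaned files")
    parser.add_argument("--input-delimiter", default=";",
                        help="Delimiter in input files")
    parser.add_argument("--output-delimiter", default=",",
                        help="Delimiter in output files")
    parser.add_argument("-e", "--encoding", default="utf-8",
                        help="File encoding")

    # Assumption: verbose and quiet cannot be combined.
    noise = parser.add_mutually_exclusive_group()
    noise.add_argument("-v", "--verbose", action="store_true",
                       help="Enable verbose output")
    noise.add_argument("-q", "--quiet", action="store_true",
                       help="Suppress all output except errors")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-i", "./input", "-o", "./output"])
    print(args)
```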
Example
Input (data.csv):
"Name";"Age";"City";
"John";30;"New York";
N'Jane';25; Los Angeles ;
"Bob's Store";40;"Chicago";;
Output (data.csv):
Name,Age,City
John,30,New York
Jane,25,Los Angeles
Bobs Store,40,Chicago
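The transformation in this example can be approximated with a short sketch. It is not the tool's implementation: `clean_text` and `_clean` are hypothetical names, the header row is assumed to define the column count, and the naive `split` means a delimiter inside a quoted field would be treated as a separator.

```python
import csv
import io


def _clean(value: str) -> str:
    """Trim a field, unwrap N'...', and drop remaining quote characters."""
    value = value.strip()
    if len(value) >= 3 and value.startswith("N'") and value.endswith("'"):
        value = value[2:-1]
    return value.replace('"', "").replace("'", "").strip()


def clean_text(raw: str, in_delim: str = ";", out_delim: str = ",") -> str:
    """Clean CSV text held in memory and return the normalized text."""
    rows = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Drop trailing delimiters ("a;b;;" -> "a;b"), then split naively.
        fields = line.strip().rstrip(in_delim).split(in_delim)
        rows.append([_clean(field) for field in fields])

    # Assumption: the first row is the header and sets the expected width.
    width = len(rows[0]) if rows else 0

    out = io.StringIO()
    writer = csv.writer(out, delimiter=out_delim, lineterminator="\n")
    for row in rows:
        # Pad short rows with empty fields so columns stay aligned.
        writer.writerow(row + [""] * (width - len(row)))
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        '"Name";"Age";"City";\n'
        '"John";30;"New York";\n'
        "N'Jane';25; Los Angeles ;\n"
        '"Bob\'s Store";40;"Chicago";;\n'
    )
    print(clean_text(sample), end="")
```

Run on the input shown above, this sketch prints the same four lines as the example output. A row with fewer fields than the header, such as `1;2` under a three-column header, comes out as `1,2,` with an empty third column.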
Running Tests
pip install pytest
pytest tests/ -v
License
MIT
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
clean_csv_tool-1.0.0.tar.gz (7.7 kB)
Built Distribution
clean_csv_tool-1.0.0-py3-none-any.whl (5.7 kB)
File details
Details for the file clean_csv_tool-1.0.0.tar.gz.
File metadata
- Download URL: clean_csv_tool-1.0.0.tar.gz
- Upload date:
- Size: 7.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0eeb5d9fd60181b175d257b65984d1d6c8faf9801d33bb5b731c0dbf4a8c491e |
| MD5 | 942f3a604b11c1cf2428f457f64bb0e5 |
| BLAKE2b-256 | d598fa111011154880cb4e0b94fbaec8f92afd3958f9fb906b103e4350c9ee39 |
File details
Details for the file clean_csv_tool-1.0.0-py3-none-any.whl.
File metadata
- Download URL: clean_csv_tool-1.0.0-py3-none-any.whl
- Upload date:
- Size: 5.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 623f72327d0e60faa5a7676f03ab0f8f4d8c067afa5dbef8010c40899b8a5ba9 |
| MD5 | fc4e823b78b5964c27b1f5acde577b4a |
| BLAKE2b-256 | 212a2d789e2f162e6d4397875aea74180825e836e66f9dbb61798d5febbc71a0 |
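To check a downloaded file against the SHA256 digests listed above, something like the following sketch can be used. It relies only on the standard library; the helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to sit in the current working directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with the published one (case-insensitive)."""
    return sha256_of(Path(path)) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local file name; the digest is the sdist SHA256 listed above.
    name = "clean_csv_tool-1.0.0.tar.gz"
    published = "0eeb5d9fd60181b175d257b65984d1d6c8faf9801d33bb5b731c0dbf4a8c491e"
    if Path(name).exists():
        print("match" if verify(name, published) else "MISMATCH")
    else:
        print(f"{name} not found in the current directory")
```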