tha-csv-runner
A small Python library that runs a function against every row of a CSV — with a progress bar, required header validation, and structured error capture per row.
Install
pip install tha-csv-runner
Quick start
from tha_csv_runner import Runner

def process(row: dict) -> None:
    """Raise any exception to mark the row as an error. Return value is ignored."""
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")

runner = Runner(
    input_path="data.csv",
    required_headers=["name", "email"],
    processor=process,
)
runner.run()
runner.write("output.csv")
How it works
- Opens the CSV and validates that all required_headers are present; raises immediately if any are missing
- Iterates every row with a tqdm progress bar
- Calls your processor(row) function; if it raises, that row is marked as an error and processing continues
- Appends three columns to every row: row_number, row_status, and message
  - On success: row_status and message are blank
  - On error: row_status = "error", message = str(exception)
- write() writes all rows (success and error) to a CSV
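The steps above can be sketched with only the standard library. This is a minimal illustration of the described behavior, not the library's actual implementation: the name run_rows is hypothetical, the progress bar is omitted, a plain ValueError stands in for the library's ConfigError, and numbering rows from 1 is an assumption.

```python
import csv
import io
from typing import Callable, Iterable


def run_rows(lines: Iterable[str], required_headers: list[str],
             processor: Callable[[dict], None]) -> list[dict]:
    """Hypothetical sketch: validate headers, then process each row."""
    reader = csv.DictReader(lines)
    missing = [h for h in required_headers if h not in (reader.fieldnames or [])]
    if missing:
        # The README says the library raises ConfigError; ValueError stands in here.
        raise ValueError(f"missing required headers: {missing}")

    results = []
    for number, row in enumerate(reader, start=1):  # 1-based numbering is an assumption
        status, message = "", ""
        try:
            processor(row)
        except Exception as exc:  # any exception marks the row as an error
            status, message = "error", str(exc)
        # Append the three bookkeeping columns to a copy of the row.
        results.append({**row, "row_number": number,
                        "row_status": status, "message": message})
    return results


def check_email(row: dict) -> None:
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")


# In-memory CSV with one valid and one invalid row (made-up data).
sample_csv = io.StringIO("name,email\nAda,ada@example.com\nBob,bob@other.org\n")
rows = run_rows(sample_csv, ["name", "email"], check_email)
```

Catching the exception inside the loop is what lets one bad row be recorded without stopping the rest of the file.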
API
Runner
Runner(
    input_path="data.csv",          # path to input CSV
    required_headers=["a", "b"],    # columns that must exist; raises ConfigError if missing
    processor=my_func,              # optional: callable(row: dict) -> None
    sample=100,                     # optional: process only the first N rows
)
runner.run()
Reads and processes all rows. Results are stored in runner.rows as a list of dicts.
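Because the results are plain dicts, they can be inspected before anything is written. The sketch below assumes each dict carries the row_status and message keys described under How it works; error_summary is a hypothetical helper, and the rows literal is made-up data standing in for runner.rows.

```python
from collections import Counter


def error_summary(rows: list[dict]) -> Counter:
    """Count error messages among rows whose row_status is "error"."""
    return Counter(r["message"] for r in rows if r.get("row_status") == "error")


# Made-up stand-in for runner.rows after runner.run()
rows = [
    {"name": "Ada", "row_number": 1, "row_status": "", "message": ""},
    {"name": "Bob", "row_number": 2, "row_status": "error", "message": "invalid email domain"},
    {"name": "Cy", "row_number": 3, "row_status": "error", "message": "invalid email domain"},
]
summary = error_summary(rows)
```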
runner.write()
runner.write(
    output_path="output.csv",         # optional: auto-named input_processed_TIMESTAMP.csv if omitted
    sort_by="name",                   # optional: column name, or list of column names
    ascending=True,                   # optional: bool or list of bools matching sort_by
    column_order=["name", "email"],   # optional: listed columns come first, rest follow
    keep=["name", "email"],           # optional: keep only these columns (mutually exclusive with drop)
    drop=["row_number"],              # optional: remove these columns (mutually exclusive with keep)
)
Returns the Path that was written.
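To make the column options concrete, here is one way the keep, drop, and column_order semantics described above could be expressed. It is a sketch under assumptions, not the library's code: select_columns is a hypothetical function, filtering is assumed to happen before reordering, and names that do not exist in the data are assumed to be ignored silently.

```python
def select_columns(columns, column_order=None, keep=None, drop=None):
    """Hypothetical sketch of the column selection rules for write()."""
    if keep is not None and drop is not None:
        raise ValueError("keep and drop are mutually exclusive")
    if keep is not None:
        columns = [c for c in columns if c in keep]
    elif drop is not None:
        columns = [c for c in columns if c not in drop]
    if column_order:
        # Listed columns come first, the rest follow in their original order.
        first = [c for c in column_order if c in columns]
        rest = [c for c in columns if c not in first]
        columns = first + rest
    return columns


all_columns = ["name", "email", "row_number", "row_status", "message"]
ordered = select_columns(all_columns, column_order=["email"], drop=["row_number"])
```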
CLI
tha-csv-runner run \
  --input data.csv \
  --processor my_module:process_row \
  --header name \
  --header email \
  --sample 100
--processor uses the module:function convention. --header is repeatable. All flags are optional except --input.
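The module:function convention is commonly resolved with importlib. The following is a minimal sketch of that general technique, not the CLI's actual code; load_callable is a hypothetical name, and os.path:join is used only because it is always importable.

```python
import importlib
import os.path
from typing import Callable


def load_callable(spec: str) -> Callable:
    """Resolve a "module:function" string to the callable it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, attr)
    if not callable(func):
        raise TypeError(f"{spec!r} does not name a callable")
    return func


loaded = load_callable("os.path:join")
```

With this approach, my_module:process_row would import my_module and look up process_row on it, so the module has to be importable from the directory where the command runs.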
License
MIT