
Run a function over every row of a CSV — with progress, header validation, and structured per-row errors.


tha-csv-runner


A small Python library that runs a function against every row of a CSV — with a progress bar, required header validation, and structured error capture per row.

Install

pip install tha-csv-runner

Quick start

from tha_csv_runner import Runner

def process(row: dict) -> None:
    """Raise any exception to mark the row as an error. Return value is ignored."""
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")

runner = Runner(
    input_path="data.csv",
    required_headers=["name", "email"],
    processor=process,
)
runner.run()
runner.write("output.csv")

How it works

  1. Opens the CSV and validates that all required_headers are present; raises ConfigError immediately if any are missing
  2. Iterates every row with a tqdm progress bar
  3. Calls your processor(row) function — if it raises, that row is marked as an error and processing continues
  4. Appends three columns to every row: row_number, row_status, and message
    • On success: row_status and message are blank
    • On error: row_status = "error", message = str(exception)
  5. write() writes all rows (success and error) to a CSV
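The steps above can be sketched with only the standard library. This is an illustration of the described behaviour, not the library's actual implementation: `run_rows` and `check_email` are hypothetical names, `ValueError` stands in for the library's own error type, numbering rows from 1 is an assumption, and the tqdm progress bar is left out to keep the sketch dependency-free.

```python
import csv
import io


def run_rows(csv_text, required_headers, processor):
    """Hypothetical stand-in for the steps above; not the library's code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []

    # Step 1: fail fast if any required header is missing.
    missing = [h for h in required_headers if h not in headers]
    if missing:
        raise ValueError(f"missing required headers: {missing}")

    rows = []
    # Steps 2-4: visit every row, call the processor, record the outcome.
    # Numbering from 1 is an assumption made for this sketch.
    for number, row in enumerate(reader, start=1):
        status, message = "", ""
        try:
            processor(row)
        except Exception as exc:  # any exception marks the row as an error
            status, message = "error", str(exc)
        rows.append(
            {**row, "row_number": number, "row_status": status, "message": message}
        )
    return rows


def check_email(row):
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")


sample_csv = "name,email\nAda,ada@example.com\nBob,bob@elsewhere.org\n"
results = run_rows(sample_csv, ["name", "email"], check_email)
```

Catching `Exception` rather than a narrower type mirrors the rule that any exception from the processor marks the row as failed while the loop carries on.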

API

Runner

Runner(
    input_path="data.csv",       # path to input CSV
    required_headers=["a", "b"], # columns that must exist — raises ConfigError if missing
    processor=my_func,           # optional: callable(row: dict) -> None
    sample=100,                  # optional: process only the first N rows
)
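
One way to picture `sample` is as a cap on how many rows get read. The sketch below uses `itertools.islice` for that; whether the library does the same internally is an assumption, and `first_n_rows` is a hypothetical helper.

```python
import csv
import io
from itertools import islice


def first_n_rows(csv_text, sample=None):
    """Hypothetical helper: return at most `sample` rows (all rows if None)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # islice(reader, None) yields everything, so None means "no limit".
    return list(islice(reader, sample))


data = "id\n1\n2\n3\n4\n"
limited = first_n_rows(data, sample=2)
everything = first_n_rows(data)
```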

runner.run()

Reads and processes all rows. Results are stored in runner.rows as a list of dicts.
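
Because the results are plain dicts, failures can be pulled out with ordinary list operations. The rows below are hand-written stand-ins shaped like the columns described under How it works, not real library output, and `failed_rows` is a hypothetical helper.

```python
def failed_rows(rows):
    """Return only the rows whose row_status is "error"."""
    return [row for row in rows if row.get("row_status") == "error"]


# Hand-written stand-in for runner.rows after run().
rows = [
    {"name": "Ada", "email": "ada@example.com",
     "row_number": 1, "row_status": "", "message": ""},
    {"name": "Bob", "email": "bob@elsewhere.org",
     "row_number": 2, "row_status": "error", "message": "invalid email domain"},
]

failures = failed_rows(rows)
for row in failures:
    print(f'row {row["row_number"]}: {row["message"]}')
```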

runner.write()

runner.write(
    output_path="output.csv",          # optional: auto-named input_processed_TIMESTAMP.csv if omitted
    sort_by="name",                    # optional: column name, or list of column names
    ascending=True,                    # optional: bool or list of bools matching sort_by
    column_order=["name", "email"],    # optional: listed columns come first, rest follow
    keep=["name", "email"],            # optional: keep only these columns (mutually exclusive with drop)
    # drop=["row_number"],             # alternative to keep: remove these columns (never pass both)
)

Returns the Path that was written.
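
To make the interaction of these options concrete, here is a rough sketch of how they could be applied to a list of dicts. `shape_rows` is a hypothetical function, not part of the library, and the order in which the library applies sorting, filtering, and reordering is an assumption. Values read from a CSV are strings, so a sort like this one compares them lexicographically.

```python
def shape_rows(rows, sort_by=None, ascending=True,
               column_order=None, keep=None, drop=None):
    """Hypothetical sketch of the write() options applied to a list of dicts."""
    if keep is not None and drop is not None:
        raise ValueError("keep and drop are mutually exclusive")

    if sort_by is not None:
        keys = [sort_by] if isinstance(sort_by, str) else list(sort_by)
        flags = ([ascending] * len(keys) if isinstance(ascending, bool)
                 else list(ascending))
        # Sort by the least significant key first; sorted() is stable,
        # so earlier keys in sort_by end up taking priority.
        for key, asc in reversed(list(zip(keys, flags))):
            rows = sorted(rows, key=lambda r, k=key: r[k], reverse=not asc)

    columns = list(rows[0].keys()) if rows else []
    if keep is not None:
        columns = [c for c in columns if c in keep]
    if drop is not None:
        columns = [c for c in columns if c not in drop]
    if column_order is not None:
        # Listed columns come first, the rest keep their original order.
        first = [c for c in column_order if c in columns]
        columns = first + [c for c in columns if c not in first]

    return [{c: row[c] for c in columns} for row in rows]


rows = [
    {"row_number": 1, "name": "Bob", "email": "bob@example.com"},
    {"row_number": 2, "name": "Ada", "email": "ada@example.com"},
]
shaped = shape_rows(rows, sort_by="name", column_order=["email"], drop=["row_number"])
```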

CLI

tha-csv-runner run \
    --input data.csv \
    --processor my_module:process_row \
    --header name \
    --header email \
    --sample 100

--processor uses the module:function convention. --header is repeatable. All flags are optional except --input.
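
The module:function convention can be resolved with `importlib`, roughly as follows. `resolve_processor` is a hypothetical helper written for illustration; how the CLI actually loads the processor is not shown in this README.

```python
import importlib


def resolve_processor(spec):
    """Hypothetical helper: turn "module:function" into a callable."""
    module_name, sep, func_name = spec.partition(":")
    if not sep or not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    func = getattr(module, func_name)
    if not callable(func):
        raise TypeError(f"{spec!r} is not callable")
    return func


# Demonstrated with a standard-library function so the sketch runs anywhere.
dumps = resolve_processor("json:dumps")
```

For the command above, the equivalent call would be `resolve_processor("my_module:process_row")`, which requires `my_module` to be importable from the current working directory or `PYTHONPATH`.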

License

MIT


Download files


Source Distribution

tha_csv_runner-0.1.0.tar.gz (33.3 kB view details)

Uploaded Source

Built Distribution


tha_csv_runner-0.1.0-py3-none-any.whl (6.9 kB view details)

Uploaded Python 3

File details

Details for the file tha_csv_runner-0.1.0.tar.gz.

File metadata

  • Download URL: tha_csv_runner-0.1.0.tar.gz
  • Upload date:
  • Size: 33.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tha_csv_runner-0.1.0.tar.gz
Algorithm Hash digest
SHA256 86becbbe0bc0449c3288d24ea0d3de4fd4ba7029e19037972fa36d6487691095
MD5 eb535986ee29e4472adf78d63cee48d8
BLAKE2b-256 74094cd0df2add34ac7d8df675cd980a8485021bd21ee23f970b4e6569995cf5


Provenance

The following attestation bundles were made for tha_csv_runner-0.1.0.tar.gz:

Publisher: publish.yml on tha-guy-nate/tha-csv-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tha_csv_runner-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: tha_csv_runner-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tha_csv_runner-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 1c715a594875dbb138e93c205c8359cafca0035e9a77dab1617513d090903378
MD5 7b4c2a854cc76b02d85807a496be82b9
BLAKE2b-256 5f6467d86d82f8bbb238123c91f9b08987ddf6a0d4cd4a8dfce15bfdcbbb2dab


Provenance

The following attestation bundles were made for tha_csv_runner-0.1.0-py3-none-any.whl:

Publisher: publish.yml on tha-guy-nate/tha-csv-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
