Run a function over every row of a CSV — with progress, header validation, and structured per-row errors.

tha-csv-runner

A small Python library that runs a function against every row of a CSV — with a progress bar, required header validation, and structured error capture per row.

Install

pip install tha-csv-runner

Quick start

from tha_csv_runner import ThaCSV

def process(row: dict) -> None:
    """Raise any exception to mark the row as an error. Return value is ignored."""
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")

runner = ThaCSV()

runner.read("Step 1 of 1", "data.csv", ["name", "email"], process)
runner.write("Step 1 of 1", "output.csv")

How it works

  1. Opens the CSV and validates that all required_headers are present, raising ConfigError immediately if any are missing
  2. Iterates every row with a tqdm progress bar labelled with desc
  3. Calls your processor(row) function — if it raises, that row is marked as an error and processing continues
  4. Appends three columns to every row: row number, row status, and message
    • On success: row status and message are blank
    • On error: row status = "error", message = str(exception)
  5. write() writes all rows (success and error) to a CSV
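
The steps above can be approximated with the standard library alone. The following is a minimal sketch, not the library's actual implementation: `run_rows` and `MissingHeadersError` are hypothetical names (the latter standing in for ConfigError), the tqdm progress bar is omitted, and numbering rows from 1 is an assumption.

```python
import csv
import io


class MissingHeadersError(Exception):
    """Hypothetical stand-in for the library's ConfigError."""


def run_rows(lines, required_headers, processor=None):
    """Validate headers, run processor on each row, and capture per-row errors."""
    reader = csv.DictReader(lines)
    present = reader.fieldnames or []
    missing = [h for h in required_headers if h not in present]
    if missing:
        # Fail before any row is processed.
        raise MissingHeadersError(f"missing required headers: {missing}")

    results = []
    # Assumption: row numbers start at 1 for the first data row.
    for number, row in enumerate(reader, start=1):
        status, message = "", ""
        if processor is not None:
            try:
                processor(row)
            except Exception as exc:
                # A raising processor marks the row as an error; the loop continues.
                status, message = "error", str(exc)
        results.append(
            {**row, "row number": number, "row status": status, "message": message}
        )
    return results


def process(row):
    if not row["email"].endswith("@example.com"):
        raise ValueError("invalid email domain")


sample = io.StringIO("name,email\nAda,ada@example.com\nBob,bob@elsewhere.org\n")
rows = run_rows(sample, ["name", "email"], process)
```

Here the first row ends with a blank status and message, while the second carries "error" and the exception text, which matches the success and error cases listed in step 4.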

API

ThaCSV

ThaCSV()

runner.read()

runner.read(
    "Step 2 of 10",          # progress bar label — pass None to use the filename
    "data.csv",              # path to input CSV
    ["a", "b"],              # columns that must exist — raises ConfigError if missing
    processor=my_func,       # optional: callable(row: dict) -> None
    sample=100,              # optional: process only the first N rows
    enrich=True,             # optional: set False to skip row number/status/message columns
)

Reads and processes all rows, or only the first N when sample is set. Results are stored in runner.rows as a list of dicts.

When enrich=False, processor exceptions are re-raised instead of captured.
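
Because the results are plain dicts, they can be inspected with ordinary Python before writing. The sketch below assumes rows shaped like the enriched output described earlier (with "row status" and "message" columns); `summarize` and `example_rows` are hypothetical and stand in for a real runner.rows.

```python
from collections import Counter


def summarize(rows):
    """Count failed rows and group them by error message."""
    failed = [r for r in rows if r.get("row status") == "error"]
    reasons = Counter(r.get("message", "") for r in failed)
    return {"total": len(rows), "failed": len(failed), "reasons": dict(reasons)}


# Hand-written data in the assumed shape of runner.rows after an enriched read.
example_rows = [
    {"name": "Ada", "email": "ada@example.com",
     "row number": 1, "row status": "", "message": ""},
    {"name": "Bob", "email": "bob@elsewhere.org",
     "row number": 2, "row status": "error", "message": "invalid email domain"},
    {"name": "Cy", "email": "cy@elsewhere.org",
     "row number": 3, "row status": "error", "message": "invalid email domain"},
]

summary = summarize(example_rows)
```

With a real run, passing runner.rows in place of `example_rows` should work the same way, provided enrich was left at its default.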

runner.write()

runner.write(
    "Step 10 of 10",                   # progress bar label — pass None to use the output filename
    output_path="output.csv",          # optional — auto-named input_processed_TIMESTAMP.csv if omitted
    sort_by="name",                    # optional — column name, or list of column names
    ascending=True,                    # optional — bool or list of bools matching sort_by
    column_order=["name", "email"],    # optional — listed columns come first, rest follow
    keep=["name", "email"],            # optional — keep only these columns (mutually exclusive with drop)
    drop=["row number"],               # optional — remove these columns (mutually exclusive with keep)
)

Returns the Path that was written.
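
To make the shaping options concrete, here is one way their documented semantics could be expressed over a list of dicts. This is an illustrative sketch only: `shape_rows` is a hypothetical helper, and the order in which sorting, filtering, and reordering are applied is an assumption rather than something the library documents.

```python
def shape_rows(rows, sort_by=None, ascending=True,
               column_order=None, keep=None, drop=None):
    """Sort rows, then filter and reorder their columns."""
    if keep is not None and drop is not None:
        raise ValueError("keep and drop are mutually exclusive")

    shaped = [dict(r) for r in rows]

    if sort_by is not None:
        keys = [sort_by] if isinstance(sort_by, str) else list(sort_by)
        flags = [ascending] * len(keys) if isinstance(ascending, bool) else list(ascending)
        if len(flags) != len(keys):
            raise ValueError("ascending must match sort_by in length")
        # Stable sorts applied from the last key to the first give a multi-key sort.
        for key, asc in reversed(list(zip(keys, flags))):
            shaped.sort(key=lambda r, k=key: r[k], reverse=not asc)

    columns = list(shaped[0]) if shaped else []
    if keep is not None:
        columns = [c for c in columns if c in keep]
    if drop is not None:
        columns = [c for c in columns if c not in drop]
    if column_order is not None:
        # Listed columns come first; the rest follow in their existing order.
        first = [c for c in column_order if c in columns]
        columns = first + [c for c in columns if c not in first]

    return [{c: r[c] for c in columns} for r in shaped]


rows = [
    {"name": "Bob", "email": "bob@example.com", "row number": 1},
    {"name": "Ada", "email": "ada@example.com", "row number": 2},
]
result = shape_rows(rows, sort_by="name", column_order=["email"], drop=["row number"])
```

In this example the rows come back sorted by name, with email moved to the front and the "row number" column removed.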

License

MIT

Download files

Source Distribution

tha_csv_runner-0.2.0.tar.gz (32.1 kB)

Uploaded Source

Built Distribution

tha_csv_runner-0.2.0-py3-none-any.whl (5.6 kB)

Uploaded Python 3

File details

Details for the file tha_csv_runner-0.2.0.tar.gz.

File metadata

  • Download URL: tha_csv_runner-0.2.0.tar.gz
  • Upload date:
  • Size: 32.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tha_csv_runner-0.2.0.tar.gz
Algorithm Hash digest
SHA256 a01a3c1e06bb50950c70fac2d5a8d14a6d2bd684a8d27e563be9658a32717f35
MD5 4deff0fff8a654cb8ffd728c55ebe2d2
BLAKE2b-256 06568f7183d5e760980531da75630d5ea38e96503f2ae7daf5b7f23606311032

Provenance

The following attestation bundles were made for tha_csv_runner-0.2.0.tar.gz:

Publisher: publish.yml on tha-guy-nate/tha-csv-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file tha_csv_runner-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: tha_csv_runner-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 5.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for tha_csv_runner-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b162a08ede7e90484e57f1442f1c635e816cd40fc0140905d1e22a57a759b119
MD5 08c59a95d5643934b32bde7229d73d1b
BLAKE2b-256 049b532154c051bdb0f4b2f6b2e29931386ae681ee1b00b39dbcc340121a74f8

Provenance

The following attestation bundles were made for tha_csv_runner-0.2.0-py3-none-any.whl:

Publisher: publish.yml on tha-guy-nate/tha-csv-runner

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
