Generate Python dataclasses and loaders from CSV/TSV files

CSV Dataclass Generator

Generate Python dataclasses and loader functions from CSV/TSV files.

Features

  • Automatic Type Inference: Detects int, float, and str types based on a sample of rows.
  • Dialect Detection: Automatically identifies CSV delimiters (including TSV).
  • Name Sanitization: Converts CSV column headers into valid Python identifiers.
  • Iterator Loading: Generates a loader function that yields dataclass instances one row at a time.
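
The first three features can be illustrated with a minimal sketch built only on the standard library. The helper names infer_type, sanitize_name, and detect_delimiter are hypothetical and chosen for this example; the package's actual internals may work differently.

```python
import csv
import keyword
import re


def infer_type(values: list[str]) -> type:
    """Return the narrowest of int, float, str that parses every non-empty value."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return str  # nothing to learn from, so fall back to the safest type
    for candidate in (int, float):
        try:
            for value in non_empty:
                candidate(value)
            return candidate
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return str


def sanitize_name(header: str) -> str:
    """Turn an arbitrary column header into a valid, non-keyword identifier."""
    name = re.sub(r"\W+", "_", header.strip()).strip("_").lower()
    if not name:
        name = "field"
    if name[0].isdigit():
        name = f"_{name}"  # identifiers cannot start with a digit
    if keyword.iskeyword(name):
        name += "_"  # e.g. "class" becomes "class_"
    return name


def detect_delimiter(sample: str) -> str:
    """Guess the delimiter of a text sample with the standard library sniffer."""
    return csv.Sniffer().sniff(sample, delimiters=",\t;|").delimiter
```

In this sketch a column is typed by trying the strictest parser first, so a single non-numeric value in the sample demotes the whole column to str. That is also why the sample size matters: a value outside the sampled rows can contradict the inferred type.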

Installation

pip install csv-dataclass-gen

Usage

Command Line Interface (CLI)

The package provides a csv-dataclass-gen command.

# Generate code to stdout
csv-dataclass-gen data.csv

# Generate code and save to a directory
csv-dataclass-gen data.csv --output ./generated_models/

# Specify a custom class name and sample size for type inference
csv-dataclass-gen data.csv --name my_custom_data --sample-size 500

CLI Help Message:

Usage: csv-dataclass-gen [OPTIONS] INPUT_FILE

  Generate dataclass and loader code from CSV files.

Options:
  -o, --output TEXT          Output directory for generated files. "-" outputs
                             the result to stdout.
  -s, --sample-size INTEGER  Number of rows to sample for type inference
  -n, --name TEXT            Alternative name for the generated name. Snake
                             case / spaced words is recommended.
  --help                     Show this message and exit.

Arguments:

  • INPUT_FILE: Path to the CSV/TSV file (required).

Example Generated Code

Given a CSV like users.csv:

id,user_name,score
1,alice,95.5
2,bob,88.0

The generator will produce:

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator
import csv

@dataclass
class Users:
    id: int  # Original: "id"
    user_name: str  # Original: "user_name"
    score: float  # Original: "score"

def load_users(csv_path: Path, max_rows: int | None = None, delimiter: str = ',') -> Iterator[Users]:
    # ... loading logic ...
    pass

Development

Code Template

The generator ships with two code templates: one for the dataclass and one for the loader function.

Dependencies

This project uses uv for dependency management.

uv sync --all-groups

Running Tests

We use pytest for testing.

uv run pytest

Tests include:

  • tests/test_name_sanitizer.py: Logic for sanitizing names into different formats.
  • tests/test_type_inferrer.py: Logic for detecting data types.
  • tests/test_csv_analyzer.py: Logic for CSV structure analysis.
  • tests/test_code_gen_e2e.py: End-to-end tests that generate code and verify it by reconstructing the original CSV.
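
The round-trip idea behind the end-to-end tests can be sketched as follows. This is a simplified, self-contained illustration that uses a hand-written Users dataclass in place of generated code; the parse and dump helpers are hypothetical and the project's real tests may be structured differently.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Users:
    id: int
    user_name: str
    score: float


ORIGINAL = "id,user_name,score\n1,alice,95.5\n2,bob,88.0\n"


def parse(text: str) -> list[Users]:
    """Load CSV text into typed dataclass instances."""
    reader = csv.DictReader(io.StringIO(text))
    return [Users(int(r["id"]), r["user_name"], float(r["score"])) for r in reader]


def dump(rows: list[Users]) -> str:
    """Write dataclass instances back to CSV text in field order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(Users)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


def test_round_trip_reconstructs_csv() -> None:
    # If types or column names were inferred wrongly, the text would not match.
    assert dump(parse(ORIGINAL)) == ORIGINAL
```

A byte-for-byte comparison like this only holds when values survive the type conversion unchanged; a score written as 88.00 would come back as 88.0, so a real test would need to compare parsed values in that case.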

License

MIT License – see the LICENSE file for details.

