
Generate Python dataclasses and loaders from CSV/TSV files


CSV Dataclass Generator

Generate Python dataclasses and loader functions from CSV/TSV files.

Features

  • Automatic Type Inference: Detects int, float, and str types based on a sample of rows.
  • Dialect Detection: Automatically identifies CSV delimiters (including TSV).
  • Name Sanitization: Converts CSV column headers into valid Python identifiers.
  • Iterator Loading: Generates a loader function that yields dataclass instances one at a time, which suits larger datasets.
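The first three features can be illustrated with a minimal sketch. This is not the package's own implementation: infer_type, sanitize_name, and detect_delimiter are hypothetical helper names, and the rules they apply (try int, then float, else str; prefix names that start with a digit with col_; append an underscore to Python keywords) are assumptions chosen for illustration. Only csv.Sniffer, re, and keyword are real standard-library APIs.

```python
import csv
import keyword
import re


def infer_type(values):
    """Return the narrowest of int, float, str that parses every sampled value.

    Hypothetical helper: assumes the sample is a list of raw CSV strings.
    """
    for candidate in (int, float):
        try:
            for value in values:
                candidate(value)
        except ValueError:
            continue  # at least one value failed, try the next wider type
        return candidate
    return str


def sanitize_name(header):
    """Turn a column header into a valid Python identifier (assumed rules)."""
    name = re.sub(r"\W+", "_", header.strip()).strip("_").lower()
    if not name:
        name = "field"
    if name[0].isdigit():
        name = f"col_{name}"
    if keyword.iskeyword(name):
        name = f"{name}_"
    return name


def detect_delimiter(sample_text):
    """Guess the delimiter with the standard library's csv.Sniffer."""
    return csv.Sniffer().sniff(sample_text, delimiters=",\t;|").delimiter


print(infer_type(["1", "2"]), infer_type(["95.5", "88.0"]), infer_type(["alice"]))
print(sanitize_name("User Name"), sanitize_name("class"))
```

Note that csv.Sniffer raises csv.Error when it cannot settle on a delimiter, so a real tool would need a fallback for ambiguous samples.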

Installation

pip install csv-dataclass-gen

Usage

Command Line Interface (CLI)

The package provides a csv-dataclass-gen command.

# Generate code to stdout
csv-dataclass-gen data.csv

# Generate code and save to a directory
csv-dataclass-gen data.csv --output ./generated_models/

# Specify a custom class name and sample size for type inference
csv-dataclass-gen data.csv --name my_custom_data --sample-size 500

CLI Help Message:

Usage: csv-dataclass-gen [OPTIONS] INPUT_FILE

  Generate dataclass and loader code from CSV files.

Options:
  -o, --output TEXT          Output directory for generated files. "-" outputs
                             the result to stdout.
  -s, --sample-size INTEGER  Number of rows to sample for type inference
  -n, --name TEXT            Alternative name for the generated name. Snake
                             case / spaced words is recommended.
  --help                     Show this message and exit.

Arguments:

  • INPUT_FILE: Path to the CSV/TSV file (required).

Example Generated Code

Given a CSV like users.csv:

id,user_name,score
1,alice,95.5
2,bob,88.0

The generator will produce:

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator
import csv

@dataclass
class Users:
    id: int  # Original: "id"
    user_name: str  # Original: "user_name"
    score: float  # Original: "score"

def load_users(csv_path: Path, max_rows: int | None = None, delimiter: str = ',') -> Iterator[Users]:
    # ... loading logic ...
    pass
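
The loader body is elided above. As a rough idea of what an iterator-style loader can look like, here is a sketch under stated assumptions: iter_users is a hypothetical stand-in (not the generated load_users), it reads from an already open text handle rather than a Path so the example runs without a file on disk, and the per-column conversions are written by hand.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, Optional, TextIO


@dataclass
class Users:
    id: int
    user_name: str
    score: float


def iter_users(
    handle: TextIO, max_rows: Optional[int] = None, delimiter: str = ","
) -> Iterator[Users]:
    """Hypothetical loader: yield one Users instance per CSV row, lazily."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for index, row in enumerate(reader):
        if max_rows is not None and index >= max_rows:
            break  # stop early without reading the rest of the file
        yield Users(
            id=int(row["id"]),
            user_name=row["user_name"],
            score=float(row["score"]),
        )


# Use the users.csv content from the example above as in-memory input.
sample = io.StringIO("id,user_name,score\n1,alice,95.5\n2,bob,88.0\n")
users = list(iter_users(sample))
print(users)
```

Because the function is a generator, rows are parsed only as the caller consumes them, which is what makes this style suitable for larger files.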

Development

Dependencies

This project uses uv for dependency management, but it can also be installed using standard tools.

uv sync --all-groups

Running Tests

We use pytest for testing.

uv run pytest

Tests include:

  • tests/test_name_sanitizer.py: Logic for sanitizing names into different formats.
  • tests/test_type_inferrer.py: Logic for detecting data types.
  • tests/test_csv_analyzer.py: Logic for CSV structure analysis.
  • tests/test_code_gen_e2e.py: End-to-end tests that generate code and verify it by reconstructing the original CSV.
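
The round-trip idea behind the end-to-end tests can be sketched as follows. This is an illustration of the technique, not the project's actual test code: the Users class is written by hand instead of being generated, and load, dump, and test_round_trip are hypothetical names.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

ORIGINAL = "id,user_name,score\n1,alice,95.5\n2,bob,88.0\n"


@dataclass
class Users:
    id: int
    user_name: str
    score: float


def load(text):
    """Parse CSV text into Users instances (stand-in for a generated loader)."""
    for row in csv.DictReader(io.StringIO(text)):
        yield Users(int(row["id"]), row["user_name"], float(row["score"]))


def dump(rows):
    """Write dataclass instances back out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(Users)], lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


def test_round_trip():
    # Loading and re-serializing should reproduce the input exactly.
    assert dump(load(ORIGINAL)) == ORIGINAL


test_round_trip()
```

An exact text comparison only holds when values keep their formatting across the round trip (for example, 88.0 stays 88.0); inputs such as 088 or 1e3 would need a comparison on parsed values instead.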

License

MIT License – see the LICENSE file for details.
