Generate Python dataclasses and loaders from CSV/TSV files

CSV Dataclass Generator

Generate Python dataclasses and loader functions from CSV/TSV files.

Features

  • Automatic Type Inference: Detects int, float, and str types based on a sample of rows.
  • Dialect Detection: Automatically identifies CSV delimiters (including TSV).
  • Name Sanitization: Converts CSV column headers into valid Python identifiers.
  • Iterator Loading: Generates a loading generator that yields dataclass instances.
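To make the type inference and dialect detection features concrete, here is a minimal sketch using only the standard library. It is not the package's actual implementation: the helper names infer_type, sniff_delimiter and infer_schema are hypothetical, as is the rule of trying int, then float, then falling back to str. It ignores edge cases such as empty cells.

```python
import csv
import io


def infer_type(values: list[str]) -> type:
    """Return the narrowest of int, float, str that parses every sampled value."""
    if not values:
        return str  # nothing to sample, so fall back to the safest type
    for candidate in (int, float):
        try:
            for value in values:
                candidate(value)
        except ValueError:
            continue  # at least one value did not parse; try the next type
        return candidate
    return str


def sniff_delimiter(sample: str) -> str:
    """Guess the delimiter with csv.Sniffer from a text sample."""
    return csv.Sniffer().sniff(sample, delimiters=",\t;|").delimiter


def infer_schema(text: str, sample_size: int = 100) -> dict[str, type]:
    """Map each header to an inferred type using at most sample_size rows."""
    delimiter = sniff_delimiter(text[:4096])
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = [row for _, row in zip(range(sample_size), reader)]
    columns = zip(*rows)  # transpose the sampled rows into columns
    return {name: infer_type(list(col)) for name, col in zip(header, columns)}


if __name__ == "__main__":
    users = "id,user_name,score\n1,alice,95.5\n2,bob,88.0\n"
    print(infer_schema(users))
```

Sampling only the first rows keeps analysis fast on large files, at the cost of mistyping a column whose later rows hold a different kind of value. That trade-off is presumably why the sample size is configurable.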

Installation

pip install csv-dataclass-gen

Usage

Command Line Interface (CLI)

The package provides a csv-dataclass-gen command.

# Generate code to stdout
csv-dataclass-gen data.csv

# Generate code and save to a directory
csv-dataclass-gen data.csv --output ./generated_models/

# Specify a custom class name and sample size for type inference
csv-dataclass-gen data.csv --name my_custom_data --sample-size 500

CLI Help Message:

Usage: csv-dataclass-gen [OPTIONS] INPUT_FILE

  Generate dataclass and loader code from CSV files.

Options:
  -o, --output TEXT          Output directory for generated files. "-" outputs
                             the result to stdout.
  -s, --sample-size INTEGER  Number of rows to sample for type inference
  -n, --name TEXT            Alternative name for the generated name. Snake
                             case / spaced words is recommended.
  --help                     Show this message and exit.

Arguments:

  • INPUT_FILE: Path to the CSV/TSV file (required).

Example Generated Code

Given a CSV like users.csv:

id,user_name,score
1,alice,95.5
2,bob,88.0

The generator will produce:

from dataclasses import dataclass
from pathlib import Path
from typing import Iterator
import csv

@dataclass
class Users:
    id: int  # Original: "id"
    user_name: str  # Original: "user_name"
    score: float  # Original: "score"

def load_users(csv_path: Path, max_rows: int | None = None, delimiter: str = ',') -> Iterator[Users]:
    # ... loading logic ...
    pass

Development

Code Template

The generator ships with a template for the dataclass and loader code it emits.

Dependencies

This project uses uv for dependency management.

uv sync --all-groups

Running Tests

We use pytest for testing.

uv run pytest

Tests include:

  • tests/test_name_sanitizer.py: Logic for sanitizing names into different formats.
  • tests/test_type_inferrer.py: Logic for detecting data types.
  • tests/test_csv_analyzer.py: Logic for CSV structure analysis.
  • tests/test_code_gen_e2e.py: End-to-end tests that generate code and verify it by reconstructing the original CSV.
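For readers unfamiliar with name sanitization, the sketch below shows one plausible way to turn a header into a snake_case field name or a PascalCase class name. The functions to_snake_case and to_pascal_case are hypothetical and do not reflect the project's real API. The rules are also assumptions: strip punctuation, prefix a leading digit with an underscore, and append an underscore to Python keywords.

```python
import keyword
import re


def to_snake_case(header: str) -> str:
    """Turn an arbitrary column header into a valid snake_case identifier."""
    words = re.findall(r"[A-Za-z0-9]+", header)
    name = "_".join(word.lower() for word in words) or "field"
    if name[0].isdigit():
        name = f"_{name}"  # identifiers cannot start with a digit
    if keyword.iskeyword(name):
        name = f"{name}_"  # avoid clashing with reserved words such as "class"
    return name


def to_pascal_case(header: str) -> str:
    """Turn a snake_case or spaced name into a PascalCase class name."""
    words = re.findall(r"[A-Za-z0-9]+", header)
    name = "".join(word[:1].upper() + word[1:].lower() for word in words) or "Field"
    if name[0].isdigit():
        name = f"_{name}"
    return name


if __name__ == "__main__":
    for header in ("User Name", "2nd score", "class", "my_custom_data"):
        print(repr(header), to_snake_case(header), to_pascal_case(header))
```

A real sanitizer would also need to handle two headers that collapse to the same identifier, which this sketch does not attempt.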

License

MIT License – see the LICENSE file for details.
