Read, validate, and write CSV files using Pydantic models with dydactic.
csvalchemy
A Python package for reading and writing CSV files using Pydantic models.
Overview
csvalchemy provides a clean interface for validating CSV data against Pydantic models, handling errors gracefully, and writing validated results back to CSV files. It integrates with dydactic for robust validation of data records.
Features
- CSV Reading: Read CSV files and validate each row against Pydantic models
- Error Handling: Continue processing even when individual rows fail validation
- Type Safety: Full type hints and validation using Pydantic
- CSV Writing: Write validated results back to CSV files
- Integration: Built on dydactic for reliable validation
Dependencies
- Python: 3.10 or higher
- pydantic: >=2.9.2 (Data validation using Python type annotations)
- dydactic: >=0.2.0 (Validation engine - requires Python 3.10+)
- python-dateutil: >=2.8.0 (DateTime parsing)
Installation
pip install csvalchemy
Quick Start
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO
# Define your model
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None
# Sample CSV content
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,25,bob@example.com
Charlie,35,charlie@example.com
"""
# Read and validate CSV
with StringIO(csv_content) as f:
    for result in read(f, Person):
        if result.error:
            print(f"Validation error: {result.error}")
        else:
            print(f"Valid person: {result.result.name}, age {result.result.age}")
Output:
Valid person: Alice, age 30
Valid person: Bob, age 25
Valid person: Charlie, age 35
Examples
Error Handling
csvalchemy continues processing even when individual rows fail validation:
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None
# CSV with some invalid rows
csv_content = """name,age,email
Alice,30,alice@example.com
Bob,not_a_number,bob@example.com
Charlie,35,charlie@example.com
Diana,not_a_number,diana@example.com
"""
with StringIO(csv_content) as f:
    valid_count = 0
    error_count = 0
    # Number data rows from 1 so each error message points at the failing row
    for row_number, result in enumerate(read(f, Person), start=1):
        if result.error:
            error_count += 1
            print(f"Error on row {row_number}: {result.error}")
        else:
            valid_count += 1
            print(f"Valid: {result.result.name}")
    print(f"\nSummary: {valid_count} valid, {error_count} errors")
Output:
Valid: Alice
Error on row 2: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie
Error on row 4: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing

Summary: 2 valid, 2 errors
Writing Validated CSV
Write only validated results back to CSV:
from pydantic import BaseModel
from csvalchemy import read
from io import StringIO
class Product(BaseModel):
    id: int
    name: str
    price: float
    in_stock: bool
# Input CSV
input_csv = """id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
3,Invalid,not_a_number,True
4,Thing,39.99,True
"""
# Read and validate
input_file = StringIO(input_csv)
validator = read(input_file, Product)
# Write validated results to a new CSV
output_file = StringIO()
writer = validator.csv_writer(output_file)
# Consume the writer to trigger CSV writing
for result in writer:
    if result.error:
        print(f"Skipped invalid row: {result.error}")
    else:
        print(f"Wrote: {result.result.name}")
# Show output CSV
output_file.seek(0)
print("\n=== Output CSV ===")
print(output_file.read())
Output:
Wrote: Widget
Wrote: Gadget
Skipped invalid row: 1 validation error for Product
price
  Input should be a valid number, unable to parse string as a number [type=float_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/float_parsing
Wrote: Thing
=== Output CSV ===
id,name,price,in_stock
1,Widget,19.99,True
2,Gadget,29.99,False
4,Thing,39.99,True
Using Validator Directly
Validate data not from CSV files:
from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options
class Person(BaseModel):
    name: str
    age: int
    email: str | None = None
# Data not from CSV
records = [
    {"name": "Alice", "age": "30", "email": "alice@example.com"},
    {"name": "Bob", "age": "not_a_number", "email": "bob@example.com"},
    {"name": "Charlie", "age": "35"},
]
# Standard validation
print("=== Using Validator directly ===")
validator = Validator(iter(records), Person)
for result in validator:
    if result.error:
        print(f"Error: {result.error}")
    else:
        print(f"Valid: {result.result.name}, age {result.result.age}")
# Skip invalid records
print("\n=== Using SKIP error option ===")
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP,
)
valid_results = list(validator_skip)
print(f"Got {len(valid_results)} valid results (invalid ones skipped)")
Output:
=== Using Validator directly ===
Valid: Alice, age 30
Error: 1 validation error for Person
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='not_a_number', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
Valid: Charlie, age 35
=== Using SKIP error option ===
Got 2 valid results (invalid ones skipped)
Integration with dydactic
csvalchemy uses dydactic as its core validation engine. The Validator and ValidatorIterator classes wrap dydactic.validate() to provide a consistent API for CSV data validation.
How it works
- CSV Reading: read() creates a CSVReaderValidator that reads CSV rows using Python's csv.DictReader
- Validation: Each row is validated using dydactic.validate(), which handles Pydantic model validation
- Error Handling: Validation errors are captured without stopping the iteration
- Result Mapping: dydactic's result objects are mapped to csvalchemy's Result type for a consistent API
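The four steps above can be sketched with the standard library alone. This is a minimal illustration, not csvalchemy's implementation: the Result dataclass, the read_rows function, and the parse_person validator are hypothetical stand-ins. A plain function takes the place of the Pydantic model and dydactic.validate() so that the sketch runs without third-party packages.

```python
import csv
from dataclasses import dataclass
from io import StringIO
from typing import Any, Callable, Dict, Iterator, Optional, TextIO


@dataclass
class Result:
    """Hypothetical stand-in for csvalchemy's Result type."""
    result: Any = None
    error: Optional[Exception] = None


def parse_person(row: Dict[str, str]) -> Dict[str, Any]:
    """Hypothetical validator standing in for a Pydantic model."""
    # int() raises ValueError for input such as "not_a_number"
    return {"name": row["name"], "age": int(row["age"])}


def read_rows(
    file: TextIO, validate: Callable[[Dict[str, str]], Any]
) -> Iterator[Result]:
    """Read CSV rows as dicts and validate each one independently."""
    for row in csv.DictReader(file):
        try:
            validated = validate(row)
        except (KeyError, ValueError) as exc:
            # Capture the error instead of stopping the iteration
            yield Result(error=exc)
        else:
            yield Result(result=validated)


csv_content = "name,age\nAlice,30\nBob,not_a_number\nCharlie,35\n"
results = list(read_rows(StringIO(csv_content), parse_person))
valid_names = [r.result["name"] for r in results if r.error is None]
error_count = sum(1 for r in results if r.error is not None)
print(valid_names, error_count)  # ['Alice', 'Charlie'] 1
```

Because each row is validated inside its own try block, one bad row produces a Result carrying the error while the rows after it are still processed.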
Benefits
- Leverages dydactic's robust validation handling
- Independent validation of each record (errors don't stop processing)
- Type-safe error handling with clear error messages
- Compatible with dydactic's validation strategies
- Configurable error handling (RETURN, RAISE, or SKIP)
- Support for strict validation and attribute-based validation
Configuration Options
The Validator class supports dydactic's configuration options:
- error_option: Control how validation errors are handled:
  - RETURN (default): Errors are returned in Result.error
  - RAISE: Exceptions are raised immediately on validation errors
  - SKIP: Records with errors are skipped entirely
- strict: Enable strict Pydantic validation
- from_attributes: Validate from object attributes
Example:
from pydantic import BaseModel
from csvalchemy import Validator
import dydactic.options
class Person(BaseModel):
    name: str
    age: int
records = [
    {"name": "Alice", "age": "30"},
    {"name": "Bob", "age": "invalid"},
    {"name": "Charlie", "age": "35"},
]
# Default: RETURN errors
validator_return = Validator(iter(records), Person)
results_return = list(validator_return)
print(f"RETURN mode: {len(results_return)} results (including errors)")
# SKIP invalid records
validator_skip = Validator(
    iter(records),
    Person,
    error_option=dydactic.options.ErrorOption.SKIP,
)
results_skip = list(validator_skip)
print(f"SKIP mode: {len(results_skip)} results (errors skipped)")
Output:
RETURN mode: 3 results (including errors)
SKIP mode: 2 results (errors skipped)
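To make the difference between the three error options concrete, here is a rough standard-library sketch of the dispatch logic. It is not dydactic's or csvalchemy's code: the ErrorOption enum below is a local stand-in that only mirrors the documented option names, and validate_all and to_int are hypothetical helpers.

```python
from enum import Enum
from typing import Any, Callable, Iterable, Iterator, Optional, Tuple


class ErrorOption(Enum):
    """Local stand-in mirroring the documented option names."""
    RETURN = "return"
    RAISE = "raise"
    SKIP = "skip"


def to_int(value: str) -> int:
    """Hypothetical validator: raises ValueError on non-numeric input."""
    return int(value)


def validate_all(
    values: Iterable[str],
    validate: Callable[[str], Any],
    error_option: ErrorOption = ErrorOption.RETURN,
) -> Iterator[Tuple[Any, Optional[Exception]]]:
    """Yield (result, error) pairs according to the chosen error option."""
    for value in values:
        try:
            result = validate(value)
        except ValueError as exc:
            if error_option is ErrorOption.RAISE:
                raise  # stop at the first invalid record
            if error_option is ErrorOption.SKIP:
                continue  # drop the invalid record
            yield None, exc  # RETURN: report the error and keep going
        else:
            yield result, None


ages = ["30", "invalid", "35"]
returned = list(validate_all(ages, to_int))
skipped = list(validate_all(ages, to_int, ErrorOption.SKIP))
print(len(returned), len(skipped))  # 3 2
```

With ErrorOption.RAISE, the same input raises ValueError as soon as the second value is reached, so no later records are processed.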
Architecture Notes
Casting and Validation
csvalchemy provides two approaches to validation:
- Full Validation (Recommended): Use Validator or read(), which leverage dydactic's complete validation pipeline, including dydactic's casting functionality. This is the primary and recommended approach for CSV validation.
- Standalone Casting: The cast.py module provides casting utilities similar to dydactic.cast. This module is kept for:
  - Standalone use cases that don't require full dydactic validation
  - Direct class instantiation without Pydantic models
  - Testing scenarios
Note: The main validation flow uses dydactic's casting internally, so cast.py is not used in the primary validation pipeline.
Requirements
- Python 3.10+ (required by dydactic)
- See pyproject.toml for the complete dependency list