A tiny file IO utility library for Python powered by Pydantic

A tiny file IO utility library for Python powered by Pydantic. This library is a port of the Rust library SerdeIO.

Features

  • Type-safe: Read and write Pydantic models with full type inference
  • Format support: CSV, JSON, and JSON Lines out of the box; MessagePack, CBOR, TOML, and YAML through optional extras
  • Auto-detection: Automatically detects format from file extension
  • Simple API: Intuitive functions for single records and lists
  • Minimal dependencies: The core library requires only Pydantic

Installation

# Standard distribution
pip install pydanticio

# With YAML support (quotes keep shells such as zsh from expanding the brackets)
pip install "pydanticio[yaml]"

# With MessagePack support
pip install "pydanticio[messagepack]"

# With TOML support
pip install "pydanticio[toml]"

# With CBOR support
pip install "pydanticio[cbor]"

Quick Start

from pydantic import BaseModel
from pydanticio import read_records_from_file, write_records_to_file

class User(BaseModel):
    name: str
    age: int

# Read from any supported format (auto-detected from extension)
users = read_records_from_file("users.csv", User)

# Or specify format explicitly (overrides file extension)
users = read_records_from_file("data.txt", User, data_format="csv")

# Write to any supported format
write_records_to_file("output.json", users)

Supported Formats

| Format | File Extensions | Single Record | List of Records |
| --- | --- | --- | --- |
| CSV | .csv | No | Yes |
| JSON | .json | Yes | Yes |
| JSON Lines | .jsonl, .jl, .jsl, .json_lines | No | Yes |
| MessagePack | .msgpack | Yes | Yes |
| CBOR | .cbor | Yes | Yes |
| TOML | .toml | Yes | No |
| YAML | .yaml, .yml | Yes | Yes |

All text-based formats use UTF-8 encoding.
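
To illustrate why JSON Lines appears in the table as a list-only format, here is a minimal standard-library sketch of the encoding: one JSON object per line, UTF-8 encoded. This is not PydanticIO's implementation; `dump_json_lines` and `load_json_lines` are hypothetical names, and plain dictionaries stand in for Pydantic models.

```python
import json


def dump_json_lines(records: list[dict]) -> bytes:
    # One JSON object per line, each terminated by "\n".
    # json.dumps escapes control and non-ASCII characters by default,
    # so an encoded record never contains a raw line break.
    lines = [json.dumps(record) + "\n" for record in records]
    return "".join(lines).encode("utf-8")


def load_json_lines(data: bytes) -> list[dict]:
    # splitlines() accepts \n, \r\n, and \r as line separators;
    # blank lines are skipped.
    text = data.decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


users = [{"name": "Ada", "age": 36}, {"name": "Zoë", "age": 41}]
encoded = dump_json_lines(users)
decoded = load_json_lines(encoded)
```

Because every line is a complete record, the file as a whole naturally represents a list rather than a single value.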

Backend Dependencies

The following table lists the Python packages used as backends for each supported format:

| Format | Backend Package | Required |
| --- | --- | --- |
| CSV | Built-in (csv) | No |
| JSON | Built-in (json) | No |
| JSON Lines | Built-in (json) | No |
| MessagePack | msgpack | Optional |
| CBOR | cbor2 | Optional |
| TOML | tomlkit | Optional |
| YAML | pyyaml | Optional |
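
Since four of the backends are optional, it can help to check which ones are importable in the current environment before picking a format. The sketch below uses only the standard library; `installed_backends` is a hypothetical helper, not part of PydanticIO.

```python
from importlib.util import find_spec

# Import names of the optional backends. Note that the pyyaml
# distribution is imported as "yaml".
OPTIONAL_BACKENDS = {
    "MessagePack": "msgpack",
    "CBOR": "cbor2",
    "TOML": "tomlkit",
    "YAML": "yaml",
}


def installed_backends() -> dict[str, bool]:
    """Map each optional format to whether its backend module can be found."""
    return {
        fmt: find_spec(module) is not None
        for fmt, module in OPTIONAL_BACKENDS.items()
    }


for fmt, available in installed_backends().items():
    print(f"{fmt}: {'available' if available else 'missing'}")
```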

Newline Handling

All text-based formats handle newlines automatically:

  • On read: Any newline style (\n, \r\n, or \r) is accepted and normalized
  • On write: Each format uses its appropriate line ending per specification

| Format | Line Ending | Notes |
| --- | --- | --- |
| CSV | \r\n | RFC 4180 |
| JSON | \n | |
| JSON Lines | \n | RFC 7464 |
| TOML | Platform | |
| YAML | \n | |
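
The read and write behavior described above can be sketched with the standard library. This is an illustration of the idea, not PydanticIO's code: `normalize_newlines` and `write_csv` are hypothetical helpers. The CSV half relies on the fact that Python's `csv.writer` terminates rows with `\r\n` by default.

```python
import csv
import io


def normalize_newlines(text: str) -> str:
    # Handle \r\n first so it does not become two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def write_csv(rows: list[list[str]]) -> str:
    # newline="" keeps the line endings exactly as the writer emits them.
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


mixed = "name,age\r\nAda,36\rLin,41\n"
normalized = normalize_newlines(mixed)
csv_text = write_csv([["name", "age"], ["Ada", "36"]])
```

Here `normalized` contains only `\n` line endings, while `csv_text` ends every row with `\r\n`.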

API Reference

Reading

| Function | Description | Supported Formats |
| --- | --- | --- |
| read_record_from_reader(reader, model, format) | Read single record from BinaryIO | JSON, MessagePack, CBOR, TOML, YAML |
| read_record_from_file(path, model, data_format=None) | Read single record from file path | JSON, MessagePack, CBOR, TOML, YAML |
| read_records_from_reader(reader, model, format) | Read list of records from BinaryIO | All formats except for TOML |
| read_records_from_file(path, model, data_format=None) | Read list of records from file path | All formats except for TOML |

Writing

| Function | Description | Supported Formats |
| --- | --- | --- |
| write_record_to_writer(writer, record, format) | Write single record to BinaryIO | JSON, MessagePack, CBOR, TOML, YAML |
| write_record_to_file(path, record, data_format=None) | Write single record to file path | JSON, MessagePack, CBOR, TOML, YAML |
| write_records_to_writer(writer, records, format) | Write list of records to BinaryIO | All formats except for TOML |
| write_records_to_file(path, records, data_format=None) | Write list of records to file path | All formats except for TOML |

Format Specification

When using *_from_file or *_to_file functions, you can optionally specify the data format explicitly using the data_format parameter. If not specified, the format is automatically detected from the file extension.

from pydanticio import read_records_from_file, write_records_to_file

# Auto-detects CSV format from .csv extension
users = read_records_from_file("data/users.csv", User)

# Explicit format overrides file extension
users = read_records_from_file("data/file.xyz", User, data_format="csv")
write_records_to_file("data/output.txt", users, data_format="json")

Valid format values:

| Value | Description |
| --- | --- |
| "json" | JSON format |
| "yaml" | YAML format |
| "messagepack" | MessagePack |
| "toml" | TOML format (single record only) |
| "csv" | CSV format (records only) |
| "json_lines" | JSON Lines format (records only) |

When data_format is None (default), the format is automatically detected from the file extension. When explicitly specified, it overrides the automatic detection.

Explicit Format Specification

Use the data_format parameter to override automatic format detection from file extensions:

from pydantic import BaseModel
from pydanticio import (
    read_records_from_file,
    write_records_to_file,
    read_record_from_file,
    write_record_to_file,
)

class User(BaseModel):
    name: str
    age: int

# Override file extension - read CSV from .txt file
users = read_records_from_file("data/users.txt", User, data_format="csv")

# Write JSON to file with non-standard extension
write_records_to_file("data/export.xyz", users, data_format="json")

# Single record with explicit format
class Config(BaseModel):
    setting: str
    value: int

config = read_record_from_file("config.data", Config, data_format="yaml")
write_record_to_file("config.out", config, data_format="toml")

This is useful when:

  • Working with files that have non-standard extensions
  • Converting between formats while preserving the original file
  • Ensuring consistent format regardless of file naming

Examples

Reading and Writing Lists

from pydantic import BaseModel
from pydanticio import read_records_from_file, write_records_to_file

class User(BaseModel):
    name: str
    age: int

# Convert between formats
users = read_records_from_file("users.csv", User)
write_records_to_file("users.json", users)

Reading Single Records

from pydantic import BaseModel
from pydanticio import read_record_from_file

class Config(BaseModel):
    name: str
    version: int
    enabled: bool

config = read_record_from_file("config.toml", Config)
print(config.name, config.version)

Using Streams

from pydantic import BaseModel
from pydanticio import read_records_from_reader, write_records_to_writer

class Item(BaseModel):
    id: int
    value: str

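# Read from a file stream; the format is passed explicitly because
# a stream has no file extension to detect it from
with open("data.json", "rb") as f:
    items = read_records_from_reader(f, Item, "json")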

# Write to a BytesIO stream
from io import BytesIO
buffer = BytesIO()
write_records_to_writer(buffer, items, "json")

Converting Between Formats

# Command line usage example
python examples/convert_format.py input.csv output.json

See examples/convert_format.py for the full source code.

Requirements

  • Python 3.12+
  • Pydantic 2.5.0+

License

MIT License
