dr-ds

Small, typed data science helpers for serialization, atomic file writes, and dataframe-oriented record normalization.

Install

uv add dr-ds

dr-ds currently targets Python 3.12+.

Included Helpers

dr_ds.atomic_io provides:

dump_json_atomic
atomic_write_jsonl
atomic_write_parquet_records

dr_ds.serialization provides:

serialize_timestamp
utc_now_iso
to_jsonable
convert_large_ints
parse_jsonish

dr_ds.parquet provides:

records_to_parquet_frame
parquet_frame_to_records

These helpers are aimed at a common pattern in data workflows:

start with list[dict[str, Any]] records or similarly loose Python data
normalize nested containers and plain Python objects into JSON-safe values
persist those records atomically or adapt them for dataframe/parquet workflows
recover structured JSON-like columns on read without rebuilding ad hoc parsing logic

Design Goals

Prefer small, reusable utilities with stable behavior over framework-heavy abstractions.
Be explicit about lossy or opinionated conversions.
Keep serialization helpers deterministic so downstream tests and diffs stay readable.
Make the common data-science path easy: Python dicts in, JSON/parquet-safe data out.

Serialization Contracts

to_jsonable is the main normalization helper for nested Python values.

datetime values become UTC ISO 8601 strings.
Mapping keys are stringified.
Tuples become lists.
Sets become deterministically ordered lists.
Plain Python objects are serialized from their public, non-callable attributes.
Recursive references are replaced with the literal string "<recursion>".
Values that cannot be meaningfully introspected fall back to str(value).
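The rules above can be approximated in a few lines of standard-library Python. The sketch below is illustrative only and is not the dr_ds implementation: the name to_jsonable_sketch, the treatment of naive datetimes as UTC, and sorting sets by repr are all assumptions made for the example.

```python
from collections.abc import Mapping
from datetime import datetime, timezone
from typing import Any


def to_jsonable_sketch(value: Any, _seen: frozenset[int] = frozenset()) -> Any:
    """Hypothetical stand-in that mirrors the documented rules."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, datetime):
        # Assumption: a naive datetime is treated as already being in UTC.
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc).isoformat()
    if id(value) in _seen:
        # The value is one of its own ancestors: stop descending.
        return "<recursion>"
    seen = _seen | {id(value)}
    if isinstance(value, Mapping):
        return {str(k): to_jsonable_sketch(v, seen) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable_sketch(v, seen) for v in value]
    if isinstance(value, (set, frozenset)):
        # Assumption: sorting by repr gives a stable order for mixed types.
        return [to_jsonable_sketch(v, seen) for v in sorted(value, key=repr)]
    attrs = getattr(value, "__dict__", None)
    if isinstance(attrs, dict):
        # Keep public, non-callable attributes only.
        return {
            name: to_jsonable_sketch(attr, seen)
            for name, attr in attrs.items()
            if not name.startswith("_") and not callable(attr)
        }
    return str(value)
```

Tracking only the ancestors of the current value (rather than every object visited) means a value that appears twice without being nested inside itself is serialized both times instead of being flagged as recursion.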

convert_large_ints is intentionally narrower:

It recursively converts integers whose absolute value exceeds DEFAULT_MAX_INT into floats, which may lose precision for very large values.
It preserves tuple and set container types.
It is mainly intended to keep dataframe/parquet pipelines practical when very large integers appear in nested payloads.
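As a rough illustration of that behavior, the following sketch walks a nested value and converts oversized integers to floats while rebuilding each container with its original type. It is not the library's code: SKETCH_MAX_INT is a hypothetical threshold chosen for the example (dr_ds defines its own DEFAULT_MAX_INT), and leaving booleans untouched is an assumption.

```python
from typing import Any

# Hypothetical threshold for illustration only; not dr_ds's DEFAULT_MAX_INT.
SKETCH_MAX_INT = 2**53


def convert_large_ints_sketch(value: Any, max_int: int = SKETCH_MAX_INT) -> Any:
    if isinstance(value, bool):
        # Assumption: bool is an int subclass, but it is left unchanged here.
        return value
    if isinstance(value, int):
        return float(value) if abs(value) > max_int else value
    if isinstance(value, dict):
        return {k: convert_large_ints_sketch(v, max_int) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        # type(value)(...) rebuilds a list, tuple, or set of the same type.
        return type(value)(convert_large_ints_sketch(v, max_int) for v in value)
    return value
```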

parse_jsonish only parses strings that are valid JSON. Invalid JSON, blank strings, and non-string values are returned unchanged.
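A minimal sketch of that contract, assuming the standard-library json module is an acceptable stand-in for whatever parser dr_ds uses internally; parse_jsonish_sketch is a hypothetical name:

```python
import json
from typing import Any


def parse_jsonish_sketch(value: Any) -> Any:
    # Non-strings and blank strings pass through untouched.
    if not isinstance(value, str) or not value.strip():
        return value
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON: hand the original string back unchanged.
        return value
```

Returning the input unchanged instead of raising lets a caller apply the function to every cell of a column without first checking which cells hold JSON.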

Atomic IO Example

from pathlib import Path

from dr_ds.atomic_io import atomic_write_jsonl, dump_json_atomic
from dr_ds.serialization import to_jsonable

# Normalize first: the set under "tags" becomes a deterministically ordered list.
payload = to_jsonable(
    {
        "metrics": {"loss": 0.42},
        "tags": {"baseline", "v1"},
        "owner": {"name": "baseline-bot", "id": 7},
    }
)

# Write a single JSON document atomically.
dump_json_atomic(Path("summary.json"), payload)

# Write one JSON record per line atomically.
atomic_write_jsonl(
    Path("runs.jsonl"),
    [
        {"run_id": "run-1", "summary": payload},
        {"run_id": "run-2", "summary": {"loss": 0.39}},
    ],
)

All atomic writers use a sibling temporary file plus os.replace, then fsync the parent directory so the rename is durably recorded.
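The pattern described above (sibling temporary file, os.replace, directory fsync) can be sketched with the standard library alone. This is an illustration of the general technique under stated assumptions, not the dr_ds source: dump_json_atomic_sketch is a hypothetical name, the temporary file naming is invented, and the directory fsync is skipped on platforms without os.O_DIRECTORY.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def dump_json_atomic_sketch(path: Path, payload: Any) -> None:
    path = Path(path)
    parent = path.parent
    # The temp file lives next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=parent, prefix=f".{path.name}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # flush file contents to disk first
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
    if hasattr(os, "O_DIRECTORY"):
        # Sync the parent directory so the rename itself is durable.
        dir_fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Readers of the target path see either the old file or the complete new one, never a partially written file, because the only step that touches the final name is the rename.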

Parquet Example

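from dr_ds.parquet import parquet_frame_to_records, records_to_parquet_frame

records = [
    {
        "run_id": "abc123",
        # 2**35 does not fit in a 32-bit integer.
        "metrics": {"loss": 0.42, "token_count": 2**35},
    }
]

# Build a dataframe, treating "metrics" as a JSON column.
frame = records_to_parquet_frame(records, json_columns={"metrics"})
# Convert back to records, restoring the JSON column on the way out.
restored = parquet_frame_to_records(frame, json_columns={"metrics"})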

records_to_parquet_frame prepares records for dataframe/parquet workflows. It does not write parquet files directly.

Behavior to rely on:

columns listed in json_columns are normalized through to_jsonable
large integers nested inside those JSON columns are converted with convert_large_ints
top-level large integers in non-JSON columns are also converted to floats
parquet_frame_to_records restores JSON columns with parse_jsonish and converts dataframe null-like values in those columns back to None
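To make the JSON-column round trip concrete without depending on a dataframe library, the sketch below encodes the listed columns as JSON strings and decodes them again. That string encoding is an assumption inferred from the use of parse_jsonish on read; the function names and the NaN-as-null check are hypothetical and do not describe how dr_ds builds its frames.

```python
import json
import math
from typing import Any


def encode_json_columns(
    records: list[dict[str, Any]], json_columns: set[str]
) -> list[dict[str, Any]]:
    # Assumption: JSON columns are stored as JSON strings.
    return [
        {
            key: json.dumps(val, sort_keys=True)
            if key in json_columns and val is not None
            else val
            for key, val in record.items()
        }
        for record in records
    ]


def decode_json_columns(
    rows: list[dict[str, Any]], json_columns: set[str]
) -> list[dict[str, Any]]:
    restored = []
    for row in rows:
        out = dict(row)
        for col in json_columns & out.keys():
            cell = out[col]
            if cell is None or (isinstance(cell, float) and math.isnan(cell)):
                out[col] = None  # null-like cells come back as None
            elif isinstance(cell, str):
                try:
                    out[col] = json.loads(cell)
                except json.JSONDecodeError:
                    pass  # leave non-JSON strings unchanged
        restored.append(out)
    return restored
```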

Coercion Helpers

coerce_int, coerce_number, and coerce_float are intentionally forgiving.

invalid inputs return None instead of raising
booleans are rejected even though Python considers them integers
coerce_number preserves integral numeric values as int
coerce_float is the lossy "give me a float if possible" variant
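The sketch below shows one way helpers with these properties could behave. It is a guess at the shape, not the library's implementation: the names are hypothetical, and accepting numeric strings is an assumption that the text above does not confirm.

```python
from __future__ import annotations

from typing import Any


def coerce_number_sketch(value: Any) -> int | float | None:
    if isinstance(value, bool):
        return None  # reject booleans even though bool subclasses int
    if isinstance(value, (int, float)):
        return value  # an int stays an int, a float stays a float
    if isinstance(value, str):
        # Assumption: numeric strings are accepted; try int before float.
        text = value.strip()
        for parser in (int, float):
            try:
                return parser(text)
            except ValueError:
                continue
    return None


def coerce_float_sketch(value: Any) -> float | None:
    number = coerce_number_sketch(value)
    if number is None:
        return None
    try:
        return float(number)  # lossy for integers beyond float precision
    except OverflowError:
        return None  # integer too large to represent as a float
```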

License

MIT

Development

Run the standard checks before committing:

uv run ruff format .
uv run ruff check .
uv run ty check
uv run pytest

Release history

0.1.4 (this version): Apr 14, 2026
0.1.3: Apr 12, 2026
0.1.2: Apr 12, 2026
0.1.1: Apr 11, 2026
0.1.0: Apr 11, 2026

