dr-ds

Small, typed data science helpers for serialization, atomic file writes, and dataframe-oriented record normalization.

Install

uv add dr-ds

dr-ds currently targets Python 3.12+.

Included Helpers

dr_ds.atomic_io provides:

  • dump_json_atomic
  • atomic_write_jsonl
  • atomic_write_parquet_records

dr_ds.serialization provides:

  • serialize_timestamp
  • utc_now_iso
  • to_jsonable
  • convert_large_ints
  • parse_jsonish
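
To give a feel for what a helper like to_jsonable is for, here is a minimal standard-library sketch. It is not the dr_ds implementation: the name to_jsonable_sketch and the specific conversion rules (sets to sorted lists, datetimes to ISO strings, paths to strings) are assumptions made for illustration.

```python
import json
from datetime import datetime
from pathlib import Path
from typing import Any


def to_jsonable_sketch(value: Any) -> Any:
    """Hypothetical stand-in: recursively convert values json cannot encode."""
    if isinstance(value, dict):
        return {str(key): to_jsonable_sketch(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable_sketch(item) for item in value]
    if isinstance(value, (set, frozenset)):
        # Sets are unordered, so sort by repr to get stable output.
        return [to_jsonable_sketch(item) for item in sorted(value, key=repr)]
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Path):
        return str(value)
    return value


payload = to_jsonable_sketch({"metrics": {"loss": 0.42}, "tags": {"baseline", "v1"}})
encoded = json.dumps(payload, sort_keys=True)
```

Without a conversion step like this, json.dumps raises TypeError on the set under "tags".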

dr_ds.parquet provides:

  • records_to_parquet_frame
  • parquet_frame_to_records

These helpers are aimed at a common pattern in data workflows:

  • start with list[dict[str, Any]] records
  • normalize nested JSON-like columns into strings for dataframe/parquet compatibility
  • recover those structured columns on read
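
The pattern above can be sketched with the standard library alone. This is an illustration of the idea, not how dr_ds implements it: encode_json_columns and decode_json_columns are hypothetical names, and the real helpers may handle more cases (missing keys, nulls, large integers).

```python
import json
from typing import Any

Record = dict[str, Any]


def encode_json_columns(records: list[Record], json_columns: set[str]) -> list[Record]:
    """Serialize the named columns to JSON strings so every cell is flat."""
    return [
        {
            key: json.dumps(value, sort_keys=True) if key in json_columns else value
            for key, value in record.items()
        }
        for record in records
    ]


def decode_json_columns(records: list[Record], json_columns: set[str]) -> list[Record]:
    """Parse the named columns back into structured values on read."""
    return [
        {
            key: json.loads(value)
            if key in json_columns and isinstance(value, str)
            else value
            for key, value in record.items()
        }
        for record in records
    ]


records = [{"run_id": "abc123", "metrics": {"loss": 0.42, "token_count": 2**35}}]
flat = encode_json_columns(records, {"metrics"})
restored = decode_json_columns(flat, {"metrics"})
```

After encoding, each "metrics" cell is a plain string, which a dataframe or parquet column can hold without a nested schema; decoding returns the original dictionaries.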

Atomic IO Example

from pathlib import Path

from dr_ds.atomic_io import dump_json_atomic
from dr_ds.serialization import to_jsonable

payload = to_jsonable(
    {
        "metrics": {"loss": 0.42},
        "tags": {"baseline", "v1"},
    }
)

dump_json_atomic(Path("summary.json"), payload)
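
For readers unfamiliar with atomic writes, the usual technique is to write to a temporary file in the same directory and then rename it over the target, so readers never observe a half-written file. The sketch below shows that technique with the standard library. It is an assumption about the general approach, not a description of how dump_json_atomic is implemented, and write_json_atomically is a hypothetical name.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomically(path: Path, payload: Any) -> None:
    """Hypothetical sketch: write JSON to a temp file, then rename into place."""
    path = Path(path)
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # replaces the target in a single step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        Path(tmp_name).unlink(missing_ok=True)
        raise


# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "summary.json"
    write_json_atomically(target, {"metrics": {"loss": 0.42}})
    restored = json.loads(target.read_text(encoding="utf-8"))
    leftovers = [entry.name for entry in Path(tmp_dir).iterdir()]
```

If the process dies mid-write, the target keeps its previous contents (or stays absent) and only a stray temporary file can remain.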

Parquet Example

from dr_ds.parquet import parquet_frame_to_records, records_to_parquet_frame

records = [
    {
        "run_id": "abc123",
        "metrics": {"loss": 0.42, "token_count": 2**35},
    }
]

frame = records_to_parquet_frame(records, json_columns={"metrics"})
restored = parquet_frame_to_records(frame, json_columns={"metrics"})

records_to_parquet_frame prepares records for dataframe/parquet workflows. It does not write parquet files directly.

License

MIT

Development

Run the standard checks before committing:

uv run ruff format .
uv run ruff check .
uv run ty check
uv run pytest
