dr-ds

Small, typed data science helpers for serialization, atomic file writes, and dataframe-oriented record normalization.

Install

uv add dr-ds

dr-ds currently targets Python 3.12+.

Included Helpers

dr_ds.atomic_io provides:

  • dump_json_atomic
  • atomic_write_jsonl
  • atomic_write_parquet_records

dr_ds.serialization provides:

  • serialize_timestamp
  • utc_now_iso
  • to_jsonable
  • convert_large_ints
  • parse_jsonish

dr_ds.parquet provides:

  • records_to_parquet_frame
  • parquet_frame_to_records

These helpers are aimed at a common pattern in data workflows:

  • start with list[dict[str, Any]] records
  • normalize nested JSON-like columns into strings for dataframe/parquet compatibility
  • recover those structured columns on read
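The pattern above can be sketched with only the standard library. This is an illustrative stand-in, not the dr_ds implementation: the function names `encode_json_columns` and `decode_json_columns` are hypothetical, and the real helpers may handle more cases (missing keys, large integers, non-JSON types).

```python
import json
from typing import Any


def encode_json_columns(
    records: list[dict[str, Any]], json_columns: set[str]
) -> list[dict[str, Any]]:
    """Hypothetical helper: dump the named columns to JSON strings."""
    return [
        {
            key: json.dumps(value, sort_keys=True)
            if key in json_columns and value is not None
            else value
            for key, value in record.items()
        }
        for record in records
    ]


def decode_json_columns(
    rows: list[dict[str, Any]], json_columns: set[str]
) -> list[dict[str, Any]]:
    """Hypothetical helper: parse the named string columns back into objects."""
    return [
        {
            key: json.loads(value)
            if key in json_columns and isinstance(value, str)
            else value
            for key, value in row.items()
        }
        for row in rows
    ]


records = [{"run_id": "abc123", "metrics": {"loss": 0.42, "steps": 100}}]

# Nested values become flat strings, which dataframe and parquet columns accept.
flat = encode_json_columns(records, {"metrics"})

# Parsing the same columns on read recovers the original structure.
restored = decode_json_columns(flat, {"metrics"})
```

Storing nested values as strings keeps every column a flat scalar type, at the cost of having to name the JSON columns again when reading.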

Atomic IO Example

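from pathlib import Path

from dr_ds.atomic_io import dump_json_atomic
from dr_ds.serialization import to_jsonable

# A set is not JSON-serializable as-is, so pass the payload through to_jsonable first.
payload = to_jsonable(
    {
        "metrics": {"loss": 0.42},
        "tags": {"baseline", "v1"},
    }
)

# Write the converted payload to summary.json atomically.
dump_json_atomic(Path("summary.json"), payload)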
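For readers unfamiliar with the term, an atomic write usually means writing to a temporary file and then renaming it over the target, so readers never see a half-written file. The sketch below shows that general technique with the standard library. It is an assumption about the approach, not the code behind dump_json_atomic, and `write_json_atomically` is a hypothetical name.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomically(path: Path, payload: Any) -> None:
    """Hypothetical helper: write to a temp file, then rename it over the target."""
    path = Path(path)
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        # os.replace swaps the finished file into place in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # On any failure, remove the temp file and leave the target untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "summary.json"
    write_json_atomically(target, {"metrics": {"loss": 0.42}})
    loaded = json.loads(target.read_text(encoding="utf-8"))
    # No temp files should remain beside the target after a successful write.
    leftovers = [p.name for p in Path(workdir).iterdir() if p.name != "summary.json"]
```

If the process dies mid-write, the target keeps its previous contents, because only the temporary file was being modified.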

Parquet Example

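from dr_ds.parquet import parquet_frame_to_records, records_to_parquet_frame

records = [
    {
        "run_id": "abc123",
        "metrics": {"loss": 0.42, "token_count": 2**35},
    }
]

# Normalize the nested "metrics" column for dataframe/parquet compatibility.
frame = records_to_parquet_frame(records, json_columns={"metrics"})
# Recover the structured column when reading the frame back into records.
restored = parquet_frame_to_records(frame, json_columns={"metrics"})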

records_to_parquet_frame prepares records for dataframe/parquet workflows. It does not write parquet files directly.

License

MIT

Development

Run the standard checks before committing:

uv run ruff format .
uv run ruff check .
uv run ty check
uv run pytest
