Generic utility functions for text formatting, string operations, and type conversions.

dsr-utils

Utility functions and helpers for common data science tasks, including datetime parsing, formatting, tables, and plotting helpers.

Version 1.5.0: Adds a reflection module with safe_call, a utility that inspects a function's signature and calls it with only the keyword arguments it accepts.

Features

  • Datetime utilities: Parse and enrich timestamps with vectorized pandas integration.
  • Formatting utilities: Numeric, currency, percentage, and datetime formatters.
  • Table helpers: High-precision layout engine with pagination support.
  • Matplotlib helpers: Headless-friendly bounding box and renderer utilities.
  • String utilities: Recursive case conversion (snake, pascal, camel, etc.).
  • Type utilities: Robust standardization of scalars and collections into flat lists.
  • Hashing utilities: Generate deterministic fingerprints for pandas DataFrames, NumPy arrays, and large files using memory-efficient SHA-256 and joblib hashing.
  • Reflection utilities: Programmatically inspect function signatures and safely execute callables by filtering incompatible keyword arguments.
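
To illustrate what "standardization of scalars and collections into flat lists" can mean in practice, here is a minimal standard-library sketch. The name to_flat_list and its exact rules (None becomes an empty list, strings count as scalars) are assumptions made for this example and are not taken from the dsr_utils API.

```python
from collections.abc import Iterable
from typing import Any


def to_flat_list(value: Any) -> list[Any]:
    """Hypothetical helper: normalize a scalar or nested collection into a flat list."""
    if value is None:
        return []
    # Strings and bytes are iterable, but this sketch treats them as scalars.
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        return [value]
    flat: list[Any] = []
    for item in value:
        # Recurse so that arbitrarily nested collections end up on one level.
        flat.extend(to_flat_list(item))
    return flat


print(to_flat_list("abc"))               # ['abc']
print(to_flat_list([1, (2, 3), [[4]]]))  # [1, 2, 3, 4]
```

Treating strings as scalars avoids splitting "abc" into characters, which is rarely what a caller wants when passing a single column name or label.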

Installation

pip install dsr-utils

Usage

General Usage

import pandas as pd
from dsr_utils.datetime import parse_datetime
from dsr_utils.formatting import FloatFormat
from dsr_utils.tables import Table, TableColumn, TableColumnStyle, render_table

# Datetime parsing with pandas 2.0+ mixed-format support
ts = pd.Timestamp("2025-10-01 12:34:56")
# (Usage of parse_datetime utility here)

# Formatting utilities
fmt = FloatFormat(precision=2)
print(fmt.format_value(1234.567))

# Table helpers (constructor arguments required as of v1.3.0)
df = pd.DataFrame({"Metric": ["Trips"], "Value": ["1,200"]})
style = TableColumnStyle()
table = Table(
    data=df,
    max_table_height=0.5,
    mid_x=0.5,
    top_y=0.8,
    fontsize=11,
    columns={
        "Metric": TableColumn(detail_style=style, header_style=style),
        "Value": TableColumn(detail_style=style, header_style=style)
    }
)

Data Integrity & Hashing

import pandas as pd
from dsr_utils.hashing import calculate_object_hash, calculate_file_hash
from pathlib import Path

# Generate a deterministic hash for a DataFrame
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df_hash = calculate_object_hash(df)
print(f"DataFrame Fingerprint: {df_hash}")

# Hash a raw data file without loading it entirely into memory.
# Useful for large CSVs on memory-constrained machines such as a Mac mini.
file_path = Path("data/raw/adult.csv")
file_hash = calculate_file_hash(file_path)
print(f"File Fingerprint: {file_hash}")
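
The general technique behind hashing a file without loading it into memory is to feed it to the hash object in fixed-size chunks. The sketch below uses only hashlib from the standard library; sha256_of_file and its chunk_size parameter are hypothetical names for illustration, and calculate_file_hash may be implemented differently.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hypothetical helper: compute a SHA-256 hex digest by reading in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read at most chunk_size bytes at a time so memory use stays bounded.
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


# Demo: the chunked digest matches hashing the same bytes in one call.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    payload = b"abc" * 1000
    sample.write_bytes(payload)
    chunked = sha256_of_file(sample, chunk_size=64)
    print(chunked == hashlib.sha256(payload).hexdigest())  # True
```

Because the digest is updated incrementally, peak memory depends on chunk_size rather than on the size of the file.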

Dynamic Function Execution

from dsr_utils.reflection import safe_call

def process_data(data, mode="fast", verbose=False):
    return f"Processing {data} in {mode} mode"

# A dictionary containing both valid and invalid parameters
raw_config = {
    "mode": "thorough",
    "verbose": True,
    "unsupported_param": "ignore_me"
}

# safe_call filters the config and returns the result + rejected keys
result, rejected = safe_call(process_data, raw_config, data="MyDataset")

print(result)            # Output: Processing MyDataset in thorough mode
print(rejected.keys())   # Output: dict_keys(['unsupported_param'])
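
For readers curious how signature-based filtering can work, the following is a rough sketch built on inspect.signature from the standard library. The function filtered_call is a hypothetical stand-in written for this example; it mirrors the call shape shown above but is not the dsr_utils implementation, and details such as letting explicit keyword arguments override the config are assumptions.

```python
import inspect
from typing import Any, Callable


def filtered_call(
    func: Callable[..., Any], config: dict[str, Any], **fixed: Any
) -> tuple[Any, dict[str, Any]]:
    """Hypothetical stand-in: call func with only the config keys it accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts any keyword argument.
    accepts_var_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    accepted: dict[str, Any] = {}
    rejected: dict[str, Any] = {}
    for key, value in config.items():
        param = params.get(key)
        can_pass_by_name = param is not None and param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
        if can_pass_by_name or accepts_var_kwargs:
            accepted[key] = value
        else:
            rejected[key] = value
    # Assumption: explicitly passed keyword arguments win over config values.
    return func(**{**accepted, **fixed}), rejected


def process_data(data, mode="fast", verbose=False):
    return f"Processing {data} in {mode} mode"


result, rejected = filtered_call(
    process_data,
    {"mode": "thorough", "verbose": True, "unsupported_param": "ignore_me"},
    data="MyDataset",
)
print(result)          # Processing MyDataset in thorough mode
print(list(rejected))  # ['unsupported_param']
```

Returning the rejected keys alongside the result lets the caller log or validate unused configuration instead of silently dropping it.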

Requirements

  • Python >= 3.10
  • numpy >= 2.0.0
  • pandas >= 2.0.0
  • joblib >= 1.4.0
  • matplotlib (required for matplotlib helpers)

License

MIT License. See the LICENSE file for details.
