Generic utility functions for text formatting, string operations, and type conversions.
dsr-utils
Utility functions and helpers for common data science tasks, including datetime parsing, formatting, tables, and plotting helpers.
Version 1.5.0 introduces a reflection module featuring safe_call, a utility that executes a function after filtering out keyword arguments its signature does not accept.
Features
- Datetime utilities: Parse and enrich timestamps with vectorized pandas integration.
- Formatting utilities: Numeric, currency, percentage, and datetime formatters.
- Table helpers: High-precision layout engine with pagination support.
- Matplotlib helpers: Headless-friendly bounding box and renderer utilities.
- String utilities: Recursive case conversion (snake, pascal, camel, etc.).
- Type utilities: Robust standardization of scalars and collections into flat lists.
- Hashing utilities: Generate deterministic fingerprints for pandas DataFrames, NumPy arrays, and large files using memory-efficient SHA-256 and joblib hashing.
- Reflection utilities: Programmatically inspect function signatures and safely execute callables by filtering incompatible keyword arguments.
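To make the string and type utility bullets above more concrete, here is a minimal sketch of what recursive case conversion and flattening into a list can look like. The names `to_snake_case`, `snake_case_keys`, and `to_flat_list` are hypothetical and written only for illustration; they are not taken from dsr-utils, whose actual functions and behavior may differ.

```python
import re
from collections.abc import Iterable, Mapping


def to_snake_case(name: str) -> str:
    """Convert camelCase or PascalCase text to snake_case (hypothetical helper)."""
    # Insert an underscore before each capital that follows a lowercase letter or digit.
    partial = re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name)
    return partial.replace("-", "_").replace(" ", "_").lower()


def snake_case_keys(obj):
    """Recursively convert mapping keys to snake_case, descending into lists."""
    if isinstance(obj, Mapping):
        return {to_snake_case(str(k)): snake_case_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [snake_case_keys(item) for item in obj]
    return obj


def to_flat_list(value) -> list:
    """Standardize a scalar or nested collection into a flat list (hypothetical helper)."""
    if value is None:
        return []
    # Strings and bytes are iterable but are treated as scalars here.
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        return [value]
    flat = []
    for item in value:
        flat.extend(to_flat_list(item))
    return flat
```

In this sketch, `None` becomes an empty list and strings are never split into characters; both are design assumptions rather than documented library behavior.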
Installation
pip install dsr-utils
Usage
General Usage
import pandas as pd
from dsr_utils.datetime import parse_datetime
from dsr_utils.formatting import FloatFormat
from dsr_utils.tables import Table, TableColumn, TableColumnStyle, render_table
# Datetime parsing with Pandas 2.0+ mixed-format support
ts = pd.Timestamp("2025-10-01 12:34:56")
# (Usage of parse_datetime utility here)
# Formatting utilities
fmt = FloatFormat(precision=2)
print(fmt.format_value(1234.567))
# Table helpers (v1.3.0 constructor requirements)
df = pd.DataFrame({"Metric": ["Trips"], "Value": ["1,200"]})
style = TableColumnStyle()
table = Table(
    data=df,
    max_table_height=0.5,
    mid_x=0.5,
    top_y=0.8,
    fontsize=11,
    columns={
        "Metric": TableColumn(detail_style=style, header_style=style),
        "Value": TableColumn(detail_style=style, header_style=style),
    },
)
Data Integrity & Hashing
import pandas as pd
from dsr_utils.hashing import calculate_object_hash, calculate_file_hash
from pathlib import Path
# Generate a deterministic hash for a DataFrame
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df_hash = calculate_object_hash(df)
print(f"DataFrame Fingerprint: {df_hash}")
# Calculate the hash of a raw data file without loading it entirely into memory
# Ideal for large CSVs on memory-constrained systems such as a Mac mini
file_path = Path("data/raw/adult.csv")
file_hash = calculate_file_hash(file_path)
print(f"File Fingerprint: {file_hash}")
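The memory-efficient file hashing mentioned above is commonly done by feeding the file to SHA-256 in fixed-size chunks. The sketch below uses only the standard library; `sha256_of_file` and its `chunk_size` parameter are hypothetical names for illustration and are not the implementation of `calculate_file_hash`.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, reading it one chunk at a time."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Peak memory use stays near `chunk_size` regardless of file size, which is why this pattern suits large CSVs.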
Dynamic Function Execution
from dsr_utils.reflection import safe_call
def process_data(data, mode="fast", verbose=False):
    return f"Processing {data} in {mode} mode"
# A dictionary containing both valid and invalid parameters
raw_config = {
    "mode": "thorough",
    "verbose": True,
    "unsupported_param": "ignore_me",
}
# safe_call filters the config and returns the result + rejected keys
result, rejected = safe_call(process_data, raw_config, data="MyDataset")
print(result) # Output: Processing MyDataset in thorough mode
print(rejected.keys()) # Output: dict_keys(['unsupported_param'])
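For readers curious how signature-based filtering of this kind can work, the following is a minimal sketch built on the standard library's `inspect.signature`. The helpers `filter_kwargs` and `call_with_filtered_kwargs` are hypothetical and are not the source of `safe_call`; in particular, passing every keyword through when the target declares `**kwargs` is an assumption of this sketch.

```python
import inspect


def filter_kwargs(func, candidates: dict) -> tuple[dict, dict]:
    """Split candidates into (accepted, rejected) based on func's signature."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the function accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidates), {}
    # Only parameters that can be passed by keyword are eligible.
    allowed = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    accepted = {k: v for k, v in candidates.items() if k in allowed}
    rejected = {k: v for k, v in candidates.items() if k not in allowed}
    return accepted, rejected


def call_with_filtered_kwargs(func, candidates: dict, **fixed):
    """Call func with the accepted candidates plus explicitly supplied keywords."""
    accepted, rejected = filter_kwargs(func, candidates)
    # Explicit keywords take precedence over values from the candidates dict.
    return func(**{**accepted, **fixed}), rejected
```

Returning the rejected keys alongside the result makes it possible to log or warn about configuration entries that were silently dropped.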
Requirements
- Python >= 3.10
- numpy >= 2.0.0
- pandas >= 2.0.0
- joblib >= 1.4.0
- matplotlib (required for matplotlib helpers)
License
MIT License - see LICENSE file for details
File details
Details for the file dsr_utils-1.5.0.tar.gz.
File metadata
- Download URL: dsr_utils-1.5.0.tar.gz
- Upload date:
- Size: 54.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 936befa5cff7ec43cdd2d1ff7e31d7bced54675d349a89e5e5de589f6e6c1095 |
| MD5 | b2d64f70f54046938c3b132bd1c2a8c8 |
| BLAKE2b-256 | 74520890f4bc7903c188761b58eaccdeeb2fd88a6bdfb030563f7764b62c9a81 |
Provenance
The following attestation bundles were made for dsr_utils-1.5.0.tar.gz:
Publisher: python-publish.yml on scottroberts140/dsr-utils
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dsr_utils-1.5.0.tar.gz
- Subject digest: 936befa5cff7ec43cdd2d1ff7e31d7bced54675d349a89e5e5de589f6e6c1095
- Sigstore transparency entry: 1338843863
- Sigstore integration time:
- Permalink: scottroberts140/dsr-utils@143cb96b68fa201c46baa0a45f07e9524cf33314
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/scottroberts140
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@143cb96b68fa201c46baa0a45f07e9524cf33314
- Trigger Event: release
File details
Details for the file dsr_utils-1.5.0-py3-none-any.whl.
File metadata
- Download URL: dsr_utils-1.5.0-py3-none-any.whl
- Upload date:
- Size: 44.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9ca14d46fa70d9934f70be8a426e4a09c1fa3bf8441cc1709a9066f9453ae490 |
| MD5 | 42de03b0f39e7b3211e4584e7027d1b4 |
| BLAKE2b-256 | 1332dc8bb7e2a64dfa1d9baead7fc9c418bef02b7835e1da2dbcb57fb3b7b87a |
Provenance
The following attestation bundles were made for dsr_utils-1.5.0-py3-none-any.whl:
Publisher: python-publish.yml on scottroberts140/dsr-utils
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dsr_utils-1.5.0-py3-none-any.whl
- Subject digest: 9ca14d46fa70d9934f70be8a426e4a09c1fa3bf8441cc1709a9066f9453ae490
- Sigstore transparency entry: 1338843898
- Sigstore integration time:
- Permalink: scottroberts140/dsr-utils@143cb96b68fa201c46baa0a45f07e9524cf33314
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/scottroberts140
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@143cb96b68fa201c46baa0a45f07e9524cf33314
- Trigger Event: release