dr_frames

Pandas/DataFrame utilities for data manipulation, filtering, aggregation, and schema management.

Installation

pip install dr-frames

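For table formatting features (console, Markdown, LaTeX), install the formatting extra. The quotes keep shells such as zsh from interpreting the square brackets:

pip install "dr-frames[formatting]"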

Quick Start

import pandas as pd
from dr_frames import (
    coerce_numeric_cols,
    filter_to_range,
    move_cols_to_beginning,
    select_subset,
)

df = pd.DataFrame({
    "name": ["alice", "bob", "charlie"],
    "value": ["1.0", "2.0", "3.0"],
    "category": ["x", "y", "x"],
})

result = (
    df.pipe(coerce_numeric_cols, ["value"])
    .pipe(select_subset, {"category": "x"})
    .pipe(filter_to_range, "value", 0.5, 2.5)
)
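
The Quick Start does not show what `result` contains. As a rough cross-check, the same pipeline can be sketched in plain pandas. This sketch assumes that `coerce_numeric_cols` converts the listed columns to numbers, that `select_subset` keeps rows matching the given column values exactly, and that `filter_to_range` keeps rows whose value lies within inclusive bounds. These semantics are inferred from the function names and comments in this README, not confirmed against the dr_frames source.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["alice", "bob", "charlie"],
    "value": ["1.0", "2.0", "3.0"],
    "category": ["x", "y", "x"],
})

# Assumed equivalent of coerce_numeric_cols(df, ["value"]):
# parse strings as numbers (unparseable entries become NaN here).
step1 = df.assign(value=pd.to_numeric(df["value"], errors="coerce"))

# Assumed equivalent of select_subset(df, {"category": "x"}):
# keep rows whose category is exactly "x".
step2 = step1[step1["category"] == "x"]

# Assumed equivalent of filter_to_range(df, "value", 0.5, 2.5):
# keep rows with 0.5 <= value <= 2.5 (inclusive bounds are an assumption).
expected = step2[step2["value"].between(0.5, 2.5)]

print(expected)
```

Under these assumptions only the "alice" row remains: "bob" is dropped by the category filter and "charlie" by the range filter.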

Module Overview

| Module | Purpose | Key Functions |
|---|---|---|
| columns | Column selection & reordering | move_cols_to_beginning, get_cols_by_prefix, strip_col_prefixes |
| filtering | Row filtering | select_subset, filter_to_range, make_filter_fxn |
| cells | Cell-level operations | ensure_column, map_column_with_fallback, force_set_cell |
| types | Type coercion | coerce_numeric_cols, coerce_string_cols |
| aggregation | GroupBy & reduction | aggregate_over_seeds, apply_aggregations, unique_non_null |
| parsing | String list parsing | parse_first_element, sum_list_elements, is_homogeneous |
| schema | Data field metadata | DataField, ComputedField, DataFormat |
| profiling | Column auto-tagging | DFColInfo, ColInfo, looks_like_json |
| formatting | Table output | format_table, format_coverage_table |

Documentation

Auto-generated API Docs

# Serve interactive docs locally
uv run pdoc dr_frames

# Generate static HTML
uv run pdoc dr_frames -o docs/api_html

Quick Reference

Column Operations

from dr_frames import (
    contained_cols,          # cols that exist in df
    remaining_cols,          # cols NOT in a list
    get_cols_by_prefix,      # cols starting with prefix
    get_cols_by_contains,    # cols containing substring
    move_cols_to_beginning,  # reorder cols
    move_cols_with_prefix_to_end,
    strip_col_prefixes,      # rename by removing prefix
    drop_all_null_cols,      # remove empty columns
)
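
The comments above describe what each helper does but not how it is called. As an illustration of the underlying operations only, here is a plain-pandas sketch of selecting columns by prefix, stripping that prefix, and moving a column to the front. The column names and the `metric_` prefix are made up for the example, and the dr_frames functions may take different arguments or handle edge cases differently.

```python
import pandas as pd

df = pd.DataFrame({
    "metric_acc": [0.9],
    "metric_loss": [0.1],
    "name": ["run1"],
})
prefix = "metric_"  # hypothetical prefix for this example

# Roughly what get_cols_by_prefix is described as doing: cols starting with prefix.
prefixed = [c for c in df.columns if c.startswith(prefix)]

# Roughly what strip_col_prefixes is described as doing: rename by removing prefix.
stripped = df.rename(columns={c: c[len(prefix):] for c in prefixed})

# Roughly what move_cols_to_beginning is described as doing: reorder cols.
front = ["name"]
reordered = stripped[front + [c for c in stripped.columns if c not in front]]

print(list(reordered.columns))
```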

Filtering

from dr_frames import (
    select_subset,           # filter by exact column values
    apply_filters_to_df,     # filter by value lists
    filter_to_value,         # single value filter
    filter_to_values,        # multi-value filter
    filter_to_range,         # numeric range filter
    filter_to_best_metric,   # keep best per group
    make_filter_fxn,         # compose filters
)
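
"Keep best per group" can mean several things, so here is one common interpretation written in plain pandas: for each group, keep the single row with the highest metric value. The column names are hypothetical, and whether `filter_to_best_metric` maximizes or minimizes by default, or how it breaks ties, is not stated in this README.

```python
import pandas as pd

results = pd.DataFrame({
    "model": ["a", "a", "b", "b"],
    "step": [1, 2, 1, 2],
    "acc": [0.70, 0.80, 0.90, 0.85],
})

# For each model, find the index label of the row with the highest accuracy,
# then select those rows from the original frame.
best_idx = results.groupby("model")["acc"].idxmax()
best = results.loc[best_idx].reset_index(drop=True)

print(best)
```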

Cell Operations

from dr_frames import (
    ensure_column,           # add column if missing
    fill_missing_values,     # fillna with defaults dict
    rename_columns,          # safe rename (skips missing)
    map_column_with_fallback,  # map values, keep unmapped
    apply_column_converters, # apply functions to columns
    maybe_update_cell,       # update if currently null
    force_set_cell,          # always update
    masked_getter,           # get value where mask is true
    masked_setter,           # set value where mask is true
)

Type Coercion

from dr_frames import (
    coerce_numeric_cols,     # convert to float/int
    coerce_string_cols,      # convert to string dtype
    is_string_series,        # check if series is strings
)

Aggregation

from dr_frames import (
    aggregate_over_seeds,    # mean/std/count by config
    apply_aggregations,      # flexible groupby
    unique_non_null,         # unique values excluding null
    unique_by_col,           # unique values in column
    get_constant_cols,       # cols with single value
    fillna_with_defaults,    # fill nulls from dict
    maybe_pipe,              # conditional pipe
)
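
The comment on `aggregate_over_seeds` mentions mean/std/count by config. A minimal plain-pandas sketch of that kind of reduction is shown below, assuming one row per (config, seed) run. The column names `lr`, `seed`, and `loss` are invented for the example, and the actual function may name its output columns differently or accept several config columns.

```python
import pandas as pd

runs = pd.DataFrame({
    "lr": [0.1, 0.1, 0.01, 0.01],
    "seed": [0, 1, 0, 1],
    "loss": [1.0, 3.0, 2.0, 2.0],
})

# Group by the config column and reduce the metric across seeds.
# pandas' "std" is the sample standard deviation (ddof=1).
summary = (
    runs.groupby("lr")["loss"]
    .agg(["mean", "std", "count"])
    .reset_index()
)

print(summary)
```

Note that groupby sorts the group keys, so the `lr=0.01` row comes first in `summary`.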

Parsing

from dr_frames import (
    parse_list_string,       # "[1,2,3]" -> [1,2,3]
    parse_first_element,     # "[1,2,3]" -> 1.0
    sum_list_elements,       # "[1,2,3]" -> 6.0
    is_homogeneous,          # "[1,1,1]" -> True
)
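
The input/output pairs in the comments above can be reproduced with the standard library alone. The functions below are hypothetical stand-ins (note the `_sketch` suffix), written only to make the documented behavior concrete. How dr_frames itself parses the strings, and what it does with empty lists or malformed input, is not covered by this README.

```python
import ast


def parse_list_string_sketch(text):
    """Parse a string such as "[1,2,3]" into a Python list."""
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError(f"expected a list literal, got {type(value).__name__}")
    return value


def parse_first_element_sketch(text):
    """Return the first list element as a float: "[1,2,3]" -> 1.0."""
    return float(parse_list_string_sketch(text)[0])


def sum_list_elements_sketch(text):
    """Return the sum of the list elements as a float: "[1,2,3]" -> 6.0."""
    return float(sum(parse_list_string_sketch(text)))


def is_homogeneous_sketch(text):
    """Return True when every element is equal: "[1,1,1]" -> True."""
    return len(set(parse_list_string_sketch(text))) <= 1


print(parse_list_string_sketch("[1,2,3]"))
print(parse_first_element_sketch("[1,2,3]"))
print(sum_list_elements_sketch("[1,2,3]"))
print(is_homogeneous_sketch("[1,1,1]"))
```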

Schema

from dr_frames import (
    DataField,               # field with metadata
    ComputedField,           # derived field
    MetricDataField,         # metric with group info
    DataFormat,              # container for fields
)

Profiling

from dr_frames import (
    DFColInfo,               # catalog of column info
    ColInfo,                 # single column metadata
    looks_like_json,         # detect JSON strings
    looks_like_path,         # detect file paths
    infer_series_base_tag_type,  # infer dtype tags
)
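
To make "detect JSON strings" concrete, here is one way such a check could be written with the standard library. This is a hypothetical stand-in, not the dr_frames implementation. It assumes that only strings holding a JSON object or array should count, so bare numbers and quoted scalars are rejected even though they are valid JSON documents.

```python
import json


def looks_like_json_sketch(value):
    """Return True if value is a string containing a JSON object or array."""
    if not isinstance(value, str):
        return False
    text = value.strip()
    # Cheap pre-check: only attempt a full parse for object/array candidates.
    if not text or text[0] not in "{[":
        return False
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(looks_like_json_sketch('{"a": 1}'))
print(looks_like_json_sketch("hello"))
```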

Formatting (requires [formatting] extra)

from dr_frames import (
    format_table,            # render table in multiple formats
    format_coverage_table,   # show column coverage stats
    FORMATTER_TYPES,         # available formatters
    OUTPUT_FORMATS,          # available output formats
)

License

MIT
