dr_frames

Pandas/DataFrame utilities for data manipulation, filtering, aggregation, and schema management.

Primitive quick reference for agents:

  • primitives.columns: rename_columns, move_cols_to_beginning, strip_col_prefixes, drop_all_null_cols, drop_all_constant_cols, get_cols_by_prefix, get_cols_by_contains, move_cols_with_prefix_to_end, move_numeric_cols_to_end
  • primitives.filtering: select_subset, filter_to_values, filter_to_range, make_filter_fxn
  • primitives.ranking: select_best_by_metric
  • primitives.coerce: coerce_numeric_cols, coerce_string_cols
  • primitives.aggregation: aggregate_over_seeds, aggregate_by_group
  • primitives.missing: fill_missing_values
  • primitives.masked: masked_getter, masked_setter
  • primitives.constant: get_constant_cols, get_groupwise_constant_cols
  • primitives.unique: unique_non_null, unique_by_col, unique_by_cols
  • primitives.namespaced: group_namespaced_values
  • primitives.pipeline: maybe_pipe
  • primitives.parsing: parse_list_string
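For orientation, the sketch below shows plain-pandas equivalents of what three of these names suggest: dropping all-null columns, finding constant columns, and averaging a metric over seeds. The helpers are written here for illustration only; their names, signatures, and exact behavior are assumptions and may differ from the dr_frames versions.

```python
import pandas as pd


def drop_all_null_cols_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns in which every value is missing."""
    return df.dropna(axis="columns", how="all")


def constant_cols_sketch(df: pd.DataFrame) -> list[str]:
    """Return columns holding at most one distinct non-null value."""
    return [col for col in df.columns if df[col].nunique(dropna=True) <= 1]


def mean_over_seeds_sketch(df, group_cols, metric, seed_col="seed"):
    """Average `metric` across seeds within each group (assumed behavior)."""
    return (
        df.drop(columns=[seed_col])
        .groupby(group_cols, as_index=False)[metric]
        .mean()
    )


# Made-up run table: two models, two seeds each.
runs = pd.DataFrame({
    "model": ["a", "a", "b", "b"],
    "seed": [0, 1, 0, 1],
    "loss": [1.0, 3.0, 2.0, 4.0],
    "notes": [None, None, None, None],
    "split": ["test", "test", "test", "test"],
})

cleaned = drop_all_null_cols_sketch(runs)      # "notes" is removed
constant = constant_cols_sketch(cleaned)       # only "split" never varies
averaged = mean_over_seeds_sketch(cleaned, ["model"], "loss")
```

Here the all-null notes column is dropped, split is reported as constant, and each model ends up with one row holding its mean loss.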

Installation

uv add dr-frames

Quick Start

import pandas as pd
from dr_frames import (
    coerce_numeric_cols,
    filter_to_range,
    move_cols_to_beginning,
    select_subset,
)

df = pd.DataFrame({
    "name": ["alice", "bob", "charlie"],
    "value": ["1.0", "2.0", "3.0"],
    "category": ["x", "y", "x"],
})

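result = (
    # Convert the string column "value" to numbers.
    df.pipe(coerce_numeric_cols, ["value"])
    # Keep rows where category is "x".
    .pipe(select_subset, {"category": "x"})
    # Keep rows whose value falls in the 0.5 to 2.5 range.
    .pipe(filter_to_range, "value", 0.5, 2.5)
)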
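
As a point of comparison, here is a plain-pandas sketch of what the pipeline above appears to do, judging by the function names: convert value to numbers, keep rows where category is "x", then keep rows whose value lies between 0.5 and 2.5. Whether filter_to_range treats its bounds as inclusive is not stated here, so the inclusive between(...) call is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["alice", "bob", "charlie"],
    "value": ["1.0", "2.0", "3.0"],
    "category": ["x", "y", "x"],
})

# Assumed equivalent of coerce_numeric_cols(df, ["value"]).
numeric = df.assign(value=pd.to_numeric(df["value"], errors="coerce"))

# Assumed equivalent of select_subset(df, {"category": "x"}).
subset = numeric[numeric["category"] == "x"]

# Assumed equivalent of filter_to_range(df, "value", 0.5, 2.5).
expected = subset[subset["value"].between(0.5, 2.5)]

print(expected["name"].tolist())
```

With this sample data only the alice row survives both filters, since bob is in category "y" and charlie has a value of 3.0.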

Flexible Schema

flexible_schema is the higher-level part of the library for working with dataframes whose columns you did not design yourself.

  • DataField describes a logical field and can resolve which dataframe column it maps to.
  • ComputedField describes a derived field that should be added from existing columns before downstream use.
  • MetricDataField describes metric columns, especially prefixed metric names such as eval/....
  • DataFormat is the container that ties those pieces together and gives you a reusable view of a dataframe schema.
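
To make the idea of field resolution concrete, here is a small self-contained sketch. It is not the dr_frames implementation: it assumes a field resolves to its explicit column_name when one is given and otherwise to a column named after its id_string. FieldSketch and resolve_column are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FieldSketch:
    """Hypothetical stand-in for a logical field description."""

    id_string: str
    column_name: Optional[str] = None

    def resolve_column(self, columns: Sequence[str]) -> Optional[str]:
        """Return the matching column name, or None if unresolved."""
        # Assumption: fall back to the id when no explicit column is given.
        candidate = self.column_name or self.id_string
        return candidate if candidate in columns else None


columns = ["model_name", "dataset", "eval/accuracy"]
fields = [
    FieldSketch("model", column_name="model_name"),
    FieldSketch("dataset"),
    FieldSketch("optimizer"),  # no such column, so it stays unresolved
]

resolved = {f.id_string: f.resolve_column(columns) for f in fields}
unresolved = [name for name, col in resolved.items() if col is None]
```

Keeping the lookup in one place means downstream code can refer to "model" and never needs to know that the raw column is called model_name.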

Typical usage:

  • build a DataFormat from a dataframe with DataFormat.from_df(...) or from a field-description mapping with DataFormat.from_dict(...)
  • inspect unresolved fields and discovered metrics
  • add computed fields for columns you want to derive once and reuse
  • call prepare_for_plotting(...) to produce a dataframe restricted to the known fields, computed fields, and metrics
  • use metric_col(...), get_metric(...), and get_config_columns(...) to drive plotting or grouped analysis code without hardcoding raw column names
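
The metric-related steps can be pictured with plain Python. This sketch assumes metrics are discovered by a shared prefix such as eval/, that a short metric name maps back to its prefixed column, and that config columns are simply the non-metric ones. discover_metrics and metric_col_sketch are hypothetical helpers, not the library's API.

```python
def discover_metrics(columns, prefix="eval/"):
    """Map short metric names to their full, prefixed column names."""
    return {
        col[len(prefix):]: col
        for col in columns
        if col.startswith(prefix)
    }


def metric_col_sketch(metrics, name):
    """Look up the raw column for a short metric name."""
    try:
        return metrics[name]
    except KeyError:
        raise KeyError(f"unknown metric: {name!r}") from None


columns = ["model_name", "dataset", "eval/accuracy", "eval/loss"]
metrics = discover_metrics(columns)
accuracy_col = metric_col_sketch(metrics, "accuracy")

# Assumption: every column that is not a metric counts as configuration.
config_cols = [col for col in columns if col not in metrics.values()]
```

Plotting code written against the short names keeps working if the prefix changes, because only the discovery step knows about it.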

Minimal example:

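import pandas as pd
from dr_frames import ComputedField, DataField, DataFormat

# Illustrative input whose columns match the fields declared below.
df = pd.DataFrame({
    "model_name": ["small-model", "large-model"],
    "dataset": ["dataset-a", "dataset-a"],
    "params_millions": [350, 7000],
})

fmt = DataFormat(
    fields=[
        # Logical field "model" is read from the "model_name" column.
        DataField(id_string="model", column_name="model_name"),
        DataField(id_string="dataset"),
    ],
    computed_fields=[
        # Derived field: True where params_millions exceeds 1000.
        ComputedField(
            id_string="is_large",
            source_columns=["params_millions"],
            compute=lambda df: df["params_millions"] > 1000,
        )
    ],
)

plot_df = fmt.prepare_for_plotting(df)
config_cols = fmt.get_config_columns()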
License

MIT
