Solution for specific models

utilsds-models

A library of classes and functions used in DS Team modeling projects. It extends the utilsds package with components specific to selected models (data processing, NGR metrics, EVIP metrics, and a custom LightGBM objective).

Requires Python >= 3.12.

Installation

uv sync
source .venv/bin/activate

Or install from PyPI:

pip install utilsds-models

Modules

data_processing

Scikit-learn-compatible transformer classes, plus a helper function for combining test results.

  • ColumnCopyImputer: fills missing values by copying from other columns.
  • NullImputerWithFlags: imputes nulls using a mean, median, or mode strategy, with optional indicator flags (_isnull_flag).
  • MaxMultiplierImputer: imputes with the maximum observed in the training data times a multiplier (max * multiplier), with optional flags.
  • SportsHybridEncoder: encodes sports as binary features for the top N disciplines plus aggregations for the rest.
  • LabelEncoderTransformer: label encoding for categorical columns (e.g. for LightGBM).
  • DerivedFeatureCreator: creates derived features (e.g. division by 7 or 30).
  • combine_test_data: combines test features, target, predictions, and metadata into a single DataFrame.
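To illustrate the fit/transform pattern these classes follow, here is a minimal sketch of null imputation with indicator flags. It is not the library's NullImputerWithFlags: the class name SimpleNullImputerWithFlags, its constructor parameters, and the use of _isnull_flag as a column suffix are assumptions made for this example.

```python
import pandas as pd


class SimpleNullImputerWithFlags:
    """Hypothetical stand-in for illustration, not the utilsds-models class."""

    def __init__(self, columns, strategy="median", add_flags=True):
        if strategy not in {"mean", "median", "mode"}:
            raise ValueError(f"unknown strategy: {strategy}")
        self.columns = list(columns)
        self.strategy = strategy
        self.add_flags = add_flags

    def fit(self, X, y=None):
        # Learn the fill values from the training data only.
        self.fill_values_ = {}
        for col in self.columns:
            if self.strategy == "mean":
                self.fill_values_[col] = X[col].mean()
            elif self.strategy == "median":
                self.fill_values_[col] = X[col].median()
            else:
                self.fill_values_[col] = X[col].mode().iloc[0]
        return self

    def transform(self, X):
        X = X.copy()
        for col in self.columns:
            if self.add_flags:
                # Assumed naming: <column>_isnull_flag marks rows that were null.
                X[f"{col}_isnull_flag"] = X[col].isna().astype(int)
            X[col] = X[col].fillna(self.fill_values_[col])
        return X


train = pd.DataFrame({"deposit": [10.0, None, 30.0, 50.0]})
test = pd.DataFrame({"deposit": [None, 20.0]})

imputer = SimpleNullImputerWithFlags(["deposit"], strategy="median").fit(train)
result = imputer.transform(test)
print(result)
```

Because the fill value is learned in fit, the test set is imputed with the training median rather than its own statistics, which avoids leaking test information into preprocessing.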

custom_metrics

Evaluation metrics with time-based weights (days_since_ftd) and a custom LightGBM objective.

  • EvalMetric: cohort-level (aggregated by days_since_ftd) and sample-level metrics with time-decay weighting:
    • cohort_weighted_mae, cohort_weighted_mse, cohort_weighted_mape
    • sample_weighted_mae, sample_weighted_mse, sample_weighted_mape
    • create_lgb_metric: factory for LightGBM metrics with time weights
  • DaysWeightedObjective: custom LightGBM objective with time weights (modes: mae, mse, mape).
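The sketch below shows one way sample-level and cohort-level weighted MAE, and an MSE-mode objective, could be computed. It rests on assumptions not stated in this README: the weights decay exponentially in days_since_ftd, cohorts are compared by their summed values, and all function names and the decay parameter are hypothetical. The library's actual weighting scheme and signatures may differ.

```python
import numpy as np


def time_weights(days_since_ftd, decay=0.01):
    # Assumed weighting: exponential decay in days_since_ftd.
    return np.exp(-decay * np.asarray(days_since_ftd, dtype=float))


def sample_weighted_mae(y_true, y_pred, days_since_ftd, decay=0.01):
    # Weighted mean of per-sample absolute errors.
    w = time_weights(days_since_ftd, decay)
    err = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    return float(np.sum(w * err) / np.sum(w))


def cohort_weighted_mae(y_true, y_pred, days_since_ftd, decay=0.01):
    # Sum targets and predictions per days_since_ftd cohort (assumption),
    # then take the weighted mean of the per-cohort absolute errors.
    days = np.asarray(days_since_ftd)
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    cohorts = np.unique(days)
    errs = np.array(
        [abs(y_true[days == d].sum() - y_pred[days == d].sum()) for d in cohorts]
    )
    w = time_weights(cohorts, decay)
    return float(np.sum(w * errs) / np.sum(w))


def days_weighted_mse_objective(y_true, y_pred, days_since_ftd, decay=0.01):
    # Gradient and Hessian of 0.5 * w * (pred - true)^2 with respect to pred.
    w = time_weights(days_since_ftd, decay)
    grad = w * (np.asarray(y_pred, float) - np.asarray(y_true, float))
    hess = w.copy()
    return grad, hess


y_true = [1.0, 2.0, 3.0]
y_pred = [2.0, 2.0, 5.0]
days = [1, 1, 2]

# With decay=0.0 every weight is 1, so the results reduce to unweighted values.
mae_sample = sample_weighted_mae(y_true, y_pred, days, decay=0.0)
mae_cohort = cohort_weighted_mae(y_true, y_pred, days, decay=0.0)
grad, hess = days_weighted_mse_objective(y_true, y_pred, days, decay=0.0)
print(mae_sample, mae_cohort, grad, hess)
```

A custom objective for LightGBM's scikit-learn interface is a callable that returns the gradient and Hessian per sample, so a function like days_weighted_mse_objective would need the days_since_ftd values bound in advance (for example with functools.partial) before being passed as the objective.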

metrics

  • calculate_ngr_metrics: computes NGR error metrics (MAE, MAPE, ME, MPE) in both standard and business-optimal variants with weights based on days_since_ftd. Optionally applies late-stage prediction correction over the customer lifecycle.
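For reference, the four error measures named above can be sketched with their textbook definitions. This is not calculate_ngr_metrics itself: the function name ngr_error_metrics, the weights argument, and the sign convention (positive error means over-prediction) are assumptions, and the business-optimal variants and late-stage correction are not reproduced here.

```python
import numpy as np


def ngr_error_metrics(y_true, y_pred, weights=None):
    """Hypothetical helper: weighted MAE, ME, MAPE, and MPE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.ones_like(y_true) if weights is None else np.asarray(weights, dtype=float)

    err = y_pred - y_true  # assumed convention: positive = over-prediction
    nonzero = y_true != 0  # percentage errors are undefined for zero targets
    pct = err[nonzero] / np.abs(y_true[nonzero])

    return {
        "MAE": float(np.average(np.abs(err), weights=w)),
        "ME": float(np.average(err, weights=w)),
        "MAPE": float(np.average(np.abs(pct), weights=w[nonzero]) * 100),
        "MPE": float(np.average(pct, weights=w[nonzero]) * 100),
    }


metrics = ngr_error_metrics([100.0, 200.0, 50.0], [110.0, 180.0, 50.0])
print(metrics)
```

The signed metrics (ME, MPE) show systematic bias, while the absolute ones (MAE, MAPE) show error magnitude. In this example the percentage errors cancel, so MPE is zero even though MAPE is not.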

visualization

  • calculate_ngr_metrics: NGR evaluation function, equivalent to the one in the metrics module.

evip_dynamic

Classification metrics with a false-positive (FP) budget constraint.

  • recall_with_fp_cap: recall with a penalty for exceeding the FP budget in binary classification.
  • weighted_premium_recall_with_fp_cap: weighted recall for premium classes (1 and 2) with a penalty for exceeding the FPR budget among class 0 samples.
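The idea of a recall metric with an FP budget can be sketched for the binary case as follows. The penalty form used here (excess false positives divided by the number of negatives, scaled by a penalty factor) is an assumption for illustration, as are the function name and parameters; the library's recall_with_fp_cap may define the penalty differently.

```python
import numpy as np


def recall_with_fp_cap_sketch(y_true, y_pred, fp_cap, penalty=1.0):
    """Hypothetical sketch: recall minus a penalty for FPs above fp_cap."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    n_neg = int(np.sum(~y_true))

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    excess_fp = max(0, fp - fp_cap)  # only FPs beyond the budget are penalized
    return recall - penalty * excess_fp / max(n_neg, 1)


y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0]

within_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_cap=2)
over_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_cap=1)
print(within_budget, over_budget)
```

With two false positives, a budget of 2 leaves the recall of 2/3 untouched, while a budget of 1 subtracts 1/4 for the single excess false positive among four negatives.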

Dependencies

  • pandas>=2.2.2
  • numpy>=1.26.0
  • scikit-learn>=1.5.0
  • matplotlib>=3.9.0

Publishing to PyPI

uv pip install build twine
uv run python -m build
twine upload --skip-existing dist/*

Before publishing, bump the version in the [project] section of pyproject.toml.
