Solution for specific models

utilsds-models

A library of classes and functions used in DS Team modeling projects. It extends the utilsds package with components specific to selected models (data processing, NGR metrics, EVIP metrics, and a custom LightGBM objective).

Requires Python >= 3.12.

Installation

From a clone of the repository:

uv sync
source .venv/bin/activate

Or install from PyPI:

pip install utilsds-models
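
To confirm the installation, the installed distribution version and the interpreter version can be checked from Python. This is a minimal sketch using only the standard library. The helper names installed_version and python_is_supported are hypothetical and not part of utilsds-models; only the distribution name and the Python >= 3.12 requirement come from this README.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "utilsds-models") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(min_version=(3, 12)) -> bool:
    """Check the running interpreter against the minimum stated in this README."""
    return sys.version_info >= min_version


print("utilsds-models version:", installed_version())
print("Python >= 3.12:", python_is_supported())
```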

Modules

data_processing

Scikit-learn-compatible transformer classes, plus a helper function for combining test results.

  • ColumnCopyImputer: fills missing values by copying from other columns.
  • NullImputerWithFlags: imputes nulls using the mean, median, or mode strategy, optionally adding indicator flags (_isnull_flag).
  • MaxMultiplierImputer: imputes with max * multiplier, where the maximum is taken from the training data, optionally adding flags.
  • SportsHybridEncoder: encodes sports as binary features for the top N disciplines plus aggregations for the rest.
  • LabelEncoderTransformer: label encoding for categorical columns (e.g. for LightGBM).
  • DerivedFeatureCreator: creates derived features (e.g. division by 7 or 30).
  • combine_test_data: combines test features, target, predictions, and metadata into a single DataFrame.
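
The idea behind imputing with flags can be sketched as follows. This is an illustrative stand-in written for this README, not the library's NullImputerWithFlags: the class name, constructor parameters, and the exact flag column naming ("<column>_isnull_flag") are assumptions. It follows the fit/transform convention so that fill values are learned from training data only.

```python
import pandas as pd


class NullFlagImputerSketch:
    """Hypothetical illustration of null imputation with indicator flags."""

    def __init__(self, columns, strategy="median", add_flags=True):
        self.columns = list(columns)
        self.strategy = strategy
        self.add_flags = add_flags

    def fit(self, X, y=None):
        # Learn one fill value per column from the training data only.
        self.fill_values_ = {}
        for col in self.columns:
            if self.strategy == "mean":
                value = X[col].mean()
            elif self.strategy == "median":
                value = X[col].median()
            elif self.strategy == "mode":
                value = X[col].mode().iloc[0]
            else:
                raise ValueError(f"unknown strategy: {self.strategy}")
            self.fill_values_[col] = value
        return self

    def transform(self, X):
        X = X.copy()
        for col in self.columns:
            if self.add_flags:
                # Record where the value was missing before it is filled.
                X[f"{col}_isnull_flag"] = X[col].isna().astype(int)
            X[col] = X[col].fillna(self.fill_values_[col])
        return X


train = pd.DataFrame({"deposits": [10.0, None, 30.0, 50.0]})
test = pd.DataFrame({"deposits": [None, 20.0]})

imputer = NullFlagImputerSketch(["deposits"], strategy="median").fit(train)
result = imputer.transform(test)
print(result)
```

Keeping the flag column lets a downstream model distinguish an imputed value from an observed one.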

custom_metrics

Evaluation metrics with time-based weights (days_since_ftd) and a custom LightGBM objective.

  • EvalMetric: cohort-level (aggregated by days_since_ftd) and sample-level metrics with time-decay weighting:
    • cohort_weighted_mae, cohort_weighted_mse, cohort_weighted_mape
    • sample_weighted_mae, sample_weighted_mse, sample_weighted_mape
    • create_lgb_metric: factory for LightGBM metrics with time weights
  • DaysWeightedObjective: custom LightGBM objective with time weights (modes: mae, mse, mape).
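
The sketch below illustrates what time-weighted metrics and a time-weighted objective might look like. It is not the library's code. The exponential decay form, the decay_rate parameter, the definition of a cohort as all rows sharing a days_since_ftd value (compared on cohort totals), and all function names are assumptions made for illustration.

```python
import numpy as np


def time_decay_weights(days_since_ftd, decay_rate=0.01):
    # Assumed form: weights shrink exponentially as days_since_ftd grows.
    days = np.asarray(days_since_ftd, dtype=float).ravel()
    return np.exp(-decay_rate * days)


def sample_weighted_mae(y_true, y_pred, days_since_ftd, decay_rate=0.01):
    # Weighted average of per-sample absolute errors.
    w = time_decay_weights(days_since_ftd, decay_rate)
    err = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    return float(np.sum(w * err) / np.sum(w))


def cohort_weighted_mae(y_true, y_pred, days_since_ftd, decay_rate=0.01):
    # Assumed cohort definition: rows with the same days_since_ftd value.
    days = np.asarray(days_since_ftd).ravel()
    cohorts, inverse = np.unique(days, return_inverse=True)
    inverse = inverse.ravel()
    true_tot = np.bincount(inverse, weights=np.asarray(y_true, dtype=float))
    pred_tot = np.bincount(inverse, weights=np.asarray(y_pred, dtype=float))
    w = time_decay_weights(cohorts, decay_rate)
    return float(np.sum(w * np.abs(pred_tot - true_tot)) / np.sum(w))


def make_weighted_mse_objective(days_since_ftd, decay_rate=0.01):
    # Per-sample loss: 0.5 * w * (pred - true)^2, so grad = w * (pred - true), hess = w.
    w = time_decay_weights(days_since_ftd, decay_rate)

    def objective(y_true, y_pred):
        diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
        return w * diff, w.copy()

    return objective


y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([2.0, 1.0, 3.0, 6.0])
days = np.array([0, 0, 1, 1])

print(sample_weighted_mae(y_true, y_pred, days))
print(cohort_weighted_mae(y_true, y_pred, days))
grad, hess = make_weighted_mse_objective(days)(y_true, y_pred)
```

LightGBM's scikit-learn estimators accept a callable objective with the signature (y_true, y_pred) returning (grad, hess). Because that signature has no slot for days_since_ftd, the sketch captures the weights in a closure.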

metrics

  • calculate_ngr_metrics: computes NGR error metrics (MAE, MAPE, ME, MPE) in both standard and business-optimal variants with weights based on days_since_ftd. Optionally applies late-stage prediction correction over the customer lifecycle.
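
For reference, the four standard error metrics named above can be written as weighted averages. This is a generic sketch, not the signature or behavior of calculate_ngr_metrics: the function name, the sign convention (positive error means over-prediction), and the eps guard against zero targets are assumptions, and the business-optimal variants and late-stage correction are not reproduced here.

```python
import numpy as np


def error_metrics_sketch(y_true, y_pred, weights=None, eps=1e-9):
    """Weighted MAE, MAPE, ME, and MPE (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.ones_like(y_true) if weights is None else np.asarray(weights, dtype=float)

    err = y_pred - y_true                    # signed error, positive = over-prediction
    denom = np.maximum(np.abs(y_true), eps)  # avoid division by zero

    def wavg(values):
        return float(np.sum(w * values) / np.sum(w))

    return {
        "MAE": wavg(np.abs(err)),
        "MAPE": wavg(np.abs(err) / denom),
        "ME": wavg(err),
        "MPE": wavg(err / denom),
    }


metrics = error_metrics_sketch([100.0, 200.0], [110.0, 180.0])
print(metrics)
```

The signed metrics (ME, MPE) show bias direction, while the absolute ones (MAE, MAPE) show error magnitude. Here the two percentage errors cancel in MPE but not in MAPE.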

visualization

  • calculate_ngr_metrics: NGR evaluation function, equivalent to the one in the metrics module.

evip_dynamic

Classification metrics with a false-positive (FP) budget constraint.

  • recall_with_fp_cap: recall with a penalty for exceeding the FP budget in binary classification.
  • weighted_premium_recall_with_fp_cap: weighted recall for premium classes (1 and 2) with a penalty for exceeding the FPR budget among class 0 samples.
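
A recall metric with an FP budget could be sketched as below for the binary case. This is not the library's recall_with_fp_cap: the function name, the parameters, and the penalty form (linear in the number of false positives above the budget, scaled by the number of negatives) are assumptions chosen for illustration.

```python
import numpy as np


def recall_with_fp_cap_sketch(y_true, y_pred, fp_budget, penalty=1.0):
    """Recall minus a penalty for false positives above fp_budget (hypothetical)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    n_neg = int(np.sum(~y_true))

    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Only false positives beyond the budget are penalized.
    excess = max(0, fp - fp_budget)
    penalty_term = penalty * excess / n_neg if n_neg else 0.0
    return recall - penalty_term


y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0]

within_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_budget=2)
over_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_budget=1)
print(within_budget, over_budget)
```

With two false positives, a budget of 2 leaves recall unchanged, while a budget of 1 subtracts a penalty for the one excess false positive.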

Dependencies

  • pandas>=2.2.2
  • numpy>=1.26.0
  • scikit-learn>=1.5.0
  • matplotlib>=3.9.0

Publishing to PyPI

uv pip install build twine
uv run python -m build
twine upload --skip-existing dist/*

Before publishing, bump the version in the [project] section of pyproject.toml.
