Solution for specific models

utilsds-models

A library of classes and functions used in DS Team modeling projects. It extends the utilsds package with components specific to selected models (data processing, NGR metrics, EVIP metrics, and a custom LightGBM objective).

Requires Python >= 3.12.

Installation

uv sync
source .venv/bin/activate

Or install from PyPI:

pip install utilsds-models

Modules

data_processing

Scikit-learn-compatible transformer classes, plus a helper function for combining test results.

  • ColumnCopyImputer: fills missing values by copying from other columns.
  • NullImputerWithFlags: imputes nulls using a mean, median, or mode strategy, with optional indicator flags (_isnull_flag).
  • MaxMultiplierImputer: imputes with the training-data maximum scaled by a multiplier (max * multiplier), with optional flags.
  • SportsHybridEncoder: encodes sports as binary features for the top N disciplines plus aggregations for the rest.
  • LabelEncoderTransformer: label encoding for categorical columns (e.g. for LightGBM).
  • DerivedFeatureCreator: creates derived features (e.g. division by 7 or 30).
  • combine_test_data: combines test features, target, predictions, and metadata into a single DataFrame.
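
To illustrate the impute-with-flags idea described above, here is a minimal sketch. It is not the package's implementation: the class name SimpleNullImputerWithFlags, its parameters, the deposits column, and the use of _isnull_flag as a column suffix are all assumptions made for illustration. It follows the fit/transform convention but does not inherit from scikit-learn base classes, so it only needs pandas.

```python
import pandas as pd


class SimpleNullImputerWithFlags:
    """Hypothetical sketch: learn fill values in fit(), apply them in transform()."""

    def __init__(self, columns, strategy="median", add_flags=True):
        self.columns = columns
        self.strategy = strategy
        self.add_flags = add_flags

    def fit(self, X, y=None):
        # Fill values are learned from the training data only, to avoid leakage.
        self.fill_values_ = {}
        for col in self.columns:
            if self.strategy == "mean":
                self.fill_values_[col] = X[col].mean()
            elif self.strategy == "median":
                self.fill_values_[col] = X[col].median()
            elif self.strategy == "mode":
                self.fill_values_[col] = X[col].mode().iloc[0]
            else:
                raise ValueError(f"unknown strategy: {self.strategy}")
        return self

    def transform(self, X):
        X = X.copy()  # leave the caller's DataFrame untouched
        for col in self.columns:
            if self.add_flags:
                # Record where the value was missing before it is overwritten.
                X[f"{col}_isnull_flag"] = X[col].isna().astype(int)
            X[col] = X[col].fillna(self.fill_values_[col])
        return X


# Hypothetical data: the training median of [10, 30, 50] is 30.
train = pd.DataFrame({"deposits": [10.0, None, 30.0, 50.0]})
test = pd.DataFrame({"deposits": [None, 5.0]})

imputer = SimpleNullImputerWithFlags(columns=["deposits"], strategy="median").fit(train)
result = imputer.transform(test)
print(result)
```

The missing test value is filled with the training median, and the flag column keeps the information that it was originally null, which tree models such as LightGBM can use as a feature.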

custom_metrics

Evaluation metrics with time-based weights (days_since_ftd) and a custom LightGBM objective.

  • EvalMetric: cohort-level (aggregated by days_since_ftd) and sample-level metrics with time-decay weighting:
    • cohort_weighted_mae, cohort_weighted_mse, cohort_weighted_mape
    • sample_weighted_mae, sample_weighted_mse, sample_weighted_mape
    • create_lgb_metric: factory for LightGBM metrics with time weights
  • DaysWeightedObjective: custom LightGBM objective with time weights (modes: mae, mse, mape).
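
The difference between sample-level and cohort-level metrics, and the shape of a weighted objective, can be sketched as follows. This is an illustration under stated assumptions, not the EvalMetric or DaysWeightedObjective code: the exponential decay form, the decay parameter, the cohort aggregation (summed target versus summed prediction per days_since_ftd value), and all function names here are hypothetical. The only LightGBM-specific fact relied on is that a custom objective supplies a per-sample gradient and Hessian.

```python
import numpy as np


def time_decay_weights(days_since_ftd, decay=0.01):
    """Assumed weighting: exponential decay in days_since_ftd (hypothetical form)."""
    return np.exp(-decay * np.asarray(days_since_ftd, dtype=float))


def sample_weighted_mae(y_true, y_pred, days_since_ftd, decay=0.01):
    """Sample-level MAE: every row contributes with its own time weight."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    w = time_decay_weights(days_since_ftd, decay)
    return float(np.sum(w * np.abs(y_true - y_pred)) / np.sum(w))


def cohort_weighted_mae(y_true, y_pred, days_since_ftd, decay=0.01):
    """Cohort-level MAE: aggregate per days_since_ftd value, then weight the cohorts."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    days = np.asarray(days_since_ftd)
    cohorts = np.unique(days)
    # Assumed aggregation: absolute error between summed target and summed prediction.
    errors = np.array(
        [abs(y_true[days == d].sum() - y_pred[days == d].sum()) for d in cohorts]
    )
    w = time_decay_weights(cohorts, decay)
    return float(np.sum(w * errors) / np.sum(w))


def weighted_mse_grad_hess(y_true, y_pred, days_since_ftd, decay=0.01):
    """Gradient and Hessian of 0.5 * w * (pred - true)^2 with respect to pred."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    w = time_decay_weights(days_since_ftd, decay)
    return w * (y_pred - y_true), w


# Hypothetical data; decay=0.0 gives uniform weights so the numbers are easy to check.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 36.0]
days = [7, 7, 30, 30]

sample_mae = sample_weighted_mae(y_true, y_pred, days, decay=0.0)  # 2.75
cohort_mae = cohort_weighted_mae(y_true, y_pred, days, decay=0.0)  # 0.5
grad, hess = weighted_mse_grad_hess(y_true, y_pred, days, decay=0.0)
print(sample_mae, cohort_mae, grad, hess)
```

In this example the per-row errors partly cancel inside each cohort, so the cohort-level value is much lower than the sample-level one. With a positive decay, rows or cohorts with larger days_since_ftd would contribute less under the assumed weighting.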

metrics

  • calculate_ngr_metrics: computes NGR error metrics (MAE, MAPE, ME, MPE) in both standard and business-optimal variants with weights based on days_since_ftd. Optionally applies late-stage prediction correction over the customer lifecycle.
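
For reference, the four standard error metrics named above can be sketched as below. This is not calculate_ngr_metrics itself: the function name error_metrics, the sign convention (prediction minus target), the handling of zero targets, and the optional weights argument are assumptions, and the business-optimal variants and late-stage correction are not reproduced here.

```python
import numpy as np


def error_metrics(y_true, y_pred, weights=None):
    """Hypothetical sketch of MAE, ME, MAPE and MPE with optional sample weights."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.ones_like(y_true) if weights is None else np.asarray(weights, dtype=float)

    err = y_pred - y_true  # assumed sign convention: positive means over-prediction
    nonzero = y_true != 0  # percentage errors are undefined where the target is zero
    pct = err[nonzero] / y_true[nonzero]
    w_pct = w[nonzero]

    return {
        "MAE": float(np.average(np.abs(err), weights=w)),
        "ME": float(np.average(err, weights=w)),
        "MAPE": float(np.average(np.abs(pct), weights=w_pct) * 100),
        "MPE": float(np.average(pct, weights=w_pct) * 100),
    }


# Hypothetical values: one over-prediction, one under-prediction, one exact hit.
metrics = error_metrics([100.0, 200.0, 50.0], [110.0, 180.0, 50.0])
print(metrics)
```

ME and MPE keep the sign of the error, so they expose systematic over- or under-prediction that MAE and MAPE hide. Passing weights derived from days_since_ftd would turn these into time-weighted variants; a target vector that is entirely zero would need separate handling.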

visualization

  • calculate_ngr_metrics: NGR evaluation function (equivalent to the function of the same name in the metrics module).

evip_dynamic

Classification metrics with a false-positive (FP) budget constraint.

  • recall_with_fp_cap: recall with a penalty for exceeding the FP budget in binary classification.
  • weighted_premium_recall_with_fp_cap: weighted recall for premium classes (1 and 2) with a penalty for exceeding the FPR budget among class 0 samples.
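
One way to combine recall with a false-positive budget is sketched below for the binary case. The penalty form is an assumption, since the description above does not specify it: here the share of negatives that exceed the budget is subtracted from recall and the result is floored at zero. The function name recall_with_fp_cap_sketch and the parameters fp_budget and penalty are hypothetical and may not match the package's signature.

```python
import numpy as np


def recall_with_fp_cap_sketch(y_true, y_pred, fp_budget, penalty=1.0):
    """Hypothetical sketch: recall minus a penalty for false positives above a budget."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    n_neg = int(np.sum(~y_true))

    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Assumed penalty: excess false positives as a share of all negatives.
    excess = max(0, fp - fp_budget)
    excess_rate = excess / max(1, n_neg)
    return max(0.0, recall - penalty * excess_rate)


# Hypothetical labels: 3 positives, 5 negatives; 2 true positives, 2 false positives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

within_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_budget=2)
over_budget = recall_with_fp_cap_sketch(y_true, y_pred, fp_budget=1)
print(within_budget, over_budget)
```

With a budget of 2 the score equals plain recall (2/3). With a budget of 1, one false positive is over budget, so 1/5 is subtracted under the assumed penalty.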

Dependencies

  • pandas>=2.2.2
  • numpy>=1.26.0
  • scikit-learn>=1.5.0
  • matplotlib>=3.9.0

Publishing to PyPI

uv pip install build twine
uv run python -m build
twine upload --skip-existing dist/*

Before publishing, bump the version in pyproject.toml (section [project]).

