Solution for specific models

utilsds-models

A library of classes and functions used in DS Team modeling projects. It extends the utilsds package with components specific to selected models (data processing, NGR metrics, EVIP metrics, and a custom LightGBM objective).

Requires Python >= 3.12.

Installation

uv sync
source .venv/bin/activate

Or from PyPI (after publication):

pip install utilsds-models

Modules

data_processing

Scikit-learn-compatible preprocessing classes, plus a helper function for combining test results.

  • ColumnCopyImputer: fills missing values by copying from other columns.
  • NullImputerWithFlags: imputes nulls using a mean, median, or mode strategy, with optional null-indicator flags (_isnull_flag).
  • MaxMultiplierImputer: imputes with the training-data maximum times a multiplier (max * multiplier), with optional flags.
  • SportsHybridEncoder: encodes sports as binary features for the top N disciplines plus aggregations for the rest.
  • LabelEncoderTransformer: label encoding for categorical columns (e.g. for LightGBM).
  • DerivedFeatureCreator: creates derived features (e.g. division by 7 or 30).
  • combine_test_data: combines test features, target, predictions, and metadata into a single DataFrame.
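To make the "impute with flags" idea concrete, here is a minimal sketch of a scikit-learn-style transformer that fills nulls with the training median and adds an indicator column. It is not the utilsds-models implementation: the class name MedianImputerWithFlags, its constructor parameters, and the use of _isnull_flag as a column suffix are assumptions made for illustration.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class MedianImputerWithFlags(TransformerMixin, BaseEstimator):
    """Hypothetical example class; not part of utilsds-models."""

    def __init__(self, columns, add_flags=True):
        self.columns = columns
        self.add_flags = add_flags

    def fit(self, X, y=None):
        # Learn fill values from the training data only.
        self.fill_values_ = {col: X[col].median() for col in self.columns}
        return self

    def transform(self, X):
        X = X.copy()
        for col in self.columns:
            if self.add_flags:
                # Assumed naming: "<column>_isnull_flag" marks imputed rows.
                X[f"{col}_isnull_flag"] = X[col].isna().astype(int)
            X[col] = X[col].fillna(self.fill_values_[col])
        return X


train = pd.DataFrame({"deposit": [10.0, None, 30.0]})
test = pd.DataFrame({"deposit": [None, 5.0]})

imputer = MedianImputerWithFlags(columns=["deposit"]).fit(train)
result = imputer.transform(test)
print(result)
```

Fitting on the training frame and transforming the test frame separately keeps the fill value free of test-set information, which is the usual reason for wrapping imputation in a fit/transform class.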

custom_metrics

Evaluation metrics with time-based weights (days_since_ftd) and a custom LightGBM objective.

  • EvalMetric: cohort-level (aggregated by days_since_ftd) and sample-level metrics with time-decay weighting:
    • cohort_weighted_mae, cohort_weighted_mse, cohort_weighted_mape
    • sample_weighted_mae, sample_weighted_mse, sample_weighted_mape
    • create_lgb_metric: factory for LightGBM metrics with time weights
  • DaysWeightedObjective: custom LightGBM objective with time weights (modes: mae, mse, mape).
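The difference between sample-level and cohort-level weighting, and the shape of a weighted objective, can be sketched with NumPy alone. Everything below is an assumption for illustration: the exponential half-life weighting, the choice to sum values per days_since_ftd cohort before taking the error, and all function names and signatures. The library's actual formulas may differ.

```python
import numpy as np


def time_decay_weights(days_since_ftd, half_life=30.0):
    # Assumed scheme: the weight halves every `half_life` days.
    days = np.asarray(days_since_ftd, dtype=float)
    return 0.5 ** (days / half_life)


def sample_level_mae(y_true, y_pred, days_since_ftd, half_life=30.0):
    # Each sample's absolute error is weighted by its own time weight.
    w = time_decay_weights(days_since_ftd, half_life)
    err = np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))
    return float(np.sum(w * err) / np.sum(w))


def cohort_level_mae(y_true, y_pred, days_since_ftd, half_life=30.0):
    # Assumed aggregation: sum true and predicted values per cohort
    # (one cohort per distinct days_since_ftd), then weight cohort errors.
    cohorts, idx = np.unique(np.asarray(days_since_ftd), return_inverse=True)
    true_tot = np.bincount(idx, weights=np.asarray(y_true, float))
    pred_tot = np.bincount(idx, weights=np.asarray(y_pred, float))
    w = time_decay_weights(cohorts, half_life)
    return float(np.sum(w * np.abs(true_tot - pred_tot)) / np.sum(w))


def weighted_mse_grad_hess(y_true, y_pred, days_since_ftd, half_life=30.0):
    # Gradient and Hessian of 0.5 * w * (y_pred - y_true)**2 w.r.t. y_pred.
    w = time_decay_weights(days_since_ftd, half_life)
    grad = w * (np.asarray(y_pred, float) - np.asarray(y_true, float))
    hess = w
    return grad, hess


y_true = np.array([100.0, 200.0, 50.0])
y_pred = np.array([110.0, 170.0, 70.0])
days = np.array([0, 30, 30])

sample_mae = sample_level_mae(y_true, y_pred, days)
cohort_mae = cohort_level_mae(y_true, y_pred, days)
grad, hess = weighted_mse_grad_hess(y_true, y_pred, days)
print(sample_mae, cohort_mae, grad, hess)
```

In this toy data the two day-30 errors have opposite signs, so they partly cancel in the cohort-level figure but not in the sample-level one. A function returning (grad, hess) in this form is the shape LightGBM expects from a custom objective; the day values would need to be bound in first, for example with a closure or functools.partial.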

metrics

  • calculate_ngr_metrics: computes NGR error metrics (MAE, MAPE, ME, MPE) in both standard and business-optimal variants with weights based on days_since_ftd. Optionally applies late-stage prediction correction over the customer lifecycle.
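As a reference for the four error metrics named above, the following sketch computes MAE, MAPE, ME, and MPE with optional sample weights. It is not the calculate_ngr_metrics implementation: the function name error_metrics, the sign convention (prediction minus actual), and the return format are assumptions, and the business-optimal variants and late-stage correction are not covered.

```python
import numpy as np


def error_metrics(y_true, y_pred, weights=None):
    """Hypothetical helper: weighted MAE, MAPE, ME, and MPE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true      # assumed sign: positive means over-prediction
    rel = err / y_true         # relative error; requires non-zero actuals
    return {
        "mae": float(np.average(np.abs(err), weights=weights)),
        "mape": float(np.average(np.abs(rel), weights=weights)),
        "me": float(np.average(err, weights=weights)),
        "mpe": float(np.average(rel, weights=weights)),
    }


metrics = error_metrics(y_true=[100.0, 200.0], y_pred=[110.0, 180.0])
print(metrics)
```

MAE and MAPE measure the size of the error, while ME and MPE keep the sign and therefore expose systematic over- or under-prediction. In the example the two relative errors are +10% and -10%, so MPE is zero even though MAPE is 10%.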

visualization

  • calculate_ngr_metrics: NGR evaluation function (equivalent to the metrics module).

evip_dynamic

Classification metrics with a false-positive (FP) budget constraint.

  • recall_with_fp_cap: recall with a penalty for exceeding the FP budget in binary classification.
  • weighted_premium_recall_with_fp_cap: weighted recall for premium classes (1 and 2) with a penalty for exceeding the FPR budget among class 0 samples.
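One way to read "recall with a penalty for exceeding the FP budget" is sketched below for the binary case. The penalty form (recall minus a multiple of the relative overshoot), the function name recall_with_fp_penalty, and its parameters are assumptions for illustration only; the package may define the penalty differently.

```python
import numpy as np


def recall_with_fp_penalty(y_true, y_pred, fp_cap, penalty=1.0):
    """Hypothetical metric: recall, reduced when false positives exceed fp_cap."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))

    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # Assumed penalty: proportional to how far FP overshoots the budget.
    excess = max(0, fp - fp_cap)
    return recall - penalty * excess / max(fp_cap, 1)


y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 1, 0]  # 2 TP, 1 FN, 3 FP

within_budget = recall_with_fp_penalty(y_true, y_pred, fp_cap=3)
over_budget = recall_with_fp_penalty(y_true, y_pred, fp_cap=2)
print(within_budget, over_budget)
```

With a budget of 3 false positives the score equals plain recall (2/3). Tightening the budget to 2 leaves one false positive over the cap, and the score drops by 0.5 under the assumed penalty.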

Dependencies

  • pandas>=2.2.2
  • numpy>=1.26.0
  • scikit-learn>=1.5.0
  • matplotlib>=3.9.0

Publishing to PyPI

uv pip install build twine
uv run python -m build
twine upload --skip-existing dist/*

Before publishing, bump the version in the [project] section of pyproject.toml.
