Solution for specific models
utilsds-models
A library of classes and functions used in DS Team modeling projects. It extends the utilsds package with components specific to selected models (data processing, NGR metrics, EVIP metrics, and a custom LightGBM objective).
Requires Python >= 3.12.
Installation
uv sync
source .venv/bin/activate
Or from PyPI (after publication):
pip install utilsds-models
Modules
data_processing
Scikit-learn-compatible transformer classes, plus helper functions for combining test results.
- ColumnCopyImputer: fills missing values by copying from other columns.
- NullImputerWithFlags: imputes nulls with optional flags (_isnull_flag) and strategies mean, median, mode.
- MaxMultiplierImputer: imputes using max * multiplier from training data, with optional flags.
- SportsHybridEncoder: encodes sports as binary features for the top N disciplines plus aggregations for the rest.
- LabelEncoderTransformer: label encoding for categorical columns (e.g. for LightGBM).
- DerivedFeatureCreator: creates derived features (e.g. division by 7 or 30).
- combine_test_data: combines test features, target, predictions, and metadata into a single DataFrame.
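To illustrate the fit/transform pattern these classes follow, here is a minimal sketch of a max-times-multiplier imputer. It is not the library's implementation: the class name MaxTimesMultiplierImputer, its constructor parameters, and the reuse of the _isnull_flag suffix for this imputer are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class MaxTimesMultiplierImputer(BaseEstimator, TransformerMixin):
    """Hypothetical sketch: fill nulls with (training max * multiplier)."""

    def __init__(self, columns, multiplier=2.0, add_flags=True):
        self.columns = columns
        self.multiplier = multiplier
        self.add_flags = add_flags

    def fit(self, X, y=None):
        # Learn the fill value per column from the training data only,
        # so nothing from the test set leaks into the imputation.
        self.fill_values_ = {
            col: X[col].max() * self.multiplier for col in self.columns
        }
        return self

    def transform(self, X):
        X = X.copy()
        for col in self.columns:
            if self.add_flags:
                # Record where the value was missing before imputing it.
                X[f"{col}_isnull_flag"] = X[col].isna().astype(int)
            X[col] = X[col].fillna(self.fill_values_[col])
        return X


# Toy data: the training maximum is 40.0, so nulls are filled with 80.0.
train = pd.DataFrame({"deposit": [10.0, 40.0, np.nan]})
test = pd.DataFrame({"deposit": [np.nan, 5.0]})

imputer = MaxTimesMultiplierImputer(columns=["deposit"], multiplier=2.0)
imputer.fit(train)
result = imputer.transform(test)
print(result)
```

Because the fill value is learned in fit and only applied in transform, a transformer of this shape can be placed inside a scikit-learn Pipeline.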
custom_metrics
Evaluation metrics with time-based weights (days_since_ftd) and a custom LightGBM objective.
- EvalMetric: cohort-level (aggregated by days_since_ftd) and sample-level metrics with time-decay weighting:
  - cohort_weighted_mae, cohort_weighted_mse, cohort_weighted_mape
  - sample_weighted_mae, sample_weighted_mse, sample_weighted_mape
  - create_lgb_metric: factory for LightGBM metrics with time weights
- DaysWeightedObjective: custom LightGBM objective with time weights (modes: mae, mse, mape).
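The difference between sample-level and cohort-level weighting, and the shape of a time-weighted objective, can be sketched with plain NumPy. Everything below is an assumption for illustration: the half-life decay scheme, the use of cohort totals as the aggregation, and all function names. The package's actual weights and signatures may differ.

```python
import numpy as np


def time_decay_weights(days_since_ftd, half_life=30.0):
    """Assumed scheme: a weight halves every `half_life` days."""
    days = np.asarray(days_since_ftd, dtype=float)
    return 0.5 ** (days / half_life)


def sample_weighted_mae_sketch(y_true, y_pred, days_since_ftd):
    """Weighted mean of per-sample absolute errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = time_decay_weights(days_since_ftd)
    return float(np.sum(w * np.abs(y_pred - y_true)) / np.sum(w))


def cohort_weighted_mae_sketch(y_true, y_pred, days_since_ftd):
    """Sum each days_since_ftd cohort first, then weight the cohort errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    days = np.asarray(days_since_ftd)
    cohorts = np.unique(days)
    errors = np.array(
        [abs(y_pred[days == d].sum() - y_true[days == d].sum()) for d in cohorts]
    )
    w = time_decay_weights(cohorts)
    return float(np.sum(w * errors) / np.sum(w))


def weighted_mse_grad_hess(y_true, y_pred, days_since_ftd):
    """Gradient and Hessian of 0.5 * w * (pred - true)**2 w.r.t. the prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = time_decay_weights(days_since_ftd)
    return w * (y_pred - y_true), w


y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 30.0, 50.0]
days = [0, 0, 30, 30]

sample_mae = sample_weighted_mae_sketch(y_true, y_pred, days)
cohort_mae = cohort_weighted_mae_sketch(y_true, y_pred, days)
grad, hess = weighted_mse_grad_hess(y_true, y_pred, days)
print(sample_mae, cohort_mae, grad, hess)
```

In the day-0 cohort the two errors cancel, so the cohort-level value only reflects the day-30 cohort, while the sample-level value counts every individual error. A LightGBM custom objective returns a gradient and Hessian pair like the one above; the days_since_ftd values would have to be bound beforehand, for example in a closure or a class instance.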
metrics
- calculate_ngr_metrics: computes NGR error metrics (MAE, MAPE, ME, MPE) in both standard and business-optimal variants, with weights based on days_since_ftd. Optionally applies late-stage prediction correction over the customer lifecycle.
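For reference, the four error metrics named above can be written as follows. This is a generic sketch, not calculate_ngr_metrics itself: the function name error_metrics_sketch, the sign convention (prediction minus actual), and the optional weights argument are assumptions, and the business-optimal variants and late-stage correction are not modeled.

```python
import numpy as np


def error_metrics_sketch(y_true, y_pred, weights=None):
    """Return MAE, MAPE, ME, and MPE, optionally weighted per sample."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if weights is None:
        w = np.ones_like(y_true)
    else:
        w = np.asarray(weights, dtype=float)

    err = y_pred - y_true  # assumed convention: positive means over-prediction
    pct = err / y_true     # assumes no actual value is zero

    def avg(values):
        return float(np.average(values, weights=w))

    return {
        "MAE": avg(np.abs(err)),          # average size of the error
        "MAPE": avg(np.abs(pct)) * 100,   # average size, in percent
        "ME": avg(err),                   # signed error, shows bias
        "MPE": avg(pct) * 100,            # signed error, in percent
    }


plain = error_metrics_sketch([100.0, 200.0], [110.0, 180.0])
weighted = error_metrics_sketch([100.0, 200.0], [110.0, 180.0], weights=[3.0, 1.0])
print(plain)
print(weighted)
```

MAE and MAPE measure error magnitude, while ME and MPE keep the sign and so reveal systematic over- or under-prediction. In the unweighted call the two percentage errors cancel and MPE is zero even though MAPE is not.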
visualization
- calculate_ngr_metrics: NGR evaluation function (equivalent to the one in the metrics module).
evip_dynamic
Classification metrics with a false-positive (FP) budget constraint.
- recall_with_fp_cap: recall with a penalty for exceeding the FP budget in binary classification.
- weighted_premium_recall_with_fp_cap: weighted recall for premium classes (1 and 2) with a penalty for exceeding the FPR budget among class 0 samples.
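One way to express recall under an FP budget is sketched below for the binary case. The penalty form (linear in the excess false positives, scaled by the number of negatives) and the names recall_with_fp_budget_sketch, fp_budget, and penalty are assumptions for this example; the package may penalize differently.

```python
import numpy as np


def recall_with_fp_budget_sketch(y_true, y_pred, fp_budget, penalty=1.0):
    """Recall, reduced when false positives exceed `fp_budget` (assumed form)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)

    tp = int(np.sum(y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    n_neg = int(np.sum(~y_true))

    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0

    # Only false positives beyond the budget are penalized.
    excess = max(0, fp - fp_budget)
    penalty_term = penalty * excess / n_neg if n_neg > 0 else 0.0

    # Clip at zero so the score stays in [0, 1].
    return max(0.0, recall - penalty_term)


y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0]  # 2 TP, 1 FN, 2 FP

within_budget = recall_with_fp_budget_sketch(y_true, y_pred, fp_budget=2)
over_budget = recall_with_fp_budget_sketch(y_true, y_pred, fp_budget=1)
print(within_budget, over_budget)
```

With a budget of two false positives the score equals plain recall; tightening the budget to one lowers the score for the same predictions, which pushes threshold selection away from settings that overspend the budget.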
Dependencies
- pandas>=2.2.2
- numpy>=1.26.0
- scikit-learn>=1.5.0
- matplotlib>=3.9.0
Publishing to PyPI
uv pip install build twine
uv run python -m build
twine upload --skip-existing dist/*
Before publishing, bump the version in pyproject.toml (section [project]).
File details
Details for the file utilsds_models-0.0.5.tar.gz.
File metadata
- Download URL: utilsds_models-0.0.5.tar.gz
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4be86197bbcbf9b4b45e4b7d3b95f24dda0d3437d0ab1287963da53126804f3f |
| MD5 | 257b49dd840481f12ef8c3f1c736c472 |
| BLAKE2b-256 | 9e36e609b1a281144abb07d1766968b71bcf034a418a5bb37bbcecf114e2e951 |
File details
Details for the file utilsds_models-0.0.5-py3-none-any.whl.
File metadata
- Download URL: utilsds_models-0.0.5-py3-none-any.whl
- Upload date:
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d53db09324b557ee3481076923133c340626dfce8cb6719d24f9b8bd648b95c0 |
| MD5 | 52a9eeec8042f1cc1944f7ef9e39201b |
| BLAKE2b-256 | 229e4772924ca8464e487ae3ba33f782d0eef45a417fe82b7c5ea401dfd38af2 |