Value-driven and cost-sensitive tools for scikit-learn

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

shrahman

Project description

Empulse

Empulse is a package that enables value-driven and cost-sensitive analysis in Python. It implements popular value-driven and cost-sensitive metrics and algorithms in accordance with scikit-learn conventions, so they integrate seamlessly into existing ML workflows.

Installation

Empulse requires Python 3.10 or higher.

Install Empulse via pip:

pip install empulse

Documentation

You can find the documentation here.

Features

Ready to use out of the box with scikit-learn
Use case specific profit and cost metrics
Flexible profit-driven and cost-sensitive models
Easy passing of instance-dependent costs
Cost-aware resampling and relabeling
Find the optimal decision threshold
Easy access to real-world datasets for benchmarking

Take the tour

Ready to use out of the box with scikit-learn

All components of the package are designed to work seamlessly with scikit-learn.

Models are implemented as scikit-learn estimators and can be used anywhere a scikit-learn estimator can be used.

Pipelines

from empulse.models import CSLogitClassifier
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification()
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", CSLogitClassifier())
])
pipeline.fit(X, y, model__fp_cost=2, model__fn_cost=1)

Cross-validation

from sklearn.model_selection import cross_val_score

cross_val_score(
    pipeline, 
    X, 
    y, 
    scoring="roc_auc", 
    params={"model__fp_cost": 2, "model__fn_cost": 1}
)

Grid search

from sklearn.model_selection import GridSearchCV

param_grid = {"model__C": [0.1, 1, 10]}
grid_search = GridSearchCV(pipeline, param_grid, scoring="roc_auc")
grid_search.fit(X, y, model__fp_cost=2, model__fn_cost=1)

All metrics can easily be converted to scikit-learn scorers and used in the same way as any other scikit-learn scorer.

from empulse.metrics import expected_cost_loss
from sklearn.metrics import make_scorer

scorer = make_scorer(
    expected_cost_loss, 
    response_method="predict_proba", 
    greater_is_better=False,
    fp_cost=2,
    fn_cost=1
)
cross_val_score(pipeline, X, y, scoring=scorer)

Use case specific profit and cost metrics

Empulse offers a wide range of profit and cost metrics that are tailored to specific use cases such as:

customer churn,
customer acquisition,
credit scoring,
and fraud detection (coming soon).

For other use cases, the package provides generic implementations of:

the Metric class to define your own custom metrics,
the cost loss,
the expected cost loss,
the expected log cost loss,
the savings score,
the expected savings score,
and the maximum profit score.
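
As a rough illustration of what the generic cost metrics measure, the sketch below computes an expected cost and a savings score in plain Python. It assumes the common textbook definitions: the average expected misclassification cost, and the savings relative to the cheaper of the two trivial models (predict everything negative, or predict everything positive). The function names `expected_cost` and `savings` are hypothetical, and Empulse's own implementations may differ in details such as normalization.

```python
def expected_cost(y_true, y_proba, tp_cost=0.0, fp_cost=0.0, tn_cost=0.0, fn_cost=0.0):
    """Average expected cost, where y_proba is the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, y_proba):
        if y == 1:
            # A true positive happens with probability p, a false negative with 1 - p.
            total += p * tp_cost + (1 - p) * fn_cost
        else:
            # A false positive happens with probability p, a true negative with 1 - p.
            total += p * fp_cost + (1 - p) * tn_cost
    return total / len(y_true)


def savings(y_true, y_proba, **costs):
    """Fraction of the best trivial model's cost that the predictions save."""
    n = len(y_true)
    all_negative = expected_cost(y_true, [0.0] * n, **costs)
    all_positive = expected_cost(y_true, [1.0] * n, **costs)
    baseline = min(all_negative, all_positive)  # assumed to be non-zero here
    return 1 - expected_cost(y_true, y_proba, **costs) / baseline


y_true = [1, 0, 0, 1]
y_proba = [0.9, 0.2, 0.4, 0.6]
cost = expected_cost(y_true, y_proba, fp_cost=2, fn_cost=1)
score = savings(y_true, y_proba, fp_cost=2, fn_cost=1)
```

A savings score of 0 means the predictions are no better than the cheapest trivial model, and 1 means they incur no cost at all.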

Flexible profit-driven and cost-sensitive models

Empulse provides a range of profit-driven and cost-sensitive models, such as the CSLogitClassifier and the CSThresholdClassifier.

Each classifier tries to balance ease of use through good defaults and flexibility through a wide range of parameters.

For instance, the CSLogitClassifier allows you to change the loss function and the optimization method:

import numpy as np
from empulse.models import CSLogitClassifier
from empulse.metrics import expected_savings_score
from scipy.optimize import minimize, OptimizeResult

def optimize(objective, X, **kwargs) -> OptimizeResult:
    # Start from an all-zero weight vector with one entry per column of X.
    initial_guess = np.zeros(X.shape[1])
    result = minimize(
        lambda x: -objective(x),  # negate the objective so that minimizing it maximizes the score
        initial_guess,
        method='BFGS',
        **kwargs
    )
    return result


model = CSLogitClassifier(loss=expected_savings_score, optimize_fn=optimize)

Easy passing of instance-dependent costs

Instance-dependent costs can easily be passed to the models through metadata routing.

For example, to pass instance-dependent costs to each fold of a cross-validation, request the costs with the set_fit_request method of the model and the set_score_request method of the scorer.

import numpy as np
from empulse.models import CSLogitClassifier
from empulse.metrics import expected_cost_loss
from sklearn import set_config
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.metrics import make_scorer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

set_config(enable_metadata_routing=True)

X, y = make_classification()
fp_cost = np.random.rand(y.size)
fn_cost = np.random.rand(y.size)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", CSLogitClassifier().set_fit_request(fp_cost=True, fn_cost=True))
])

scorer = make_scorer(
    expected_cost_loss,
    response_method="predict_proba",
    greater_is_better=False,
).set_score_request(fp_cost=True, fn_cost=True)

cross_val_score(pipeline, X, y, scoring=scorer, params={"fp_cost": fp_cost, "fn_cost": fn_cost})
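
Conceptually, routing costs through cross-validation means that every per-instance cost array is sliced with the same indices as X and y in each fold. The sketch below shows only that bookkeeping in plain Python, without scikit-learn; the helper `split_with_costs` is hypothetical and uses simple contiguous folds for readability.

```python
def split_with_costs(n_samples, n_folds, **costs):
    """Yield (train_idx, test_idx, train_costs, test_costs) for contiguous folds."""
    indices = list(range(n_samples))
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_folds - 1 else start + fold_size
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        # Slice every cost array with the same indices as the data.
        train_costs = {name: [values[i] for i in train_idx] for name, values in costs.items()}
        test_costs = {name: [values[i] for i in test_idx] for name, values in costs.items()}
        yield train_idx, test_idx, train_costs, test_costs


fp_cost = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fn_cost = [0.5, 0.5, 1.5, 1.5, 2.5, 2.5]
folds = list(split_with_costs(6, 3, fp_cost=fp_cost, fn_cost=fn_cost))
```

In each fold, the training costs would go to the model's fit call and the test costs to the scorer, which is the work that metadata routing automates.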

Cost-aware resampling and relabeling

Empulse uses the imbalanced-learn package to provide cost-aware resampling and relabeling techniques:

from empulse.samplers import CostSensitiveSampler
from sklearn.datasets import make_classification

X, y = make_classification()
sampler = CostSensitiveSampler()
X_resampled, y_resampled = sampler.fit_resample(X, y, fp_cost=2, fn_cost=1)

They can be used in an imbalanced-learn pipeline:

import numpy as np
from empulse.samplers import CostSensitiveSampler
from imblearn.pipeline import Pipeline
from sklearn import set_config
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

set_config(enable_metadata_routing=True)

X, y = make_classification()
fp_cost = np.random.rand(y.size)
fn_cost = np.random.rand(y.size)
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("sampler", CostSensitiveSampler().set_fit_resample_request(fp_cost=True, fn_cost=True)),
    ("model", LogisticRegression())
])

pipeline.fit(X, y, fp_cost=fp_cost, fn_cost=fn_cost)
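
To give an idea of what cost-aware resampling can look like, here is a minimal sketch of cost-proportionate rejection sampling, a well-known technique in which each instance is kept with a probability proportional to the cost of misclassifying it. This is an illustration only: it is not taken from Empulse, the function name `rejection_sample` is hypothetical, and CostSensitiveSampler may use a different scheme or different defaults.

```python
import random


def rejection_sample(y, fp_cost, fn_cost, seed=0):
    """Return the indices kept by cost-proportionate rejection sampling."""
    # Misclassifying a positive costs fn_cost; misclassifying a negative costs fp_cost.
    weights = [fn if label == 1 else fp for label, fp, fn in zip(y, fp_cost, fn_cost)]
    max_weight = max(weights)
    rng = random.Random(seed)
    # Keep instance i with probability weights[i] / max_weight.
    return [i for i, w in enumerate(weights) if rng.random() < w / max_weight]


y = [1, 0, 0, 1]
fp_cost = [1, 1, 0, 1]
fn_cost = [4, 4, 4, 0]
kept = rejection_sample(y, fp_cost, fn_cost)
```

The most expensive instance is always kept and zero-cost instances are always dropped, so a standard classifier trained on the resampled data implicitly focuses on costly mistakes.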

Find the optimal decision threshold

Empulse provides the CSThresholdClassifier, which finds the decision threshold that minimizes the expected cost loss for a given cost matrix.

The meta-estimator changes the predict method of the base estimator to predict the class with the lowest expected cost.

from empulse.models import CSThresholdClassifier
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification()
model = CSThresholdClassifier(estimator=LogisticRegression())
model.fit(X, y)
model.predict(X, fp_cost=2, fn_cost=1)
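
The decision rule behind "predict the class with the lowest expected cost" can be sketched in a few lines. This assumes the standard minimum-expected-cost rule with calibrated probabilities: predict class 1 when p * tp_cost + (1 - p) * fp_cost is lower than p * fn_cost + (1 - p) * tn_cost, which rearranges to a threshold on p. The function names are hypothetical, and the actual classifier may treat ties or calibration differently.

```python
def cost_threshold(tp_cost=0.0, fp_cost=0.0, tn_cost=0.0, fn_cost=0.0):
    """Probability above which predicting class 1 has the lower expected cost."""
    # Assumes the denominator is positive, i.e. errors cost more than correct decisions.
    return (fp_cost - tn_cost) / (fp_cost - tn_cost + fn_cost - tp_cost)


def predict_min_cost(y_proba, **costs):
    """Label each probability with the class that has the lower expected cost."""
    threshold = cost_threshold(**costs)
    return [int(p > threshold) for p in y_proba]


threshold = cost_threshold(fp_cost=2, fn_cost=1)  # 2 / (2 + 1) = 2/3
labels = predict_min_cost([0.5, 0.7, 0.9], fp_cost=2, fn_cost=1)  # [0, 1, 1]
```

Because a false positive is twice as costly as a false negative here, the threshold moves above 0.5 and the model needs more evidence before predicting the positive class.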

Metrics like the maximum profit score conveniently return the optimal target threshold. For example, the Expected Maximum Profit measure for customer churn (EMPC) tells you what fraction of the customer base should be targeted to maximize profit.

from empulse.metrics import empc
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification()
model = LogisticRegression()
predictions = model.fit(X, y).predict_proba(X)[:, 1]

score, threshold = empc(y, predictions, clv=50)
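
The idea of a metric that returns both a profit and a target fraction can be illustrated with a simplified, deterministic maximum profit calculation. The sketch assumes a fixed benefit for every targeted true positive and a fixed cost for every targeted false positive; the expected variant additionally averages over uncertain parameters, which is left out here. The function `max_profit` and its parameters are hypothetical and are not Empulse's API.

```python
def max_profit(y_true, y_score, tp_benefit, fp_cost):
    """Return (best average profit per customer, fraction of customers to target)."""
    # Rank customers from highest to lowest score (ties are not treated specially).
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    best_profit, best_k = 0.0, 0  # targeting nobody yields zero profit
    running = 0.0
    for k, i in enumerate(order, start=1):
        # Targeting the k highest-scoring customers adds a benefit or a cost.
        running += tp_benefit if y_true[i] == 1 else -fp_cost
        if running > best_profit:
            best_profit, best_k = running, k
    n = len(y_true)
    return best_profit / n, best_k / n


y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.1]
profit, fraction = max_profit(y_true, y_score, tp_benefit=10, fp_cost=2)
```

In this toy example the profit peaks when the three highest-scoring customers are targeted, so the returned fraction is 0.6.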

The returned threshold, which is a fraction of the customer base, can then be converted to a decision threshold on the predicted probabilities by using the classification_threshold function.

from empulse.metrics import classification_threshold

decision_threshold = classification_threshold(y, predictions, customer_threshold=threshold)
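
One plausible way to turn a customer fraction into a score threshold is to rank the predictions and take the lowest score among the targeted top fraction. The sketch below assumes exactly that; `fraction_to_threshold` is a hypothetical helper with a simplified signature, and classification_threshold may handle ties and rounding differently.

```python
def fraction_to_threshold(y_score, fraction):
    """Lowest score among the top `fraction` of instances ranked by score."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(y_score, reverse=True)
    # Target at least one instance; round to the nearest whole number of instances.
    n_targeted = max(1, round(fraction * len(ranked)))
    return ranked[n_targeted - 1]


scores = [0.9, 0.8, 0.7, 0.3, 0.1]
cutoff = fraction_to_threshold(scores, fraction=0.6)
targeted = [s for s in scores if s >= cutoff]
```

Every instance with a score at or above the cutoff is targeted, which here selects the top three of five customers.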

This can then be combined with scikit-learn's FixedThresholdClassifier to create a model that predicts the class with the highest expected profit.

from sklearn.model_selection import FixedThresholdClassifier

model = FixedThresholdClassifier(estimator=model, threshold=decision_threshold)
model.predict(X)

Easy access to real-world datasets for benchmarking

Empulse provides easy access to real-world datasets for benchmarking cost-sensitive models.

Each dataset returns the features, the target, and the instance-dependent costs, ready to use in a cost-sensitive model.

from empulse.datasets import load_give_me_some_credit
from empulse.models import CSLogitClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y, tp_cost, fp_cost, tn_cost, fn_cost = load_give_me_some_credit(return_X_y_costs=True)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('model', CSLogitClassifier())
])
pipeline.fit(
    X, 
    y, 
    model__tp_cost=tp_cost, 
    model__fp_cost=fp_cost, 
    model__tn_cost=tn_cost, 
    model__fn_cost=fn_cost
)

Project details

Release history

Version	Release date
0.11.1	May 8, 2026
0.11.0	May 8, 2026
0.10.4	Sep 20, 2025
0.9.0	Jun 15, 2025
0.8.0	Jun 1, 2025 (this version)
0.7.0	Feb 5, 2025
0.6.0	Jan 28, 2025
0.5.2	Jan 12, 2025
0.5.1	Jan 5, 2025
0.5.0	Jan 5, 2025
0.4.6	Dec 31, 2024
0.4.5	Dec 30, 2024
0.4.4	Dec 26, 2024
0.4.3	Dec 26, 2024
0.4.2	Dec 24, 2024
0.4.1	Dec 16, 2024
0.4.0	Dec 11, 2024
0.3.1	Apr 19, 2024
0.3.0	Apr 18, 2024
0.2.0	Apr 17, 2024
0.1.2	Feb 28, 2024
0.1.1	Feb 25, 2024
0.1.0	Feb 24, 2024
0.0.15	Jan 22, 2024
0.0.14	Jan 18, 2024

Download files

Source Distribution

empulse-0.8.0.tar.gz (4.0 MB)

Uploaded Jun 1, 2025 Source

Built Distribution

empulse-0.8.0-py3-none-any.whl (4.1 MB)

Uploaded Jun 1, 2025 Python 3

File details

Details for the file empulse-0.8.0.tar.gz.

File metadata

Download URL: empulse-0.8.0.tar.gz
Upload date: Jun 1, 2025
Size: 4.0 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for empulse-0.8.0.tar.gz
Algorithm	Hash digest
SHA256	`cbd516b57b9f9172e812d31397d05d201306271a2cd124571bd53f758bb4fa74`
MD5	`61456d98bf776012622521b652e3c564`
BLAKE2b-256	`b3a2c4b3f25f904b61452b3103e8debb263abf68f3fd8600ccbb6dd5eadbbade`

See more details on using hashes here.
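
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. The following is a minimal sketch; the helper names `sha256_of` and `matches` are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution is in the current directory):
# matches(
#     "empulse-0.8.0.tar.gz",
#     "cbd516b57b9f9172e812d31397d05d201306271a2cd124571bd53f758bb4fa74",
# )
```

If the comparison fails, the download is incomplete or has been altered and should not be installed.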

Provenance

The following attestation bundles were made for empulse-0.8.0.tar.gz:

Publisher: release.yml on ShimantoRahman/empulse

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: empulse-0.8.0.tar.gz
- Subject digest: cbd516b57b9f9172e812d31397d05d201306271a2cd124571bd53f758bb4fa74
- Sigstore transparency entry: 226893960
- Sigstore integration time: Jun 1, 2025
Source repository:
- Permalink: ShimantoRahman/empulse@4cedb17bad2bc2362e9bb59a19f39c373ab37c2b
- Branch / Tag: refs/tags/0.8.0
- Owner: https://github.com/ShimantoRahman
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@4cedb17bad2bc2362e9bb59a19f39c373ab37c2b
- Trigger Event: push

File details

Details for the file empulse-0.8.0-py3-none-any.whl.

File metadata

Download URL: empulse-0.8.0-py3-none-any.whl
Upload date: Jun 1, 2025
Size: 4.1 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for empulse-0.8.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1ce7fe886a1933d10d115bbd91554a1858708729708938667ba6c3db95ada90e`
MD5	`01e9d1acecda82beadabccc4964f3f04`
BLAKE2b-256	`5a929908459b8b46da4a47a2e89909cea165340940b7ec2824e569eab6b66ce3`

See more details on using hashes here.

Provenance

The following attestation bundles were made for empulse-0.8.0-py3-none-any.whl:

Publisher: release.yml on ShimantoRahman/empulse

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: empulse-0.8.0-py3-none-any.whl
- Subject digest: 1ce7fe886a1933d10d115bbd91554a1858708729708938667ba6c3db95ada90e
- Sigstore transparency entry: 226893962
- Sigstore integration time: Jun 1, 2025
Source repository:
- Permalink: ShimantoRahman/empulse@4cedb17bad2bc2362e9bb59a19f39c373ab37c2b
- Branch / Tag: refs/tags/0.8.0
- Owner: https://github.com/ShimantoRahman
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@4cedb17bad2bc2362e9bb59a19f39c373ab37c2b
- Trigger Event: push
