Reason this release was yanked: Bug

Project description

woe

woe is a Python library that converts categorical and continuous variables into their weight of evidence (WoE) values. Weight of evidence is a statistical measure, rooted in information theory and widely used in credit scoring, of how strongly a predictor variable separates the two classes of a binary target variable. The library can be used for data preprocessing in predictive modeling or machine learning projects.
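
For reference, one common definition of the weight of evidence of a category i is sketched below. Sign conventions differ between references (some put the target = 1 rows in the numerator, others the target = 0 rows), and this page does not state which convention the library uses, so treat the formula as an assumption rather than a description of the implementation.

```latex
\mathrm{WoE}_i = \ln\!\left( \frac{n_i^{(1)} / N^{(1)}}{n_i^{(0)} / N^{(0)}} \right)
```

Here n_i^{(1)} and n_i^{(0)} are the numbers of rows in category i with target 1 and target 0, and N^{(1)} and N^{(0)} are the corresponding totals over all categories. A positive value means the category is over-represented among the target = 1 rows.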

Installation

pip3 install woe-conversion

Usage

from woe_conversion.woe import WoeConversion

# create an instance of the WoeConversion class
woemodel = WoeConversion(binarytarget='survived', features=['categorical_variable_1','categorical_variable_2','continuous_variable_3'])

# fit the model using training data (train is a pandas dataframe)
woemodel.fit(train)

# transform the training and test data using the fitted model (train and test are pandas dataframes)
transformedtrain = woemodel.transform(train)
transformedtest = woemodel.transform(test)

In the above code, WoeConversion is the class that converts variables to weight of evidence. The binarytarget parameter is the name of the binary target column in your dataset, and features is the list of columns to convert.

Once the model has been created, the fit method fits it to the training data. fit calculates the weight of evidence for each category in the specified columns and stores the results in the model object.
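
To make the fitting step concrete, here is a minimal sketch of how a per-category weight of evidence table could be computed for a single column. It is written in plain Python so it runs without pandas. The helper name fit_woe, the sign convention, and the additive smoothing term are all assumptions for illustration; none of them are taken from the library's source.

```python
import math
from collections import Counter


def fit_woe(categories, targets, smoothing=0.5):
    """Return {category: WoE} for one column.

    Hypothetical helper, not part of woe_conversion. WoE is taken as
    ln(share of target == 1 rows / share of target == 0 rows). The additive
    smoothing is an assumption that avoids log(0) for empty cells.
    """
    events = Counter()
    non_events = Counter()
    for cat, y in zip(categories, targets):
        if y == 1:
            events[cat] += 1
        else:
            non_events[cat] += 1

    cats = set(events) | set(non_events)
    k = len(cats)
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())

    table = {}
    for cat in cats:
        p_event = (events[cat] + smoothing) / (total_events + smoothing * k)
        p_non_event = (non_events[cat] + smoothing) / (total_non_events + smoothing * k)
        table[cat] = math.log(p_event / p_non_event)
    return table


# toy data: women survive more often than men
sex = ["male", "male", "male", "female", "female", "female"]
survived = [0, 0, 1, 1, 1, 0]
woe_table = fit_woe(sex, survived)
print(woe_table)
```

With this convention, categories that are over-represented among the target = 1 rows receive a positive value and the others a negative one.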

The transform method then applies the fitted model to the training and test data, replacing each original column with its weight of evidence equivalent.
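
The important property of this step is that the values learned from the training data are reused unchanged on the test data. The sketch below illustrates that idea with a plain dictionary lookup. The helper transform_rows, the made-up table values, and the choice to map unseen categories to 0.0 are assumptions for illustration, not the library's documented behavior.

```python
def transform_rows(rows, woe_tables, default=0.0):
    """Return copies of rows with each fitted column replaced by its WoE value.

    Hypothetical helper, not part of woe_conversion. Categories that were not
    seen during fitting are mapped to `default` (an assumption).
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows stay untouched
        for column, table in woe_tables.items():
            new_row[column] = table.get(row[column], default)
        out.append(new_row)
    return out


# table learned on the training data (values made up for illustration)
woe_tables = {"sex": {"male": -0.51, "female": 0.51}}

test_rows = [
    {"sex": "female", "survived": 1},
    {"sex": "male", "survived": 0},
    {"sex": "unknown", "survived": 0},  # category never seen in training
]
transformed = transform_rows(test_rows, woe_tables)
print(transformed)
```

Columns that are not listed in the table, such as the target, pass through unchanged.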

Note on Missing Values

Weight of evidence is also calculated for missing values, so do not impute them before passing data to the woe model.
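
The reason for not imputing is that missingness itself can carry signal. The sketch below groups missing values (None or NaN) under their own key and compares the share of target = 1 rows per group. It reports plain event rates rather than weight of evidence to keep the example short, and the label "__missing__" and both helper names are hypothetical.

```python
import math
from collections import defaultdict

MISSING = "__missing__"  # hypothetical label for the missing-value group


def category_key(value):
    """Map None and NaN to a single shared key; leave other values as they are."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING
    return value


def event_rates(values, targets):
    """Return {category: share of rows with target == 1}, keeping missing values."""
    counts = defaultdict(lambda: [0, 0])  # key -> [target == 0 count, target == 1 count]
    for value, y in zip(values, targets):
        counts[category_key(value)][y] += 1
    return {key: ones / (zeros + ones) for key, (zeros, ones) in counts.items()}


# toy data: rows with a missing deck never survive
deck = ["C", None, "C", float("nan"), None, "E"]
survived = [1, 0, 1, 0, 0, 1]
rates = event_rates(deck, survived)
print(rates)
```

If the missing entries had been filled with the most frequent value "C" beforehand, the difference between the two groups would have been blurred.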

Full working example

from woe_conversion.woe import WoeConversion
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# load the Titanic dataset and make a stratified 70/30 train/test split
titanic = sns.load_dataset('titanic')
train, test = train_test_split(titanic, test_size=0.30, random_state=111, stratify=titanic['survived'])

# fit the conversion on the training data only, then transform both splits
features = ['age', 'sex', 'class', 'fare']
woemodel = WoeConversion(binarytarget='survived', features=features)
woemodel.fit(train)
transformedtrain = woemodel.transform(train)
transformedtest = woemodel.transform(test)

# train a logistic regression on the weight of evidence features
clf = LogisticRegression()
clf.fit(transformedtrain[features], transformedtrain['survived'])

# store the predicted survival probability on the transformed and the original dataframes
transformedtrain['proba'] = clf.predict_proba(transformedtrain[features])[:, 1]
train['proba'] = clf.predict_proba(transformedtrain[features])[:, 1]
transformedtest['proba'] = clf.predict_proba(transformedtest[features])[:, 1]
test['proba'] = clf.predict_proba(transformedtest[features])[:, 1]
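
A natural next step, not part of the original example, is to compare how well the stored probabilities rank the training and test rows. The sketch below computes the area under the ROC curve in plain Python from a list of labels and a list of scores. The function roc_auc is a hypothetical helper written for illustration; in practice sklearn.metrics.roc_auc_score does the same job.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    Hypothetical helper for illustration; quadratic in the number of rows.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# toy labels and predicted probabilities
labels = [1, 0, 1, 0]
probas = [0.9, 0.2, 0.4, 0.4]
auc = roc_auc(labels, probas)
print(auc)
```

Applied to the example above, the inputs would be the 'survived' and 'proba' columns of each dataframe; a large gap between the training and test values would suggest overfitting.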

Author

woe was created by Bertrand Brelier. If you have any questions or issues, please feel free to contact the author.

Download files

Source Distribution

woe_conversion-0.1.0.tar.gz (4.2 kB view details)

Uploaded Source

Built Distribution

woe_conversion-0.1.0-py3-none-any.whl (6.9 kB view details)

Uploaded Python 3

File details

Details for the file woe_conversion-0.1.0.tar.gz.

File metadata

  • Download URL: woe_conversion-0.1.0.tar.gz
  • Upload date:
  • Size: 4.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for woe_conversion-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       cd28c5c9af845dc71ab454a69d36d1bb83080f356bab25425a3b29315de4ae4b
MD5          327a24473b48b38dc6a46cc7f6da2421
BLAKE2b-256  fc464d3a58b53703f98f50de9d5da789efb438d2e61252b18049b1a4041f9586

File details

Details for the file woe_conversion-0.1.0-py3-none-any.whl.

File hashes

Hashes for woe_conversion-0.1.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       77e74ce3ab5a67c8a432cad11047be36ffeff42274c90add91b51c6ec3e752ba
MD5          017f6606a091a6129c38c6b90f2146b0
BLAKE2b-256  90dcb3f8a62e2a66fd515ed19a9508cc3177a6e1aef5552b9a4e12960347e7f7
