woe is a Python library that converts categorical and continuous variables into weight of evidence (WoE) values. Weight of evidence is a statistical technique used in information theory to measure the strength of the relationship between a binary target variable and a predictor variable. The library can be used for data preprocessing in predictive modeling or machine learning projects.
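For reference, one common textbook definition of weight of evidence for a category (or bin) i of a predictor is sketched below. The symbols are introduced here for illustration only, and the sign convention and any smoothing applied by this library are not stated in this description, so this should not be read as the library's exact formula.

```latex
\mathrm{WoE}_i = \ln\!\left( \frac{n_i^{(1)} / N^{(1)}}{n_i^{(0)} / N^{(0)}} \right)
```

Here n_i^{(1)} and n_i^{(0)} are the numbers of rows in category i whose target is 1 and 0, and N^{(1)} and N^{(0)} are the total numbers of rows with target 1 and 0. A positive value means the category is over-represented among rows with target 1.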
Installation
pip3 install woe-conversion
Usage
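from woe_conversion.woe import WoeConversion
# train and test are assumed to be pandas DataFrames that contain the target and feature columns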
# create an instance of the WoeConversion class
woemodel = WoeConversion(binarytarget='survived', features=['categorical_variable_1','categorical_variable_2','continuous_variable_3'])
# fit the model using training data
woemodel.fit(train)
# transform the training and test data using the fitted model
transformedtrain = woemodel.transform(train)
transformedtest = woemodel.transform(test)
In the above code, WoeConversion is the class that is used to perform the conversion of variables to weight of evidence. The binarytarget parameter is the name of the binary target column in your dataset, and features is a list of the columns that you want to convert to weight of evidence.
Once the model has been created, it is fit to the training data using the fit method. This method calculates the weight of evidence for each category in the specified columns, and stores the results in the model object.
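To make the fitting step concrete, here is a minimal pure-Python sketch of a per-category weight of evidence calculation. It is not the library's implementation: the helper `woe_per_category`, the toy data, the sign convention (share of 1s over share of 0s) and the 0.5 count adjustment are all assumptions made for illustration.

```python
import math
from collections import Counter

def woe_per_category(values, targets):
    """Return {category: WoE} computed as ln(share of 1s / share of 0s).

    Hypothetical helper for illustration; not part of woe_conversion.
    Assumes the data contains at least one row of each target class.
    """
    ones = Counter(v for v, t in zip(values, targets) if t == 1)
    zeros = Counter(v for v, t in zip(values, targets) if t == 0)
    total_ones = sum(ones.values())
    total_zeros = sum(zeros.values())
    table = {}
    for category in set(ones) | set(zeros):
        # Add 0.5 to each count so a category with no 1s or no 0s stays finite
        # (an assumption of this sketch; the library may handle this differently).
        share_ones = (ones[category] + 0.5) / total_ones
        share_zeros = (zeros[category] + 0.5) / total_zeros
        table[category] = math.log(share_ones / share_zeros)
    return table

# Toy data invented for this example.
sex = ['female', 'female', 'female', 'male', 'male', 'male', 'male', 'male']
survived = [1, 1, 0, 0, 0, 1, 0, 0]
woe = woe_per_category(sex, survived)
```

In this toy data the value for 'female' is positive and the value for 'male' is negative, which reflects the higher share of survivors among the female rows.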
The transform method is then used to transform the training and test data using the fitted model. This method replaces the original columns with their weight of evidence equivalents.
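The replacement step can be pictured as a lookup into tables learned from the training data. The sketch below is illustrative only: `apply_woe`, the table values and the fallback of 0.0 for categories not seen during fitting are assumptions, not documented behavior of the library.

```python
def apply_woe(rows, woe_tables):
    """Return copies of rows with each listed column replaced by its WoE value.

    Hypothetical helper for illustration; not part of woe_conversion.
    """
    transformed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left unchanged
        for column, table in woe_tables.items():
            # Categories never seen during fitting fall back to 0.0 (neutral
            # evidence); this fallback is an assumption of the sketch.
            new_row[column] = table.get(row[column], 0.0)
        transformed.append(new_row)
    return transformed

# Tables as they might look after fitting on training data (values invented).
woe_tables = {'sex': {'female': 1.02, 'male': -0.59}}
test_rows = [
    {'sex': 'male', 'survived': 0},
    {'sex': 'female', 'survived': 1},
    {'sex': 'unknown', 'survived': 0},
]
transformed = apply_woe(test_rows, woe_tables)
```

Because the tables come only from the training data, applying them to the test data does not use any information from the test target.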
Note on Missing Values
Weight of evidence is also calculated for missing values. Therefore, missing values should not be imputed before fitting the WoE model or transforming data with it.
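One way to see why imputation is unnecessary is to treat "missing" as a category of its own and compute its weight of evidence like any other. The sketch below assumes that approach; the sentinel label, the helper `as_category` and the toy data are hypothetical, and the library's internal handling may differ.

```python
import math

MISSING = '__missing__'  # hypothetical label for the "missing" category

def as_category(value):
    """Map None and NaN to one shared category; leave other values unchanged."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return MISSING
    return value

# Toy data invented for this example.
deck = ['C', None, 'C', float('nan'), None, 'E', None, 'E']
survived = [1, 0, 1, 0, 0, 1, 1, 0]
categories = [as_category(v) for v in deck]

# Share of all 1s and share of all 0s that fall in the missing category.
ones_missing = sum(1 for c, t in zip(categories, survived) if c == MISSING and t == 1)
zeros_missing = sum(1 for c, t in zip(categories, survived) if c == MISSING and t == 0)
share_ones = ones_missing / survived.count(1)
share_zeros = zeros_missing / survived.count(0)
woe_missing = math.log(share_ones / share_zeros)
```

In this toy data the missing category gets a negative value, so the fact that a value is missing carries information that imputation would erase.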
Full working example
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from woe_conversion.woe import WoeConversion
# load the Titanic dataset and make a stratified 70/30 train/test split
titanic = sns.load_dataset('titanic')
train, test = train_test_split(titanic, test_size=0.30, random_state=111, stratify=titanic['survived'])
# columns to convert to weight of evidence
features = ['age', 'sex', 'class', 'fare']
# fit the WoE model on the training data only, then transform both sets
woemodel = WoeConversion(binarytarget='survived', features=features)
woemodel.fit(train)
transformedtrain = woemodel.transform(train)
transformedtest = woemodel.transform(test)
# train a logistic regression on the WoE-encoded features
clf = LogisticRegression()
clf.fit(transformedtrain[features], transformedtrain['survived'])
# store the predicted probability of survival on the transformed and the original data
transformedtrain['proba'] = clf.predict_proba(transformedtrain[features])[:, 1]
train['proba'] = clf.predict_proba(transformedtrain[features])[:, 1]
transformedtest['proba'] = clf.predict_proba(transformedtest[features])[:, 1]
test['proba'] = clf.predict_proba(transformedtest[features])[:, 1]
Author
woe was created by Bertrand Brelier. If you have any questions or issues, please feel free to contact the author.