This package provides helper utilities for machine learning tasks. One major utility is the calculation of Weight of Evidence.
Project description
Machine Learning Helper
This package uses multiple algorithms and parameters to accommodate a range of use cases when building machine learning models.
1.0 woe (Weight of Evidence):
This function helps calculate the Weight of Evidence and Information Value. It can also display the corresponding charts and perform coarse classing.
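As a rough illustration of the two quantities, the sketch below computes Weight of Evidence and Information Value for a single categorical column with pandas. It is not the package's own implementation: the helper name `woe_iv` is hypothetical, and the sign convention (log of the event share over the non-event share) is an assumption, since some references use the inverse ratio.

```python
import numpy as np
import pandas as pd


def woe_iv(feature: pd.Series, target: pd.Series):
    """Return a per-category WoE table and the total IV for a 0/1 target.

    Hypothetical helper for illustration; assumes every category has at
    least one event and one non-event (otherwise the log is infinite).
    """
    counts = pd.crosstab(feature, target)
    events = counts[1]          # rows where target == 1
    non_events = counts[0]      # rows where target == 0
    pct_event = events / events.sum()
    pct_non_event = non_events / non_events.sum()
    # Assumed convention: WoE = ln(share of events / share of non-events)
    woe = np.log(pct_event / pct_non_event)
    # IV sums the share difference weighted by WoE over all categories
    iv = float(((pct_event - pct_non_event) * woe).sum())
    table = pd.DataFrame({"events": events, "non_events": non_events, "woe": woe})
    return table, iv


demo = pd.DataFrame({
    "x": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "y": [1, 1, 1, 0, 1, 0, 0, 0],
})
table, iv = woe_iv(demo["x"], demo["y"])
print(table)
print("IV:", iv)
```

The Information Value does not depend on which sign convention is chosen, because flipping the ratio flips the sign of both factors in each term.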
1.1 Parameters:
- max_bin: int. Maximum number of bins for numeric variables. The default is 10.
- iv_threshold: float. Threshold value for Information Value. Variables with an Information Value higher than the threshold will be considered for transformation.
- ignore_threshold: bool. Controls whether the defined threshold is considered or ignored. The default is True.
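To make the role of these parameters concrete, here is a minimal sketch of how they could behave. It rests on assumptions rather than on the package's source: quantile binning via `pandas.qcut` stands in for whatever binning the package uses, and the helpers `bin_numeric` and `select_by_iv` are hypothetical names.

```python
import numpy as np
import pandas as pd


def bin_numeric(values: pd.Series, max_bin: int = 10) -> pd.Series:
    """Cut a numeric column into at most max_bin quantile bins (assumed strategy)."""
    # duplicates="drop" merges repeated bin edges, so fewer bins may result
    return pd.qcut(values, q=max_bin, duplicates="drop")


def select_by_iv(iv_by_column: dict, iv_threshold: float,
                 ignore_threshold: bool = True) -> list:
    """Pick the columns to transform, based on their Information Value."""
    if ignore_threshold:
        # Assumed reading of the default: the threshold is not applied
        return list(iv_by_column)
    return [col for col, iv in iv_by_column.items() if iv > iv_threshold]


rng = np.random.default_rng(1456)
values = pd.Series(rng.integers(1000, 2000, size=1000))
bins = bin_numeric(values, max_bin=10)
print(bins.value_counts().sort_index())

ivs = {"x1": 0.01, "x2": 0.35}  # made-up IV values for illustration
print(select_by_iv(ivs, iv_threshold=0.1, ignore_threshold=False))
```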
1.2 Returns:
A DataFrame containing the Weight of Evidence for each column, along with the target variable.
1.3 Approach:
- Create an instance of woe: my_woe = woe()
- Call the fit method on that object, passing the DataFrame and the target variable name: my_woe.fit(df,target)
- Call the transform method: transformed_df = my_woe.transform()
Example
Create Sample DataFrame
from mlh import woe
import pandas as pd
import numpy as np
import random
seed=1456
np.random.seed(seed)
random.seed(seed)
rows = 1000
y = random.choices([0,1],k=rows,weights=[.7,.3])
x1 = random.choices(np.arange(20,40),k=rows)
x2 = np.random.randint(1000,2000,size=rows)
x3 = random.choices(np.arange(1,100),k=rows)
x4 = random.choices(['m','f','u'],k=rows)
x5 = random.choices(['a','b','c','d','e','f','g','h'],k=rows)
df = pd.DataFrame({'y':y,'x1':x1,'x2':x2,'x3':x3,'x4':x4,'x5':x5})
df.head()
Fitting and prediction
Create Instance of Weight of Evidence Package
my_woe = woe()
Fit the data with created instance
my_woe.fit(df,'y')
Display the relevant charts
my_woe.getWoeCharts()
Merge the values of the x3 variable at indices 1 and 2, using the Weight of Evidence chart from the first iteration
my_woe.reset_woe(2,(1,2),1)
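Merging bins in this way is what coarse classing amounts to: the counts of the chosen bins are added together and the Weight of Evidence is recomputed on the reduced table. The sketch below shows that idea on a small hand-made count table. It does not reproduce `reset_woe` itself; the helper `merge_bins`, the column names, and the counts are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def merge_bins(counts: pd.DataFrame, first: int, second: int) -> pd.DataFrame:
    """Merge two bins (given by position) and recompute WoE for every bin.

    Expects columns "events" and "non_events"; hypothetical helper.
    """
    labels = [str(label) for label in counts.index]
    merged_label = f"{labels[first]}+{labels[second]}"
    labels[first] = labels[second] = merged_label
    # Rows sharing a label are summed; sort=False keeps the original bin order
    merged = counts[["events", "non_events"]].groupby(
        np.array(labels), sort=False
    ).sum()
    pct_event = merged["events"] / merged["events"].sum()
    pct_non_event = merged["non_events"] / merged["non_events"].sum()
    merged["woe"] = np.log(pct_event / pct_non_event)
    return merged


counts = pd.DataFrame(
    {"events": [10, 20, 30, 40], "non_events": [40, 30, 20, 10]},
    index=["bin0", "bin1", "bin2", "bin3"],
)
merged = merge_bins(counts, 1, 2)
print(merged)
```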
Get the Information Value from the latest iteration
my_woe.get_IV()
Replace the original values in the DataFrame with their Weight of Evidence
transformed_df = my_woe.transform()
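Conceptually, this replacement step is a lookup: each original value is swapped for the Weight of Evidence of the bin or category it belongs to, while the target column is left unchanged. The sketch below shows that lookup with plain pandas. The helper `apply_woe` and the WoE numbers are hypothetical and are not taken from the package.

```python
import pandas as pd


def apply_woe(df: pd.DataFrame, woe_maps: dict) -> pd.DataFrame:
    """Replace values in the listed columns with their WoE; other columns are kept."""
    out = df.copy()
    for column, mapping in woe_maps.items():
        # Values missing from the mapping become NaN
        out[column] = out[column].map(mapping)
    return out


sample = pd.DataFrame({"y": [0, 1, 0, 1], "x4": ["m", "f", "m", "u"]})
woe_maps = {"x4": {"m": 0.2, "f": -0.1}}  # made-up WoE values for illustration
encoded = apply_woe(sample, woe_maps)
print(encoded)
```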
File details
Details for the file mlh-0.0.6.tar.gz.
File metadata
- Download URL: mlh-0.0.6.tar.gz
- Upload date:
- Size: 19.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | c373c0ae1e1af830d69df5baf2217768852a77c92b2d9ca08620a0f87e4465ed
MD5 | 04c21263239d13ecbe12273f7d0bd95f
BLAKE2b-256 | cec995aace0fa6bdf62e0d9b6d2f40531a7707e2d4474ced8bc8da8c4fa30ab8
File details
Details for the file mlh-0.0.6-py3-none-any.whl.
File metadata
- Download URL: mlh-0.0.6-py3-none-any.whl
- Upload date:
- Size: 22.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5d63d5d52a5c061c2d4eb2b5a74b357cca19268d3832c9e88a23c1162c1bf318
MD5 | 6e2232040bd06ba49342df240d3fe26c
BLAKE2b-256 | 97954e0c4904fbcdaa36b3722e173b185f4b79307f881dbafee97842181903b4