Statistical IV: Statistical Hypothesis Testing for the Information Value (IV). Evaluation of the predictive power of features using the IV with specific thresholds for each dataset.
Statistical IV
Our J-Divergence test uses the following null hypothesis:
H0: The predictive power of the variable is not significant.
The null hypothesis is tested with a two-tailed test, which should be taken into account when interpreting the p-value.
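For context, the Information Value of a binned feature is the J-divergence (the symmetrized Kullback-Leibler divergence) between the distribution of events and the distribution of non-events across bins. The sketch below computes only this point estimate from per-bin counts. The function name information_value and the sample counts are illustrative and not part of the package, and the sketch does not reproduce the package's hypothesis test or p-value.

```python
import math

def information_value(events, non_events):
    """Point estimate of the IV (J-divergence) from per-bin counts.

    Assumes every bin holds at least one event and one non-event;
    empty bins would need merging or smoothing first.
    """
    if len(events) != len(non_events):
        raise ValueError("events and non_events must have the same number of bins")
    total_events = sum(events)
    total_non_events = sum(non_events)
    iv = 0.0
    for e, n in zip(events, non_events):
        p = e / total_events        # share of all events that fall in this bin
        q = n / total_non_events    # share of all non-events that fall in this bin
        iv += (p - q) * math.log(p / q)
    return iv

# Illustrative counts for a feature split into three bins.
iv = information_value(events=[10, 30, 60], non_events=[40, 35, 25])
print(f"IV = {iv:.4f}")
```

Each term (p - q) * log(p / q) is non-negative, so the estimate is zero only when both distributions coincide and it is symmetric in the two classes.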
Explanation
Optimize your machine learning models with 'Statistical-IV'. Perform automated feature selection based on statistics and customize error control.
- Import package:
  from statistical_iv import api
- Provide a DataFrame as Input:
  - Supply a DataFrame df containing your data for IV calculation.
- Specify Predictor Variables:
  - Provide a list of predictor variable names (variables_names) to analyze.
- Define the Target Variable:
  - Specify the name of the target variable (var_y) in your DataFrame.
- Indicate Variable Types:
  - Define the type of your predictor variables as 'categorical' or 'numerical' using the type_vars parameter.
- Optional: Set Maximum Bins:
  - Adjust the maximum number of bins for discretization (optional) using the max_bins parameter.
- Call the statistical_iv Function:
  - Calculate Statistical IV information by calling the statistical_iv function from api with the specified parameters (these parameters are used by the OptimalBinning package).
  result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)
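Putting the steps together, a minimal end-to-end sketch might look like the following. The synthetic data, the column names age and default, and the chosen max_bins value are made up for illustration; the call itself follows the signature given in the last step. The import is guarded so the snippet still runs where the package is not installed.

```python
import numpy as np
import pandas as pd

# Synthetic data (illustrative only): one numerical predictor and a binary target.
rng = np.random.default_rng(seed=0)
n = 1000
age = rng.integers(18, 70, size=n)
# Let the event probability depend on age so the predictor carries some signal.
prob = 1 / (1 + np.exp((age - 40) / 10))
default = (rng.random(n) < prob).astype(int)
df = pd.DataFrame({"age": age, "default": default})

variables_names = ["age"]   # predictor columns to analyze
var_y = "default"           # name of the target column
type_vars = "numerical"     # 'categorical' or 'numerical'
max_bins = 5                # optional upper bound on the number of bins

try:
    from statistical_iv import api
except ImportError:
    api = None  # package not installed; the call below is skipped

result_df = None
if api is not None:
    result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)
    print(result_df)
```

Fixing the random seed keeps the synthetic inputs reproducible, which makes it easier to compare results when max_bins is changed.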
Example Result:
Full Paper:
For a full treatment of the method, see the article available at this link.