Statistical IV: Statistical Hypothesis Testing for the Information Value (IV). Evaluation of the predictive power of features using the IV with specific thresholds for each dataset.

Project description

Statistical IV

Our J-Divergence test uses the following null hypothesis:

H0: The predictive power of the variable is not significant.

The null hypothesis is tested with a two-tailed test, and this should be taken into account when interpreting the p-value.
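
For readers unfamiliar with the quantity being tested, here is a minimal sketch of the textbook Information Value, which coincides with the J-divergence (symmetrized Kullback-Leibler divergence) between the binned distributions of non-events and events. The function name `information_value` and the bin counts are hypothetical, chosen only for illustration; this is not the package's internal implementation and it does not compute the test statistic or the p-value.

```python
import math


def information_value(non_event_counts, event_counts):
    """Textbook IV (J-divergence) between binned non-event and event distributions."""
    total_non_events = sum(non_event_counts)
    total_events = sum(event_counts)
    iv = 0.0
    for n_i, e_i in zip(non_event_counts, event_counts):
        p_i = n_i / total_non_events  # share of all non-events falling in this bin
        q_i = e_i / total_events      # share of all events falling in this bin
        # Assumes every bin holds at least one event and one non-event;
        # an empty bin would make the logarithm undefined.
        iv += (p_i - q_i) * math.log(p_i / q_i)
    return iv


# Hypothetical counts for a predictor discretized into three bins.
non_events = [400, 300, 300]
events = [20, 30, 50]
iv = information_value(non_events, events)
print(round(iv, 4))
```

Identical distributions give an IV of zero, and the value grows as the two distributions move apart, which is why a test on the IV can be read as a test on predictive power.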

Explanation

Optimize your machine learning models with 'Statistical-IV'. Perform automated feature selection based on statistics and customize error control.

  1. Import the Package:

    from statistical_iv import api
    
  2. Provide a DataFrame as Input:

    • Supply a DataFrame df containing your data for IV calculation.
  3. Specify Predictor Variables:

    • Provide a list of predictor variable names (variables_names) to analyze.
  4. Define the Target Variable:

    • Specify the name of the target variable (var_y) in your DataFrame.
  5. Indicate Variable Types:

    • Define the type of your predictor variables as 'categorical' or 'numerical' using the type_vars parameter.
  6. Optional: Set Maximum Bins:

    • Adjust the maximum number of bins for discretization (optional) using the max_bins parameter.
  7. Call the statistical_iv Function:

    • Calculate the Statistical IV by calling the statistical_iv function from api with the parameters above (these parameters are used by the OptimalBinning package).
    result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)
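
The steps above can be sketched end to end as follows. The synthetic data, the column names (`age`, `income`, `default`) and the helper names `make_example_columns` and `run_statistical_iv` are assumptions made for illustration; only the `api.statistical_iv` call and its argument order come from the steps above. The call is wrapped in a function so the data preparation can be inspected without the package installed.

```python
import random


def make_example_columns(n_rows=500, seed=0):
    """Build a small synthetic dataset as a dict of columns (illustrative data only)."""
    rng = random.Random(seed)
    age = [rng.randint(18, 80) for _ in range(n_rows)]
    income = [round(rng.uniform(10_000, 100_000), 2) for _ in range(n_rows)]
    # Binary target loosely tied to age, so that one predictor carries some signal.
    default = [1 if rng.random() < (0.4 if a < 35 else 0.1) else 0 for a in age]
    return {"age": age, "income": income, "default": default}


def run_statistical_iv(columns):
    """Follow steps 1 to 7; requires pandas and statistical_iv to be installed."""
    import pandas as pd
    from statistical_iv import api

    df = pd.DataFrame(columns)            # step 2: input DataFrame
    variables_names = ["age", "income"]   # step 3: predictors to analyze
    var_y = "default"                     # step 4: target variable
    type_vars = "numerical"               # step 5: 'categorical' or 'numerical'
    max_bins = 5                          # step 6: optional maximum number of bins
    return api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)


columns = make_example_columns()
```

With both packages installed, `result_df = run_statistical_iv(columns)` should run the analysis on the synthetic data. The layout of the returned result is not described here, so the sketch makes no assumption about its columns.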
    

Example Result:

Output Example

Full Paper:

For a comprehensive exploration of the topic, we recommend perusing the contents of the article available at this link.

Download files

Source Distribution

statistical_iv-0.3.2.tar.gz (5.5 kB view details)

Uploaded Source

Built Distribution

statistical_iv-0.3.2-py3-none-any.whl (5.8 kB view details)

Uploaded Python 3

File details

Details for the file statistical_iv-0.3.2.tar.gz.

File metadata

  • Download URL: statistical_iv-0.3.2.tar.gz
  • Upload date:
  • Size: 5.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for statistical_iv-0.3.2.tar.gz
Algorithm Hash digest
SHA256 b3dea33014c282356f6e4837645226aa59a2984a2644916eb0db9a83b3b2f195
MD5 85b4f5dfb25634ab491bafbd2ab00298
BLAKE2b-256 da45aaaeb4769f37ea854ecb6caeb93331af5b7922bef3e33237c9ff7b2b456a

File details

Details for the file statistical_iv-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: statistical_iv-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.11

File hashes

Hashes for statistical_iv-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2204b6ea563d594d290303067df94b3a7bf8af3e1187438e80740eac3c7039c9
MD5 41144199b60f218bafc9e054ca92d05a
BLAKE2b-256 16976d41bbef3442a44c87b92376a9588399f400fd9405425d9d73ebae1d2586
