Select features using Kolmogorov–Smirnov (K-S) test for binary classification tasks.

Project description

KSFeatureSelector is a lightweight Python package for selecting the most discriminatory features in a binary classification problem using the Kolmogorov–Smirnov (K-S) test.

Features

  • Uses the K-S test to rank features by their ability to separate classes.

  • Supports filtering features by:

      - A maximum p-value threshold.

      - A fixed number of top features.

  • NumPy-style docstrings and input validation.

  • Pure Python, built on pandas and SciPy.
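To make the ranking idea concrete, here is a minimal sketch of the two-sample K-S statistic: the largest gap between the empirical CDFs of a feature in each class. It is an illustration in plain Python, not the package's implementation; the function name `ks_statistic` and the sample values are assumptions made up for this example.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs.

    Illustrative helper only; the name is hypothetical and not part of the package.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Made-up feature values for the two classes.
class_0 = [1.0, 2.0, 3.0, 4.0]
class_1 = [3.5, 4.5, 5.5, 6.5]      # shifted away from class_0
overlapping = [1.5, 2.5, 3.5, 4.5]  # interleaved with class_0

print(ks_statistic(class_0, class_1))      # 0.75
print(ks_statistic(class_0, overlapping))  # 0.25
```

A larger statistic means the feature's distribution differs more between the two classes, which in turn yields a smaller p-value for a given sample size.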

Installation

From PyPI:

pip install ksfeatureselector

Local installation:

pip install .

Usage

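from ksfeatureselector import select_ks_features

# df is a pandas DataFrame holding the feature columns and a binary target column
x_cols = ['feature1', 'feature2', 'feature3']
y_var = 'target'

# Keep the features whose K-S test p-value is below 0.05
selected = select_ks_features(df, x_cols, y_var, top_p=0.05)
# or keep the 2 features with the smallest p-values
selected = select_ks_features(df, x_cols, y_var, top_n=2)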
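The usage snippet expects a DataFrame named df to exist already. The sketch below builds a synthetic one so the calls can be tried end to end. The data is randomly generated for illustration only, and the choice of distributions is an assumption: feature1 is shifted between the classes, while feature2 and feature3 are unrelated to the target.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; nothing here comes from the package.
rng = np.random.default_rng(42)
n_rows = 200
target = rng.integers(0, 2, size=n_rows)  # binary target: 0 or 1

df = pd.DataFrame({
    # Mean shifts by 1.5 for the positive class, so the classes differ.
    "feature1": rng.normal(loc=target * 1.5, scale=1.0),
    # Independent of the target.
    "feature2": rng.normal(size=n_rows),
    "feature3": rng.uniform(size=n_rows),
    "target": target,
})

x_cols = ["feature1", "feature2", "feature3"]
y_var = "target"
print(df.head())
```

With a frame like this, feature1 would be expected to rank first, since it is the only column whose distribution depends on the target.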

Arguments

  • df (pd.DataFrame): The input DataFrame containing feature columns and a binary target column.

  • x_cols (List[str]): List of column names in df to be considered as features.

  • y_var (str): The name of the target column in df. Must be binary (e.g., 0/1 or True/False).

  • top_p (float, optional): Select features whose K-S test p-value is less than this threshold. Use this for statistical significance filtering.

  • top_n (int, optional): Select the top N features with the smallest p-values, ranked by their ability to distinguish between the two classes.

Returns

  • List[str]: A list of selected feature names based on the K-S test ranking.
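The arguments and return value described above can be pictured with a short sketch built on `scipy.stats.ks_2samp`. This is an assumption about how such a selector might work, not the package's actual code: the function name `rank_by_ks` is hypothetical, and the handling of missing values and of passing both `top_p` and `top_n` is a guess.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def rank_by_ks(df, x_cols, y_var, top_p=None, top_n=None):
    """Hypothetical stand-in (not the package's code): select features by K-S p-value."""
    classes = df[y_var].dropna().unique()
    if len(classes) != 2:
        raise ValueError("target column must contain exactly two classes")

    group_a = df[df[y_var] == classes[0]]
    group_b = df[df[y_var] == classes[1]]

    # Compare each feature's distribution in one class against the other.
    p_values = {}
    for col in x_cols:
        result = ks_2samp(group_a[col].dropna(), group_b[col].dropna())
        p_values[col] = result.pvalue

    # Smallest p-value first, i.e. strongest evidence of separation first.
    ranked = sorted(p_values, key=p_values.get)
    if top_p is not None:
        ranked = [col for col in ranked if p_values[col] < top_p]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Synthetic data for illustration: "signal" depends on the target, "noise" does not.
rng = np.random.default_rng(0)
target = rng.integers(0, 2, size=500)
demo = pd.DataFrame({
    "signal": rng.normal(loc=target * 2.0, scale=1.0),
    "noise": rng.normal(size=500),
    "target": target,
})

print(rank_by_ks(demo, ["signal", "noise"], "target", top_n=1))
print(rank_by_ks(demo, ["signal", "noise"], "target", top_p=0.05))
```

In this sketch both filters can be combined, in which case the p-value threshold is applied first and the result is then truncated to `top_n` entries.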

License

MIT License

Author

V Subrahmanya Raghu Ram Kishore Parupudi

Email: pvsrrkishore@gmail.com
