Select features using the Kolmogorov–Smirnov (K-S) test for binary classification tasks.
KSFeatureSelector is a lightweight Python package that uses the two-sample Kolmogorov–Smirnov (K-S) test to select the most discriminative features in a binary classification problem.
Features
Uses the K-S test to rank features by their ability to separate classes.
Supports filtering features by:
- A maximum p-value threshold.
- A fixed number of top features.
NumPy-style docstrings and input validation.
Pure Python, built on pandas and SciPy.
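The ranking idea described above can be sketched with `scipy.stats.ks_2samp`. This is a minimal illustration, not the package's actual implementation: the helper name `rank_by_ks` and the synthetic `demo` DataFrame are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.stats import ks_2samp


def rank_by_ks(df, x_cols, y_var):
    """Return (feature, p-value) pairs, smallest p-value first.

    Hypothetical helper for illustration only.
    """
    classes = sorted(df[y_var].dropna().unique())
    if len(classes) != 2:
        raise ValueError("target column must contain exactly two classes")
    first = df[y_var] == classes[0]
    second = df[y_var] == classes[1]
    results = []
    for col in x_cols:
        # Compare the feature's distribution in one class against the other.
        a = df.loc[first, col].dropna()
        b = df.loc[second, col].dropna()
        results.append((col, ks_2samp(a, b).pvalue))
    return sorted(results, key=lambda pair: pair[1])


# Synthetic data: 'shifted' differs between classes, 'noise' does not.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 200)
demo = pd.DataFrame({
    "shifted": rng.normal(loc=2.0 * y, scale=1.0),
    "noise": rng.normal(size=400),
    "target": y,
})
ranking = rank_by_ks(demo, ["shifted", "noise"], "target")
```

A small p-value means the feature's distribution differs clearly between the two classes, so the feature whose mean shifts with the target should come first in `ranking`.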
Installation
From PyPI:
pip install ksfeatureselector
Local installation:
pip install .
Usage
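from ksfeatureselector import select_ks_features
# df is a pandas DataFrame holding the feature columns and a binary target column
x_cols = ['feature1', 'feature2', 'feature3']
y_var = 'target'
# Keep the features whose K-S p-value is below 0.05
selected = select_ks_features(df, x_cols, y_var, top_p=0.05)
# or keep the 5 features with the smallest p-values
selected = select_ks_features(df, x_cols, y_var, top_n=5)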
Arguments
df (pd.DataFrame): The input DataFrame containing feature columns and a binary target column.
x_cols (List[str]): List of column names in df to be considered as features.
y_var (str): The name of the target column in df. Must be binary (e.g., 0/1 or True/False).
top_p (float, optional): Select features whose K-S test p-value is less than this threshold. Use this for statistical significance filtering.
top_n (int, optional): Select the top N features with the smallest p-values, ranked by their ability to distinguish between the two classes.
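The two selection modes can be illustrated on a precomputed mapping of feature names to p-values. This is a sketch under stated assumptions: `filter_features` is a hypothetical helper, the p-values are made up for the example, and requiring exactly one of `top_p` or `top_n` is an assumption, since the package's behavior when both or neither are passed is not documented here.

```python
def filter_features(p_values, top_p=None, top_n=None):
    """Pick feature names from a {name: p-value} mapping (hypothetical helper)."""
    # Assumption: exactly one of the two filters must be supplied.
    if (top_p is None) == (top_n is None):
        raise ValueError("pass exactly one of top_p or top_n")
    ranked = sorted(p_values, key=p_values.get)  # smallest p-value first
    if top_p is not None:
        # Strict inequality, matching "less than this threshold".
        return [name for name in ranked if p_values[name] < top_p]
    return ranked[:top_n]


# Illustrative p-values, not real results.
p_values = {"feature1": 0.001, "feature2": 0.20, "feature3": 0.03}
by_threshold = filter_features(p_values, top_p=0.05)  # ['feature1', 'feature3']
by_count = filter_features(p_values, top_n=1)         # ['feature1']
```

A threshold can return any number of features, including none, while a count returns at most N features regardless of how significant they are.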
Returns
List[str]: A list of selected feature names based on the K-S test ranking.
License
MIT License
File details
Details for the file ksfeatureselector-0.1.1.tar.gz.
File metadata
- Download URL: ksfeatureselector-0.1.1.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d9b20ccbb0dfebefa04abaf91cd66e82db1f256fbb5206ebbd8276b844717621 |
| MD5 | 065364ad96926ea5525c9067e0dc61fa |
| BLAKE2b-256 | ea5e59213dfabc9464647b13c340542c73ebc4b9997c3f90466a4c131faaadab |
File details
Details for the file ksfeatureselector-0.1.1-py3-none-any.whl.
File metadata
- Download URL: ksfeatureselector-0.1.1-py3-none-any.whl
- Upload date:
- Size: 4.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2cce387dd25670d874fea39c0034fdfc1e66d111cf801839f2a3c3cb8a6c6d17 |
| MD5 | d493681908adca9eb9466b70e5ef6370 |
| BLAKE2b-256 | 3a720f641ee45ec3a2a36af08ab42b65005ea8bc2e25d047761bd8ee0f5e6fd4 |