A feature selection method based on identifying features that best segregate classes via their underlying probability density estimates
Project description
PDE-Seg (PDE-Segregate) is a univariate filter feature selection method based on a filter-measure that ranks features according to their ability to segregate the probability density estimates (PDE) of the class samples.
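The idea of scoring a feature by how well its class-wise density estimates separate can be illustrated with a minimal sketch. This is not the package's implementation: the function name `overlap_area`, the min-max scaling, the evaluation grid, and the use of SciPy's `gaussian_kde` are all assumptions made for illustration. The sketch estimates one density per class and measures the area common to all of them; a smaller area suggests a feature that segregates the classes better.

```python
import numpy as np
from scipy.stats import gaussian_kde


def overlap_area(x, y, grid_size=512):
    """Area shared by all class-wise density estimates of one feature.

    Hypothetical helper for illustration; assumes x is not constant.
    """
    # Rescale the feature to [0, 1] so every feature uses the same grid
    x = (x - x.min()) / (x.max() - x.min())
    # Evaluate slightly beyond the data range to capture the KDE tails
    grid = np.linspace(-0.5, 1.5, grid_size)
    densities = [gaussian_kde(x[y == c])(grid) for c in np.unique(y)]
    # The pointwise minimum is the region lying under every class density
    common = np.min(densities, axis=0)
    # Trapezoidal integration of the common region
    return float(np.sum((common[1:] + common[:-1]) / 2 * np.diff(grid)))


rng = np.random.default_rng(0)
y = np.repeat([0, 1], 200)
# One feature whose classes are far apart, one with no class information
separated = np.concatenate([rng.normal(0, 1, 200), rng.normal(6, 1, 200)])
mixed = rng.normal(0, 1, 400)

print("separated:", overlap_area(separated, y))
print("mixed:    ", overlap_area(mixed, y))
```

Ranking features by ascending overlap would then put the most class-separating features first. How PDE-Seg itself aggregates overlaps across more than two classes may differ from this sketch.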
Install
PDE-Seg can be installed from PyPI:
pip install pdeseg
Example
from pdeseg import PDE_Segregate
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
# Create random classification dataset
X, y = make_classification(
    n_samples=300, n_features=50, n_classes=3, n_informative=5,
    shuffle=False
)
# Initialize PDE-Segregate object
pdeRanker = PDE_Segregate()
# Carry out feature selection
pdeRanker.fit(X, y)
# Get top 10 features
top10Features = pdeRanker.get_topnFeatures(10)
# Visualize the top feature's ability to segregate PDEs
fig, axs = plt.subplots(1, 2, sharey=True)
# Top ranked feature
pdeRanker.plot_overlapAreas(top10Features[0], legend="intersection", _ax=axs[0])
axs[0].set_title("Most relevant feature", loc="left")
# Feature at index 49 (a noise feature: with shuffle=False the informative
# features occupy the first columns)
pdeRanker.plot_overlapAreas(49, legend="intersection", _ax=axs[1])
axs[1].set_title("Least relevant feature", loc="left")
axs[0].set_ylabel(r"Probability Density, $\hat{P}$")
for i in range(2):
    axs[i].set_xlim(-0.5, 1.5)
    axs[i].set_xticks(np.arange(-0.5, 2.0, 0.5))
# Display the figure when running as a script
plt.show()
Check out the notebooks provided as tutorials and examples of some specific use cases.
Citation
For now, please cite the following abstract:
J.C. Liaw, F. Geu Flores. A novel univariate feature selection filter-measure based on the reduction of class overlapping. 94th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM), Magdeburg, Germany, 18-22 March 2024, Oral Presentation S25.01-4.
The other feature selection methods that PDE-Seg was compared against in our paper are listed below:
- LH-RELIEF: Feature weight estimation for gene selection: a local hyperlinear learning approach. DOI: https://doi.org/10.1186/1471-2105-15-70
- I-RELIEF: Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications. DOI: https://doi.org/10.1109/TPAMI.2007.1093
- RELIEF-F: Estimating attributes: Analysis and extensions of RELIEF. DOI: https://doi.org/10.1007/3-540-57868-4_57
- MultiSURF: Benchmarking relief-based feature selection methods for bioinformatics data mining. DOI: https://doi.org/10.1016/j.jbi.2018.07.015
- Random Forests. DOI: https://doi.org/10.1023/A:1010933404324
- ANOVA F-statistic: Statistical Methods for Research Workers
- Mutual Information: Estimating mutual information. DOI: https://doi.org/10.1103/PhysRevE.69.066138
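Three of the baselines listed above (ANOVA F-statistic, mutual information, and random forests) have counterparts in scikit-learn, so a comparable ranking can be sketched on the same kind of synthetic dataset used in the example. This is an illustrative sketch only: the settings below (`random_state`, `n_estimators`, the hypothetical helper `top_n`) are assumptions and not necessarily the configuration used in the paper. The RELIEF variants are not part of scikit-learn and are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif

X, y = make_classification(
    n_samples=300, n_features=50, n_classes=3, n_informative=5,
    shuffle=False, random_state=0,
)

# ANOVA F-statistic: one score per feature, higher means more relevant
f_scores, _ = f_classif(X, y)

# Mutual information, estimated with a nearest-neighbour method
mi_scores = mutual_info_classif(X, y, random_state=0)

# Random forest: impurity-based feature importances
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
rf_scores = forest.feature_importances_


def top_n(scores, n=10):
    """Indices of the n highest-scoring features, best first (hypothetical helper)."""
    return np.argsort(scores)[::-1][:n]


for name, scores in [
    ("ANOVA F-statistic", f_scores),
    ("Mutual information", mi_scores),
    ("Random forest", rf_scores),
]:
    print(f"{name}: {top_n(scores)}")
```

The resulting index lists can be compared with the output of `pdeRanker.get_topnFeatures(10)` from the example to see where the rankings agree.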