
A feature selection method that identifies the features that best segregate classes, based on their underlying probability density estimates

Project description

PDE-Seg (PDE-Segregate) is a univariate filter method for feature selection. Its filter measure ranks features by how well they segregate the probability density estimates (PDEs) of the class samples.
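
To make the idea concrete, here is a minimal sketch of an overlap-based score for a single feature. It is not the PDE-Seg implementation: the function names `kde_overlap_area` and `rank_by_overlap`, the use of `scipy.stats.gaussian_kde`, the min-max scaling, and the evaluation grid are all assumptions chosen for illustration. The sketch estimates one density per class and measures the area under their pointwise minimum, so a smaller area means better-segregated classes.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_overlap_area(x, y, n_grid=512):
    """Area shared by all per-class density estimates of one feature.

    Illustrative assumption, not the package's measure. Expects a
    non-constant feature with several samples per class.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    # Min-max scale the feature to [0, 1] so scores are comparable
    x = (x - x.min()) / (x.max() - x.min())
    # Evaluate the densities on a grid that also covers the KDE tails
    grid = np.linspace(-0.5, 1.5, n_grid)
    densities = np.array(
        [gaussian_kde(x[y == label])(grid) for label in np.unique(y)]
    )
    shared = densities.min(axis=0)
    # Trapezoidal rule for the area under the shared density
    return float(np.sum(0.5 * (shared[1:] + shared[:-1]) * np.diff(grid)))


def rank_by_overlap(X, y):
    """Return feature indices ordered from least to most class overlap."""
    areas = [kde_overlap_area(X[:, j], y) for j in range(X.shape[1])]
    return np.argsort(areas)
```

A feature whose class densities barely touch gets an area near 0, while a feature with identical class densities gets an area near 1.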

Install

PDE-Seg can be installed from PyPI:

pip install pdeseg

Example

from pdeseg import PDE_Segregate

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification

# Create random classification dataset
X, y = make_classification(
    n_samples=300, n_features=50, n_classes=3, n_informative=5,
    shuffle=False
)

# Initialize PDE-Segregate object
pdeRanker = PDE_Segregate()

# Carry out feature selection
pdeRanker.fit(X, y)

# Get top 10 features
top10Features = pdeRanker.get_topnFeatures(10)

# Visualize the top feature's ability to segregate PDEs
fig, axs = plt.subplots(1, 2, sharey=True)

# Top ranked feature
pdeRanker.plot_overlapAreas(top10Features[0], legend="intersection", _ax=axs[0])
axs[0].set_title("Most relevant feature", loc="left")

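# A noise feature for comparison: with shuffle=False, make_classification
# places the non-informative features last, so index 49 is one of them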
pdeRanker.plot_overlapAreas(49, legend="intersection", _ax=axs[1])
axs[1].set_title("Least relevant feature", loc="left")

axs[0].set_ylabel(r"Probability Density, $\hat{P}$")
for i in range(2):
    axs[i].set_xlim(-0.5, 1.5)
    axs[i].set_xticks(np.arange(-0.5, 2.0, 0.5))

plt.show()

Check out the notebooks provided as tutorials and examples of some specific use cases.
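
Once features are ranked, a typical next step is to keep only the selected columns. The sketch below assumes that the ranking result is an array-like of column indices, which is how the example above uses `top10Features[0]`. The index list `top_features` is a hypothetical stand-in, not actual pdeseg output.

```python
import numpy as np

# Stand-in data: 6 samples, 5 features
X = np.arange(30).reshape(6, 5)

# Hypothetical ranking result: column indices, most relevant first
top_features = [3, 0]

# Integer-array indexing keeps the selected columns in ranked order
X_selected = X[:, np.asarray(top_features, dtype=int)]
print(X_selected.shape)  # (6, 2)
```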

Citation

For now, please cite the following abstract:

J.C. Liaw, F. Geu Flores. A novel univariate feature selection filter-measure based on the reduction of class overlapping. 94th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM), Magdeburg, Germany, 18-22 March 2024, Oral Presentation S25.01-4.

Available in the Book of Abstracts of the 94th Annual Meeting of the International Association of Applied Mathematics and Mechanics, p. 363.

The other feature selection methods that PDE-Seg was compared against in our paper are listed below:

  1. LH-RELIEF: Feature weight estimation for gene selection: a local hyperlinear learning approach DOI: https://doi.org/10.1186/1471-2105-15-70

  2. I-RELIEF: Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications DOI: https://doi.org/10.1109/TPAMI.2007.1093

  3. RELIEF-F: Estimating attributes: Analysis and extensions of RELIEF DOI: https://doi.org/10.1007/3-540-57868-4_57

  4. MultiSURF: Benchmarking relief-based feature selection methods for bioinformatics data mining DOI: https://doi.org/10.1016/j.jbi.2018.07.015

  5. Random Forests DOI: https://doi.org/10.1023/A:1010933404324

  6. ANOVA F-statistic: Statistical Methods for Research Workers

  7. Mutual Information: Estimating mutual information DOI: https://doi.org/10.1103/PhysRevE.69.066138
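
Three of the methods listed above (Random Forests, the ANOVA F-statistic, and mutual information) have counterparts in scikit-learn, so baseline rankings can be sketched on the same kind of synthetic dataset as in the example. Whether the paper used these particular scikit-learn functions or settings is an assumption, and the helper `top_n` and the `random_state` values are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif

# Same kind of dataset as in the example, seeded for reproducibility
X, y = make_classification(
    n_samples=300, n_features=50, n_classes=3, n_informative=5,
    shuffle=False, random_state=0
)

# One relevance score per feature from each baseline
f_scores, _ = f_classif(X, y)                           # ANOVA F-statistic
mi_scores = mutual_info_classif(X, y, random_state=0)   # mutual information
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
rf_scores = forest.feature_importances_                 # impurity-based


def top_n(scores, n=10):
    """Indices of the n highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:n]


baselines = {
    "ANOVA F": top_n(f_scores),
    "Mutual information": top_n(mi_scores),
    "Random forest": top_n(rf_scores),
}
for name, indices in baselines.items():
    print(f"{name}: {indices}")
```

Because `shuffle=False` places the informative and redundant features in the first columns, low indices near the top of each list are a quick sanity check for a ranking.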
