ProD: A visualizable filter-feature selection method based on prodding the class probability densities for overlapping

ProD is a visualizable filter feature-selection method based on "prodding" the class probability densities for overlap.

Install

ProD can be installed from PyPI:

pip install prod-fs
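
To confirm that the installation succeeded, the standard library can be used to look up the installed distribution. The following is a minimal sketch; the helper name `installed_version` is an illustrative choice and is not part of ProD.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "prod-fs" is the distribution name used in the pip command above.
version = installed_version("prod-fs")
if version is None:
    print("prod-fs is not installed")
else:
    print(f"prod-fs {version} is installed")
```

The distribution name (`prod-fs`) differs from the import name (`prodfs`), so the lookup uses the former.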

Example

from prodfs import ProD

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification

# Create random classification dataset
X, y = make_classification(
    n_samples=300, n_features=50, n_classes=3, n_informative=5,
    shuffle=False
)

# Initialize ProD object
prodRanker = ProD()

# Carry out feature selection
prodRanker.fit(X, y)

# Get top 10 features
top10Features = prodRanker.get_topnFeatures(10)

# Visualize the top feature's ability to segregate PDEs
fig, axs = plt.subplots(1, 2, sharey=True)

# Top ranked feature
prodRanker.plot_overlapAreas(top10Features[0], legend="intersection", _ax=axs[0])
axs[0].set_title("Most relevant feature", loc="left")

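# Column 49 is a pure-noise feature here (shuffle=False keeps the informative
# columns first), so it stands in for the last-ranked feature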
prodRanker.plot_overlapAreas(49, legend="intersection", _ax=axs[1])
axs[1].set_title("Least relevant feature", loc="left")

axs[0].set_ylabel(r"Probability Density, $\hat{P}$")
for i in range(2):
    axs[i].set_xlim(-0.5, 1.5)
    axs[i].set_xticks(np.arange(-0.5, 2.0, 0.5))

# Display the figure (needed when running as a script)
plt.show()

Check out the notebooks provided as tutorials and examples of some specific use cases.
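
To give some intuition for what "class overlap" means, the sketch below estimates the area shared by two class-conditional densities of a single feature. It is an illustration only, not ProD's implementation. The Gaussian kernel, the fixed bandwidth, the synthetic data, and the helper names `gaussian_kde` and `overlap_area` are all assumptions made for this example.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def overlap_area(samples_a, samples_b, bandwidth=0.3, n_grid=400):
    """Approximate the area under min(p_a, p_b) with a midpoint Riemann sum."""
    lo = min(min(samples_a), min(samples_b)) - 3.0 * bandwidth
    hi = max(max(samples_a), max(samples_b)) + 3.0 * bandwidth
    step = (hi - lo) / n_grid
    p_a = gaussian_kde(samples_a, bandwidth)
    p_b = gaussian_kde(samples_b, bandwidth)
    total = 0.0
    for i in range(n_grid):
        x = lo + (i + 0.5) * step
        total += min(p_a(x), p_b(x))
    return total * step


rng = random.Random(0)
# Hypothetical one-dimensional feature values for two classes.
class_0 = [rng.gauss(0.0, 1.0) for _ in range(200)]
separated = [rng.gauss(4.0, 1.0) for _ in range(200)]  # classes far apart
mixed = [rng.gauss(0.2, 1.0) for _ in range(200)]      # classes nearly identical

area_separated = overlap_area(class_0, separated)
area_mixed = overlap_area(class_0, mixed)
print(f"overlap (separated classes): {area_separated:.3f}")
print(f"overlap (mixed classes):     {area_mixed:.3f}")
```

In this toy setting, a feature whose class densities barely overlap separates the classes well, while a feature with nearly coincident densities carries little class information.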

Citation

For now, please cite the following abstract:

J.C. Liaw, F. Geu Flores. A novel univariate feature selection filter-measure based on the reduction of class overlapping. 94th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM), Magdeburg, Germany, 18-22 March 2024. Oral presentation S25.01-4.

Available in the Book of Abstracts of the 94th Annual Meeting of the International Association of Applied Mathematics and Mechanics, p. 363.

The other feature selection methods that ProD was compared against in our paper are listed below:

  1. LH-RELIEF: Feature weight estimation for gene selection: a local hyperlinear learning approach DOI: https://doi.org/10.1186/1471-2105-15-70

  2. I-RELIEF: Iterative RELIEF for Feature Weighting: Algorithms, Theories, and Applications DOI: https://doi.org/10.1109/TPAMI.2007.1093

  3. RELIEF-F: Estimating attributes: Analysis and extensions of RELIEF DOI: https://doi.org/10.1007/3-540-57868-4_57

  4. MultiSURF: Benchmarking relief-based feature selection methods for bioinformatics data mining DOI: https://doi.org/10.1016/j.jbi.2018.07.015

  5. Random Forests DOI: https://doi.org/10.1023/A:1010933404324

  6. ANOVA F-statistic: Statistical Methods for Research Workers

  7. Mutual Information: Estimating mutual information DOI: https://doi.org/10.1103/PhysRevE.69.066138
