Short description

Project description

Feature Selection with Hybrid TFFS

Description

The get_list_feature_tffs_hybrid function selects features by how frequently they occur while Random Forests are built over repeated runs. It can optionally combine this frequency-based selection with a traditional feature selection method.
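To make the idea of frequency-based selection concrete, here is a minimal sketch in plain Python. It is not the package's implementation: the helper name select_by_frequency, the hand-written runs input, and the rounding rule (round up, keep at least one feature) are all assumptions made for illustration.

```python
from collections import Counter
import math

def select_by_frequency(runs, percent):
    """Return the features that occur most often across runs.

    runs    -- list of lists; each inner list names the features that one
               run marked as important (hypothetical input format)
    percent -- share of distinct features to keep, on a 0-100 scale
    """
    # Count each feature at most once per run.
    counts = Counter(feature for run in runs for feature in set(run))
    # Assumption: round up and always keep at least one feature.
    keep = max(1, math.ceil(len(counts) * percent / 100))
    # Sort by descending count, then by name so ties break deterministically.
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return ranked[:keep]

# Hand-written stand-in for the output of four Random Forest runs.
runs = [
    ["Feature1", "Feature3"],
    ["Feature3"],
    ["Feature2", "Feature3"],
    ["Feature1", "Feature3"],
]
print(select_by_frequency(runs, percent=50))  # ['Feature3', 'Feature1']
```

Feature3 appears in all four runs and Feature1 in two, so keeping 50% of the three distinct features (rounded up to two) returns those two names.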

Syntax

get_list_feature_tffs_hybrid(df, number_of_runs, n_estimators, percent, type=None, percent_hybrid=None)

Parameters

Parameter | Data Type | Description
df | DataFrame | Input data, with the dependent (class) variable in the first column.
number_of_runs | int | Number of Random Forest runs used to count how often each feature is important.
n_estimators | int | Number of trees in each Random Forest.
percent | float | Percentage of features to select, taken from those with the highest occurrence frequency (the examples below pass values such as 20 and 75).
type | str, optional | Traditional feature selection method to integrate with.
percent_hybrid | float, optional | Percentage of features retained after hybrid selection.
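Because df is expected to carry the class variable in its first column, data with the target elsewhere needs its columns reordered first. The sketch below shows one way to compute that order; class_first is a hypothetical helper and is not part of the sff package.

```python
def class_first(columns, target):
    """Return the column names with `target` moved to the front."""
    if target not in columns:
        raise ValueError(f"column {target!r} not found")
    # Keep the remaining columns in their original relative order.
    return [target] + [c for c in columns if c != target]

columns = ["Feature1", "Feature2", "Class", "Feature3"]
ordered = class_first(columns, "Class")
print(ordered)  # ['Class', 'Feature1', 'Feature2', 'Feature3']
```

With a pandas DataFrame, the resulting order can be applied with df = df[ordered] before calling get_list_feature_tffs_hybrid.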

Valid Values for type

If type is provided, it must be one of the following codes:

type Value | Feature Selection Method
"MI" | Mutual Information
"PC" | Pearson Correlation
"FS" | Fisher Score
"BW" | Backward Selection
"FW" | Forward Selection
"RC" | Recursive Feature Elimination (RFE)
"LS" | Lasso Regression

Usage

Using Random Forest Only for Feature Selection

import pandas as pd
from sff.app import get_list_feature_tffs_hybrid

# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)

# Run function
selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=20)
print("Selected Features:", selected_features)

Integrating with Mutual Information (MI)

selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=75, type="MI", percent_hybrid=50)
print("Selected Features:", selected_features)

Integrating with Recursive Feature Elimination (RFE)

selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=15, n_estimators=200, percent=75, type="RC", percent_hybrid=50)
print("Selected Features:", selected_features)

Integrating with Mutual Information (MI): Complete Example with Positional Arguments

import pandas as pd
from sff.app import get_list_feature_tffs_hybrid

# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)

# Run function with positional arguments:
# df, number_of_runs, n_estimators, percent, type, percent_hybrid
selected_features = get_list_feature_tffs_hybrid(df, 5, 20, 75, "MI", 50)
print("Selected Features:", selected_features)

Notes

  • If type is not provided, the function uses only Random Forest for feature selection.
  • If type is provided, the function combines Random Forest with the specified traditional method.
  • percent_hybrid applies only when type is provided.
  • percent (the 4th parameter) must be greater than percent_hybrid.
  • The class column must be the first column of df.
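The argument rules in these notes can be checked before calling the function. The following is a sketch only: check_tffs_args is a hypothetical helper, its method parameter stands in for the type argument (renamed to avoid shadowing Python's built-in type), and whether the package itself performs these checks is not stated here.

```python
# Codes listed under "Valid Values for type".
VALID_TYPES = {"MI", "PC", "FS", "BW", "FW", "RC", "LS"}

def check_tffs_args(percent, method=None, percent_hybrid=None):
    """Raise ValueError if the arguments break the documented rules."""
    if method is None:
        # percent_hybrid is meaningful only together with a method.
        if percent_hybrid is not None:
            raise ValueError("percent_hybrid applies only when type is given")
        return
    if method not in VALID_TYPES:
        raise ValueError(f"unknown type {method!r}; expected one of {sorted(VALID_TYPES)}")
    # percent must be strictly greater than percent_hybrid.
    if percent_hybrid is not None and percent <= percent_hybrid:
        raise ValueError("percent must be greater than percent_hybrid")

check_tffs_args(percent=75, method="MI", percent_hybrid=50)  # passes silently

try:
    check_tffs_args(percent=50, method="MI", percent_hybrid=75)
except ValueError as err:
    print("Rejected:", err)
```

Running such a check first turns a silent misconfiguration into an explicit error message.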
