Short description

Project description

Feature Selection with Hybrid TFFS

Description

The get_list_feature_tffs_hybrid function selects features by how frequently they occur across repeated Random Forest constructions. It can optionally combine this frequency-based selection with a traditional feature selection method.
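
The frequency idea described above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: the helper name select_by_frequency, the per-run feature lists, and the rounding rule (round up, keep at least one feature) are all assumptions made for this example.

```python
import math
from collections import Counter


def select_by_frequency(runs, percent):
    """Return the features that occur most often across runs.

    runs    -- list of lists; each inner list holds the feature names that
               one (hypothetical) Random Forest run reported as used.
    percent -- share of all distinct features to keep, from 0 to 100.
    """
    counts = Counter(name for run in runs for name in run)
    # Assumption: round up and always keep at least one feature.
    keep = max(1, math.ceil(len(counts) * percent / 100))
    # Sort by descending count, then by name so ties break deterministically.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    return ranked[:keep]


# Hypothetical output of four runs over the sample feature names.
runs = [
    ["Feature1", "Feature3"],
    ["Feature3"],
    ["Feature2", "Feature3"],
    ["Feature1", "Feature3"],
]
print(select_by_frequency(runs, percent=50))  # ['Feature3', 'Feature1']
```

Here Feature3 appears in all four runs and Feature1 in two, so keeping 50% of the three distinct features (rounded up to two) returns those two names.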

Syntax

get_list_feature_tffs_hybrid(df, number_of_runs, n_estimators, percent, type=None, percent_hybrid=None)

Parameters

Parameter        Data Type        Description
df               DataFrame        Input data, with the dependent (class) variable in the first column.
number_of_runs   int              Number of Random Forest runs used to count how often each feature occurs.
n_estimators     int              Number of trees in each Random Forest.
percent          float            Percentage of features to select, taken from those with the highest occurrence frequency.
type             str, optional    Traditional feature selection method to combine with Random Forest.
percent_hybrid   float, optional  Percentage of features retained after hybrid selection.

Valid Values for type

If type is used, one of the following feature selection methods can be chosen:

type Value   Feature Selection Method
"MI"         Mutual Information
"PC"         Pearson Correlation
"FS"         Fisher Score
"BW"         Backward Selection
"FW"         Forward Selection
"RC"         Recursive Feature Elimination (RFE)
"LS"         Lasso Regression

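Because the type codes are short strings, a typo is easy to miss. The lookup below is a small sketch for checking a code before calling the function. The names TYPE_METHODS and describe_type are hypothetical and are not part of the sff package; only the codes and method names come from the table above.

```python
# Codes and method names copied from the "Valid Values for type" table.
TYPE_METHODS = {
    "MI": "Mutual Information",
    "PC": "Pearson Correlation",
    "FS": "Fisher Score",
    "BW": "Backward Selection",
    "FW": "Forward Selection",
    "RC": "Recursive Feature Elimination (RFE)",
    "LS": "Lasso Regression",
}


def describe_type(type_code):
    """Return the method name for a type code, or raise ValueError."""
    if type_code is None:
        # No traditional method requested.
        return "Random Forest only"
    if type_code not in TYPE_METHODS:
        valid = ", ".join(sorted(TYPE_METHODS))
        raise ValueError(f"Unknown type {type_code!r}; expected one of: {valid}")
    return TYPE_METHODS[type_code]


print(describe_type("RC"))  # Recursive Feature Elimination (RFE)
```
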
Usage

Using Random Forest Only for Feature Selection

import pandas as pd
from sff.app import get_list_feature_tffs_hybrid

# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)

# Run function
selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=20)
print("Selected Features:", selected_features)

Integrating with Mutual Information (MI)

selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=75, type="MI", percent_hybrid=50)
print("Selected Features:", selected_features)

Integrating with Recursive Feature Elimination (RFE)

selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=15, n_estimators=200, percent=50, type="RC", percent_hybrid=40)
print("Selected Features:", selected_features)

Integrating with Mutual Information (MI) - Example Code

import pandas as pd
from sff.app import get_list_feature_tffs_hybrid

# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)

# Run function
selected_features = get_list_feature_tffs_hybrid(df, 5, 20, 75, "MI", 50)
print("Selected Features:", selected_features)

Notes

  • If type is not provided, the function uses only Random Forest for feature selection.
  • If type is provided, the function combines Random Forest with the specified traditional method.
  • percent_hybrid applies only when type is provided.
  • percent (the 4th parameter) must be greater than percent_hybrid.
  • df must have the class column as its first column.
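
The constraints in these notes can be checked before calling the function. The sketch below is one possible pre-check, assuming nothing about how the package itself validates input: check_tffs_arguments and class_first are hypothetical helpers, and the parameter is named method here only to avoid shadowing Python's built-in type.

```python
def check_tffs_arguments(columns, class_column, percent,
                         method=None, percent_hybrid=None):
    """Return a list of problems found; an empty list means all notes hold."""
    problems = []
    # The class column must be the first column of df.
    if not columns or columns[0] != class_column:
        problems.append(f"{class_column!r} must be the first column")
    # percent_hybrid is applicable only when a type is given.
    if method is None and percent_hybrid is not None:
        problems.append("percent_hybrid is only applicable when type is given")
    # percent must be greater than percent_hybrid.
    if percent_hybrid is not None and percent <= percent_hybrid:
        problems.append("percent must be greater than percent_hybrid")
    return problems


def class_first(columns, class_column):
    """Return the column names reordered so class_column comes first."""
    return [class_column] + [c for c in columns if c != class_column]


print(check_tffs_arguments(["Class", "Feature1"], "Class", 75, "MI", 50))  # []
print(class_first(["Feature1", "Class", "Feature2"], "Class"))
```

With pandas, the reordered names can be applied as df = df[class_first(list(df.columns), "Class")] before passing df to get_list_feature_tffs_hybrid.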

Download files

Source Distribution

sff-0.1.0.tar.gz (7.0 kB)

Uploaded Source

Built Distribution

sff-0.1.0-py3-none-any.whl (6.2 kB)

Uploaded Python 3

File details

Details for the file sff-0.1.0.tar.gz.

File metadata

  • Download URL: sff-0.1.0.tar.gz
  • Upload date:
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.5

File hashes

Hashes for sff-0.1.0.tar.gz
Algorithm     Hash digest
SHA256        f28e487baeb819136cad84362aeadb24efcebd1f38f0b3f4498a980da220f835
MD5           1f67de7f6e726297428a50a8a55126d2
BLAKE2b-256   23433e0155637363d16f50762814cefc1b8ec9dd259c38dd0a308bf0d23d8d17

File details

Details for the file sff-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sff-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.5

File hashes

Hashes for sff-0.1.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        98d1831638851628f7ac5fb1afa289be3f876e5db630ebefee578541cb5578a2
MD5           cd9a61c454e660af9922e845f9d7a2eb
BLAKE2b-256   dd984a03da89b8d7d6d3a5e702a8a588fe2f2f6952d56b03cc46209252ed570c
