Short description
Project description
Feature Selection with Hybrid TFFS
Description
The get_list_feature_tffs_hybrid function selects the features that occur most frequently across repeated Random Forest runs. It can optionally combine that result with a traditional feature selection method.
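The frequency idea can be illustrated with a minimal sketch. It is not the package's implementation: the helper name select_by_frequency, the hand-written per-run feature lists, and the rounding rule (round up, keep at least one feature) are all assumptions made for illustration.

```python
from collections import Counter
import math


def select_by_frequency(runs, percent):
    """Return the features that appear most often across runs.

    runs    -- list of lists; each inner list holds the features that one
               (hypothetical) Random Forest run marked as important
    percent -- share of all distinct features to keep, from 0 to 100
    """
    # Count in how many runs each feature appears (once per run at most).
    counts = Counter(feature for run in runs for feature in set(run))
    # Assumed rounding rule: round up and always keep at least one feature.
    keep = max(1, math.ceil(len(counts) * percent / 100))
    # Sort by descending frequency, then by name so ties are deterministic.
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return ranked[:keep]


# Hand-written stand-in for the output of four Random Forest runs.
runs = [
    ["Feature1", "Feature3"],
    ["Feature3"],
    ["Feature2", "Feature3"],
    ["Feature1", "Feature3"],
]
print(select_by_frequency(runs, percent=50))  # ['Feature3', 'Feature1']
```

Here Feature3 appears in all four runs and Feature1 in two, so keeping 50% of the three distinct features (rounded up to two) returns those two names.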
Syntax
get_list_feature_tffs_hybrid(df, number_of_runs, n_estimators, percent, type=None, percent_hybrid=None)
Parameters
| Parameter | Data Type | Description |
|---|---|---|
| df | DataFrame | DataFrame containing input data (dependent variable in the first column). |
| number_of_runs | int | Number of Random Forest runs to determine important feature frequencies. |
| n_estimators | int | Number of trees in the Random Forest. |
| percent | float | Percentage of features selected based on the highest occurrence frequency. |
| type | str, optional | (Optional) Traditional feature selection method for integration. |
| percent_hybrid | float, optional | (Optional) Percentage of features retained after hybrid selection. |
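Because df must carry the dependent variable in its first column, it can help to reorder the columns before calling the function. The sketch below works on a plain list of column names; class_first is a hypothetical helper and is not part of the sff package.

```python
def class_first(columns, target):
    """Return the column names with `target` moved to the front."""
    if target not in columns:
        raise ValueError(f"column {target!r} not found")
    # Keep the remaining columns in their original order.
    return [target] + [c for c in columns if c != target]


columns = ["Feature1", "Feature2", "Class", "Feature3"]
print(class_first(columns, "Class"))
# ['Class', 'Feature1', 'Feature2', 'Feature3']
```

With pandas, the reordered list can be applied as df = df[class_first(list(df.columns), "Class")].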
Valid Values for type
If type is used, one of the following feature selection methods can be chosen:
| type Value | Feature Selection Method |
|---|---|
| "MI" | Mutual Information |
| "PC" | Pearson Correlation |
| "FS" | Fisher Score |
| "BW" | Backward Selection |
| "FW" | Forward Selection |
| "RC" | Recursive Feature Elimination (RFE) |
| "LS" | Lasso Regression |
Usage
Using Random Forest Only for Feature Selection
import pandas as pd
from sff.app import get_list_feature_tffs_hybrid
# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)
# Run function
selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=20)
print("Selected Features:", selected_features)
Integrating with Mutual Information (MI)
selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=10, n_estimators=100, percent=75, type="MI", percent_hybrid=50)
print("Selected Features:", selected_features)
Integrating with Recursive Feature Elimination (RFE)
selected_features = get_list_feature_tffs_hybrid(df, number_of_runs=15, n_estimators=200, percent=75, type="RC", percent_hybrid=50)
print("Selected Features:", selected_features)
Integrating with Mutual Information (MI) - Example Code
import pandas as pd
from sff.app import get_list_feature_tffs_hybrid
# Sample DataFrame
data = {
    "Class": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "Feature1": [5, 8, 6, 7, 5, 8, 6, 7, 5, 8],
    "Feature2": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "Feature3": [10, 20, 10, 20, 10, 20, 10, 20, 10, 20]
}
df = pd.DataFrame(data)
# Run function
selected_features = get_list_feature_tffs_hybrid(df, 5, 20, 75, "MI", 50)
print("Selected Features:", selected_features)
Notes
- If type is not provided, the function will use only Random Forest for feature selection.
- If type is provided, the function will integrate Random Forest with the specified traditional method for optimal feature selection.
- percent_hybrid is applicable only when type is used.
- Note: percent (4th parameter) must be greater than percent_hybrid.
- Note: df must have the class column as the first column.
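The argument rules in these notes can be checked before the call with a small guard. This is a sketch only: check_hybrid_args and VALID_TYPES are hypothetical names, and the package itself may respond differently to invalid combinations (for example, it may silently ignore percent_hybrid when type is omitted).

```python
# Codes taken from the "Valid Values for type" table.
VALID_TYPES = {"MI", "PC", "FS", "BW", "FW", "RC", "LS"}


def check_hybrid_args(percent, method=None, percent_hybrid=None):
    """Raise ValueError if the arguments break the documented rules.

    `method` stands for the function's `type` argument; it is renamed
    here to avoid shadowing the Python builtin.
    """
    if method is None:
        # percent_hybrid applies only when a method is given.
        if percent_hybrid is not None:
            raise ValueError("percent_hybrid applies only when type is set")
        return
    if method not in VALID_TYPES:
        raise ValueError(f"unknown type {method!r}")
    # percent must be strictly greater than percent_hybrid.
    if percent_hybrid is not None and percent <= percent_hybrid:
        raise ValueError("percent must be greater than percent_hybrid")


check_hybrid_args(75, "MI", 50)   # passes: 75 > 50 and "MI" is valid
check_hybrid_args(20)             # passes: Random Forest only
```

Calling check_hybrid_args(50, "MI", 75) raises ValueError, because percent is not greater than percent_hybrid.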