An end-to-end feature selection distribution whose runtime complexity is linear in the number of features.

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

Project description

GFS Network

Gumbel Feature Selection Network is a deep learning model that can be used to select the most important features from a given dataset. The model is based on the Gumbel-Sigmoid distribution.
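
For readers unfamiliar with the Gumbel-Sigmoid distribution, the following is a minimal sketch of how a relaxed binary gate can be sampled from it. It is a generic illustration, not the package's internal code: the function names, the example logits, and the temperature value are all hypothetical.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample by inverse transform sampling."""
    # Clamp u away from 0 and 1 so both logarithms stay finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gumbel_sigmoid(logit, temperature, rng):
    """Sample a soft gate between 0 and 1 for a single feature."""
    # The difference of two Gumbel samples is logistic noise.
    noise = sample_gumbel(rng) - sample_gumbel(rng)
    return sigmoid((logit + noise) / temperature)


rng = random.Random(0)
logits = [3.0, 0.0, -3.0]  # hypothetical per-feature logits
gates = [gumbel_sigmoid(l, temperature=0.5, rng=rng) for l in logits]
mask = [g > 0.5 for g in gates]  # hard keep/drop decision per feature
print(gates, mask)
```

A higher logit makes a gate open more often, and a lower temperature pushes the sampled gates closer to 0 or 1.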

Installation

To install the package, you can use pip:

pip install gfs_network

Usage examples

Basic usage

from gfs_network import GFSNetwork
from sklearn.datasets import load_breast_cancer

breast = load_breast_cancer()
X = breast.data
y = breast.target

gfs = GFSNetwork()
X = gfs.fit_transform(X, y)

print(gfs.support_)
print(gfs.scores_)
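
The two attributes printed above can be combined to list which features were kept. The sketch below assumes that support_ is a boolean mask with one entry per input column and that scores_ holds one score per column, following the scikit-learn selector convention. The source does not spell this out, and the helper name and toy values are made up for illustration.

```python
def rank_selected(feature_names, support, scores):
    """Return (name, score) pairs for kept features, highest score first."""
    kept = [
        (name, score)
        for name, keep, score in zip(feature_names, support, scores)
        if keep
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for breast.feature_names, gfs.support_ and gfs.scores_.
names = ["mean radius", "mean texture", "mean area", "mean smoothness"]
support = [True, False, True, False]
scores = [0.72, 0.10, 0.91, 0.05]

for name, score in rank_selected(names, support, scores):
    print(f"{name}: {score:.2f}")
```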

Performance verification

from gfs_network import GFSNetwork
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score

DEVICE = "cpu"

breast = load_breast_cancer()
X = breast.data
y = breast.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train, y_train)  # scikit-learn estimators are trained with fit(), not train()
orig_score = balanced_accuracy_score(y_test, clf.predict(X_test))

print(f"Original score: {orig_score:.3f}. Original features: {X.shape[1]}")
# Original score: 0.958. Original features: 30

gfs = GFSNetwork(verbose=True, device=DEVICE)
gfs.fit(X_train, y_train)

X_transformed = gfs.transform(X_train)
X_test_transformed = gfs.transform(X_test)

clf.fit(X_transformed, y_train)
y_pred = clf.predict(X_test_transformed)
score = balanced_accuracy_score(y_test, y_pred)
print(f"Score after feature selection: {score:.3f}. Selected features: {sum(gfs.support_)}")
# Score after feature selection: 0.958. Selected features: 3
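
The comparison above uses balanced accuracy, which is the mean of per-class recall, so a classifier cannot score well simply by favouring the majority class. A minimal pure-Python sketch of that definition follows. The function name and toy labels are illustrative only, and sklearn.metrics.balanced_accuracy_score remains the function to use in practice.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy labels: class 0 recall is 3/4, class 1 recall is 1/2.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(balanced_accuracy(y_true, y_pred))
```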

Download files

Source Distribution

gfs_network-0.3.0.tar.gz (4.8 kB)

Uploaded Source

Built Distribution

gfs_network-0.3.0-py3-none-any.whl (6.0 kB)

Uploaded Python 3

File details

Details for the file gfs_network-0.3.0.tar.gz.

File metadata

  • Download URL: gfs_network-0.3.0.tar.gz
  • Upload date:
  • Size: 4.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.18 Linux/5.10.0-30-amd64

File hashes

Hashes for gfs_network-0.3.0.tar.gz
Algorithm    Hash digest
SHA256       5faa564547242d38d74c0a1f389846327fe42034e5cd1278c9ab408767e5dfae
MD5          4aec789aee35c06495b75383a7be04bf
BLAKE2b-256  867c58d90af944215db336a4e7fba7053884825eb0032b2879fd5c04ccb9f2e1

File details

Details for the file gfs_network-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: gfs_network-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 6.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.18 Linux/5.10.0-30-amd64

File hashes

Hashes for gfs_network-0.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       b319149b2a9b47cedaf1e300b6ce6c7ca83b9b196d74c4a96e3fbd31d67edaf9
MD5          73e3920903435fa54143d33d5fbdeb11
BLAKE2b-256  af0c15be489cc50cb6e4e2d4cc0831b055da2a3cef035f0dea89257054e93f44
