CTGAN-ENN: Tabular GAN-based Hybrid Sampling Method

CTGAN-ENN

  • A hybrid sampling method that combines CTGAN (Conditional Tabular GAN) and ENN (Edited Nearest Neighbours)
  • CTGAN is a powerful GAN-based oversampling method for tabular data
  • ENN is an efficient undersampling method that removes overlapping data
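
To illustrate the undersampling idea behind ENN, here is a minimal pure-Python sketch of the classic editing rule: a sample is dropped when its label disagrees with the majority label of its k nearest neighbours. The function name `edited_nearest_neighbours`, the toy data, and the choice to apply the rule to every class are assumptions made for illustration. This is not the code used inside CTGAN-ENN, and library implementations may restrict which classes are edited or use a stricter agreement rule.

```python
from collections import Counter
import math


def edited_nearest_neighbours(points, labels, k=3):
    """Return the indices of samples kept by one basic ENN pass.

    A sample is kept only if its label matches the majority label
    among its k nearest neighbours (Euclidean distance).
    """
    kept = []
    for i, (p, label) in enumerate(zip(points, labels)):
        # Distances from sample i to every other sample, closest first.
        neighbours = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbours)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label == label:
            kept.append(i)
    return kept


# Toy data: two clusters, plus one class-1 point (index 4)
# sitting inside the class-0 cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

kept = edited_nearest_neighbours(points, labels, k=3)
print(kept)
```

In this toy set the overlapping point at index 4 is the only one removed, because all of its nearest neighbours belong to the other class.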

Installation

Install CTGAN-ENN using pip:

pip install ctganenn

Usage

Variables

  • minClass: the minority-class data of the dataset (a DataFrame).
  • majClass: the majority-class data of the dataset (a DataFrame).
  • genData: the number of samples to generate from the minority class.
  • targetLabel: the name of the target label column in the dataset.
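
As a rough sketch of how these four variables could be prepared with pandas, the example below splits a toy DataFrame by class frequency. The dataset, the column name "label", and the rule used for genData (the gap between the two class sizes) are assumptions for illustration only. The source does not say whether the label column should stay inside minClass and majClass, so it is simply left in place here.

```python
import pandas as pd

# Hypothetical toy dataset; "label" is an assumed column name.
df = pd.DataFrame({
    "age":    [25, 32, 47, 51, 38, 29, 44, 36],
    "income": [30, 45, 80, 62, 52, 41, 75, 58],
    "label":  [0, 0, 0, 0, 0, 0, 1, 1],
})

targetLabel = "label"

# Identify the minority and majority class values by frequency.
counts = df[targetLabel].value_counts()
minority_value = counts.idxmin()
majority_value = counts.idxmax()

# Split the dataset into one DataFrame per class.
minClass = df[df[targetLabel] == minority_value]
majClass = df[df[targetLabel] == majority_value]

# One possible choice: generate enough rows to close the gap
# between the two classes.
genData = len(majClass) - len(minClass)

print(len(minClass), len(majClass), genData)
```

These values can then be passed to CTGANENN in the order listed above.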

Example Usage

from ctganenn import CTGANENN

Call the CTGANENN function with the four variables:

X, y = CTGANENN(minClass, majClass, genData, targetLabel)

Output

The method returns X and y:

  • X: all features of your dataset
  • y: the target label of your dataset
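
Before moving on, it can be useful to check how balanced the resampled labels are. The snippet below is a small sketch using only the standard library. The list `y` is a placeholder standing in for the labels returned by the method, not real output.

```python
from collections import Counter

# Placeholder standing in for the y returned by CTGANENN.
y = [0, 0, 0, 1, 1, 0, 1, 1, 0, 1]

# Count the samples per class and report each class's share.
class_counts = Counter(y)
total = sum(class_counts.values())
for cls, n in sorted(class_counts.items()):
    print(f"class {cls}: {n} samples ({n / total:.0%})")
```

Counter accepts any iterable of labels, so the same check should work whether y is a list, an array, or a pandas Series.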

Classification process

You can pass X and y to the next stage, classification. For example, using a decision tree classifier:

from sklearn import tree

model = tree.DecisionTreeClassifier()
classification = model.fit(X, y)

Limitation

This version of CTGAN-ENN only supports binary classification.
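
Given this limitation, a simple guard can confirm that the target column holds exactly two classes before resampling. The helper below is a hypothetical sketch (`check_binary_labels` is not part of the package) and relies only on the standard library.

```python
def check_binary_labels(labels):
    """Return the two class values, or raise ValueError if the
    labels do not contain exactly two distinct classes."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError(
            f"expected 2 classes, found {len(classes)}: {classes}"
        )
    return classes


# Binary labels pass the check.
print(check_binary_labels([0, 1, 1, 0, 0]))

# Multi-class labels are rejected.
try:
    check_binary_labels([0, 1, 2])
except ValueError as err:
    print("rejected:", err)
```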
