CTGAN-ENN : Tabular GAN-based Hybrid sampling method
CTGAN-ENN
CTGAN-ENN : Tabular GAN-based Hybrid sampling method.
- A sampling method that combines CTGAN (Conditional Tabular GAN) and ENN (Edited Nearest Neighbours).
- CTGAN is a powerful GAN-based oversampling method for tabular data.
- ENN is an efficient undersampling method that removes overlapping samples.
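To make the undersampling idea concrete, here is a minimal, standard-library sketch of the ENN rule: a sample is dropped when the majority of its k nearest neighbours carry a different label. The function name `edited_nearest_neighbours`, the choice k=3, and the toy points are assumptions made for illustration; this is not the code that the ctganenn package runs internally.

```python
from collections import Counter
import math


def edited_nearest_neighbours(points, labels, k=3):
    """Return the indices of samples kept by a simplified ENN rule.

    Illustrative only: a sample is kept when the majority label among its
    k nearest neighbours (Euclidean distance) matches its own label.
    """
    kept = []
    for i, (point, label) in enumerate(zip(points, labels)):
        # Distance from sample i to every other sample, nearest first.
        others = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(points)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in others[:k]]
        majority_label, _ = Counter(neighbour_labels).most_common(1)[0]
        if majority_label == label:
            kept.append(i)
    return kept


# Toy data: sample 4 has label 1 but sits inside the cluster of label 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

kept = edited_nearest_neighbours(points, labels)
print(kept)
```

In this toy set only the overlapping sample (index 4) is removed; the two clean clusters are left untouched.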
Installation
Install CTGAN-ENN using pip:
pip install ctganenn
Usage
Variables
- minClass: the minority-class rows of the dataset (DataFrame).
- majClass: the majority-class rows of the dataset (DataFrame).
- genData: the number of samples to generate for the minority class.
- targetLabel: the name of the target label column in the dataset.
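One possible way to prepare these four variables with pandas is sketched below. The toy DataFrame, the column name `churn`, and the choice of setting genData to the gap between the two class sizes are assumptions for illustration; pick whatever value of genData suits your data.

```python
import pandas as pd

# Hypothetical toy dataset with a binary target column named "churn".
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "income": [30, 45, 80, 62, 55, 41],
    "churn": [0, 0, 0, 0, 1, 1],
})

targetLabel = "churn"

# Identify the rarer and the more frequent class value.
counts = df[targetLabel].value_counts()
minority_value = counts.idxmin()
majority_value = counts.idxmax()

# Split the dataset into minority-class and majority-class rows.
minClass = df[df[targetLabel] == minority_value]
majClass = df[df[targetLabel] == majority_value]

# Assumption: generate enough minority rows to match the majority class.
genData = len(majClass) - len(minClass)

# These values can then be passed to the function described in this README:
# CTGANENN(minClass, majClass, genData, targetLabel)
```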
Example Usage
from ctganenn import CTGANENN
Call the CTGANENN function with the 4 variables above:
X, y = CTGANENN(minClass, majClass, genData, targetLabel)
Output
The method returns X and y:
- X: all features of your dataset
- y: the target label of your dataset
Classification process
You can pass X and y on to the classification stage. For example, using a Decision Tree classifier:
from sklearn import tree
model = tree.DecisionTreeClassifier()
classification = model.fit(X, y)
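As a rough sketch of how the fitted model could then be checked, the example below trains a decision tree and scores it on held-out rows with scikit-learn. The small lists stand in for the X and y returned by CTGANENN, and `X_test` / `y_test` are hypothetical rows assumed to have been set aside from the original data before resampling; none of these values come from the package.

```python
from sklearn import tree
from sklearn.metrics import f1_score

# Stand-ins for the resampled X and y (assumed, not real CTGANENN output).
X = [[0.1, 1.0], [0.2, 0.9], [0.15, 1.1],
     [0.9, 0.2], [1.0, 0.1], [0.95, 0.15]]
y = [0, 0, 0, 1, 1, 1]

# Hypothetical held-out rows taken from the original data before resampling.
X_test = [[0.12, 1.05], [0.97, 0.12]]
y_test = [0, 1]

# Fit the classifier on the resampled data.
model = tree.DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Score on rows the model has not seen; F1 is a common choice for imbalanced data.
predictions = model.predict(X_test)
score = f1_score(y_test, predictions)
print(score)
```

Evaluating on rows that were held out before resampling keeps synthetic samples out of the test set, so the score reflects performance on real data.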
Limitation
This version of CTGAN-ENN only supports binary classification.