A Python Package for Noise-Tolerant Classification of Binary Data using Prior Knowledge Integration and Max-Cut Solutions

NoiseCut

noisecut is an easy-to-use Python package that implements tree-structured functional networks (FNs) as a model class for classifying binary data with prior knowledge about the input features. FNs can be viewed as modular neural networks in which the links between the modules, and the flow of information from the input variables to the output variable, are predetermined. Each module of the FN is treated as a black box. Identifying an FN, i.e., learning its input-output function, is then decomposed into identifying the individual interior black-box modules.

noisecut can be used for any tree-structured FN that meets the criteria below. It should have

  1. two hidden layers,
  2. an arbitrary number of black-box modules in the first hidden layer,
  3. only one black-box module in the second hidden layer,
  4. each input feature connected to exactly one black-box module (tree structure).
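
The criteria above can be illustrated with a small stand-alone sketch. It does not use the noisecut API. The helper names `table_from` and `evaluate_fn` are hypothetical, and the Boolean functions chosen for the modules (AND, OR, XOR) are arbitrary examples. Each black-box module is modelled as a lookup table over its binary inputs, and the list `n_input_each_box` states how many consecutive features feed each first-layer module.

```python
from itertools import product


def table_from(func, n_inputs):
    """Tabulate a Boolean function of n_inputs bits as a lookup table."""
    return {bits: int(func(*bits)) for bits in product((0, 1), repeat=n_inputs)}


def evaluate_fn(n_input_each_box, first_layer, output_box, x):
    """Evaluate a two-hidden-layer, tree-structured FN on one binary sample x."""
    if len(first_layer) != len(n_input_each_box):
        raise ValueError("need one first-layer module per entry of n_input_each_box")
    if len(x) != sum(n_input_each_box):
        raise ValueError("x must have exactly sum(n_input_each_box) features")
    hidden = []
    start = 0
    for n, box in zip(n_input_each_box, first_layer):
        # Each feature is consumed by exactly one module (tree structure).
        hidden.append(box[tuple(x[start:start + n])])
        start += n
    # The single second-layer module combines the first-layer outputs.
    return output_box[tuple(hidden)]


# Hypothetical FN: two first-layer modules with two inputs each.
n_input_each_box = [2, 2]
first_layer = [
    table_from(lambda a, b: a and b, 2),  # module 1: AND (illustrative choice)
    table_from(lambda a, b: a or b, 2),   # module 2: OR (illustrative choice)
]
output_box = table_from(lambda u, v: u != v, 2)  # second layer: XOR

y = evaluate_fn(n_input_each_box, first_layer, output_box, (1, 1, 0, 0))
print(y)
```

In this picture, learning an FN means filling in the lookup tables from data while the wiring given by `n_input_each_box` stays fixed.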

Installation

Dependencies

  • Python (>=3.9)
  • numpy
  • pandas
  • scipy
  • cplex
  • docplex

User installation

Before you can use the NoiseCut package, install noisecut using pip:

$ pip install noisecut
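
After installation, a quick check that the interpreter and the dependencies listed above are available can save a failed import later. The sketch below uses only the standard library. It assumes that each dependency is imported under the same name as its package, and the helper names `missing_modules` and `check_environment` are hypothetical.

```python
import sys
from importlib.util import find_spec

# Assumption: import names match the package names in the dependency list.
REQUIRED_MODULES = ["numpy", "pandas", "scipy", "cplex", "docplex"]
MIN_PYTHON = (3, 9)


def missing_modules(names):
    """Return the subset of names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


def check_environment():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python >= 3.9 is required")
    for name in missing_modules(REQUIRED_MODULES):
        problems.append(f"missing module: {name}")
    return problems


for problem in check_environment():
    print(problem)
```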

Simple demo

The code snippet below summarizes a complete workflow: generating synthetic data, adding label noise, splitting the data into training and test sets, fitting the model, and evaluating the results.

from noisecut.model.noisecut_coder import Metric
from noisecut.model.noisecut_model import NoiseCut
from noisecut.tree_structured.data_manipulator import DataManipulator
from noisecut.tree_structured.sample_generator import SampleGenerator

# Synthetic data generation
# [4, 4, 4] sets the number of inputs to each black box of the FN model
gen_dataset = SampleGenerator([4, 4, 4], allowance_rand=True)
X, y = gen_dataset.get_complete_data_set()

# Add noise in data labeling. Train and test set split.
x_noisy, y_noisy = DataManipulator().get_noisy_data(X, y, percentage_noise=10)
x_train, y_train, x_test, y_test = DataManipulator().split_data(
    x_noisy, y_noisy, percentage_training_data=50
)

# Training
# 'n_input_each_box' must match the structure of the generated data
mdl = NoiseCut(n_input_each_box=[4, 4, 4])
mdl.fit(x_train, y_train)

# Evaluation
y_pred = mdl.predict(x_test)
accuracy, recall, precision, F1 = Metric.set_confusion_matrix(y_test, y_pred)
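
To make the noise and evaluation steps of the demo above concrete, here is a stand-alone sketch in plain Python. It is not the package's implementation. It assumes that label noise means flipping a given percentage of binary labels at random, and that the four metrics follow the standard confusion-matrix definitions. The function names `flip_labels` and `binary_metrics` are hypothetical.

```python
import random


def flip_labels(y, percentage_noise, seed=0):
    """Return a copy of y with the given percentage of binary labels flipped."""
    rng = random.Random(seed)
    n_flip = round(len(y) * percentage_noise / 100)
    noisy = list(y)
    for i in rng.sample(range(len(y)), n_flip):
        noisy[i] = 1 - noisy[i]
    return noisy


def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall, precision, F1) for binary labels in {0, 1}."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (a convention, not a rule).
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, recall, precision, f1


y_clean = [0, 1] * 10
y_flipped = flip_labels(y_clean, percentage_noise=10)
print(binary_metrics(y_clean, y_flipped))
```

Comparing clean labels with their flipped copy, as in the last lines, shows how much label noise alone costs in accuracy before any model is involved.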

Usage

Use cases for the main functions of the noisecut package are provided as Jupyter notebooks. The examples show how to fit the model and inspect the predictions as a score, a probability, or a simple binary output.

License

noisecut was created by Hedieh Mirzaieazar and Moein E. Samadi. It is licensed under the terms of the GPLv3 license.

Download files

Source Distribution

noisecut-0.2.1.tar.gz (38.0 kB)

Uploaded Source

Built Distribution

noisecut-0.2.1-py3-none-any.whl (44.1 kB)

Uploaded Python 3

File details

Details for the file noisecut-0.2.1.tar.gz.

File metadata

  • Download URL: noisecut-0.2.1.tar.gz
  • Upload date:
  • Size: 38.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for noisecut-0.2.1.tar.gz
Algorithm Hash digest
SHA256 a06402c3a6c082d7bf06766155ea4fafdb65f248af5882084eb584d6f72ba8be
MD5 1d25ffbb0436b87f7b584abc5a3a7dcf
BLAKE2b-256 dc1411454c564f60f1a1f385417c4b171811bb8b2b39ebdceb81979344055b66

File details

Details for the file noisecut-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: noisecut-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 44.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.13

File hashes

Hashes for noisecut-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 a6f6cb66b61035e4e8b3c07b959facd8f39e203308cf1fcc778be7c9f2e26b49
MD5 a0c7af131358de56c01496b9d0ac6fd2
BLAKE2b-256 e1d9f013b15af3ae27c169e099f2eee0486d93bbe25ec95fd3981d8d2c3cc6ea
