
Transcendent adaptation for multiclass problems


Transcendent Multiclass


This repository enables users to apply Transcendent-like concept drift detection to both binary and multiclass problems.

Modifications have been made specifically to the ICE (Inductive Conformal Evaluator) implementation; the other evaluators (e.g. TCE and CCE) are out of scope. The thresholding phase is temporarily disabled due to time constraints, so the threshold must be derived manually once the calibration phase completes.
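Because the threshold has to be derived by hand, the sketch below shows one possible way to do it from calibration output. It is only an illustration under stated assumptions: the rule (a per-class quantile of the credibility of correctly classified calibration samples) and the names `per_class_thresholds`, `is_drifting` and `q` are hypothetical and are not taken from this repository or from Transcendent's own thresholding search.

```python
from collections import defaultdict


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def per_class_thresholds(y_true, y_pred, credibility, q=0.1):
    """Hypothetical rule: for each predicted class, take the q-quantile of the
    credibility of the calibration samples that were classified correctly."""
    by_class = defaultdict(list)
    for truth, pred, cred in zip(y_true, y_pred, credibility):
        if truth == pred:
            by_class[pred].append(cred)
    return {label: quantile(creds, q) for label, creds in by_class.items()}


def is_drifting(pred, cred, thresholds):
    """Flag a prediction whose credibility is below its class threshold.
    Classes without a threshold are never flagged (assumed default of 0.0)."""
    return cred < thresholds.get(pred, 0.0)
```

A lower `q` rejects fewer predictions; which value is appropriate depends on the dataset and should be checked against the calibration results.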

This project extends Transcendent by implementing a Non-Conformity Measure (NCM) based on Random Forest proximities, as introduced in the paper "Prediction with Confidence Based on a Random Forest Classifier".
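To make the idea of a proximity-based NCM concrete, here is a minimal sketch. It assumes each sample is described by the leaf index it reaches in every tree (for instance, the output of scikit-learn's `RandomForestClassifier.apply`), and it defines the proximity of two samples as the fraction of trees in which they share a leaf. The NCM shown, a ratio of top-k proximities to other-class versus same-class training samples, is one plausible form chosen for illustration; the exact formula used by the paper and by this project may differ, and the function names and the `k` and `eps` parameters are hypothetical.

```python
def proximity(leaves_a, leaves_b):
    """Fraction of trees in which two samples fall into the same leaf."""
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)


def proximity_ncm(sample_leaves, label, train_leaves, train_labels, k=3, eps=1e-9):
    """Illustrative NCM: mean of the k largest proximities to other-class
    training samples divided by the same quantity for same-class samples.
    Larger values mean the sample looks less like `label`."""
    same, other = [], []
    for leaves, train_label in zip(train_leaves, train_labels):
        bucket = same if train_label == label else other
        bucket.append(proximity(sample_leaves, leaves))

    def top_k_mean(values):
        top = sorted(values, reverse=True)[:k]
        return sum(top) / len(top) if top else 0.0

    # eps avoids division by zero when no same-class sample shares a leaf.
    return (top_k_mean(other) + eps) / (top_k_mean(same) + eps)
```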

Prerequisites

  • Make sure Docker is installed and running.

Usage:

  1. Clone the repository and change directory:

    git clone git@github.com:w-disaster/transcendent-multiclass.git && cd transcendent-multiclass
    
  2. Set up docker-compose.yaml and the directory containing the training and testing sets:

    ice.py looks for the training and testing datasets, which must be mounted inside the Docker container. By default, docker-compose.yaml maps the local directory ./splitted_dataset/ into the container. Two environment variables must also be set, PE_DATASET_TYPE and TRAIN_TEST_SPLIT_TYPE; together they select the train/test split of a specific dataset. In other words, the splitted_dataset/ directory should follow this structure:

    splitted_dataset/
    └── PE_DATASET_TYPE/
        └── TRAIN_TEST_SPLIT_TYPE/
            ├── X_train.csv
            ├── y_train.csv
            ├── X_test.csv
            └── y_test.csv
    

    This lets you configure the pipeline for different datasets and train/test splits. For example:

    splitted_dataset/
    ├── ember/
    │   ├── random_split/
    │   │   ├── X_train.csv
    │   │   ├── y_train.csv
    │   │   ├── X_test.csv
    │   │   └── y_test.csv
    │   └── time_based/
    │       ├── X_train.csv
    │       ├── y_train.csv
    │       ├── X_test.csv
    │       └── y_test.csv
    └── motif/
        └── ...
    
  3. Deploy the Concept Drift Pipeline

    A results/ directory will be created locally, containing the credibility ($p$-values) and confidence scores for both the calibration and testing sets.
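For readers unfamiliar with the two scores, the following sketch shows how credibility and confidence are commonly computed in conformal evaluation. It assumes the usual definitions: the p-value of a label is the fraction of calibration NCM scores for that label that are at least as large as the test sample's score, credibility is the p-value of the predicted label, and confidence is one minus the largest p-value among the other labels. Whether ice.py follows exactly these conventions is an assumption, and the function names are hypothetical.

```python
def p_value(test_ncm, calibration_ncms):
    """Fraction of calibration scores at least as non-conforming as the test score."""
    if not calibration_ncms:
        return 0.0
    at_least = sum(1 for score in calibration_ncms if score >= test_ncm)
    return at_least / len(calibration_ncms)


def credibility_and_confidence(test_ncm_by_label, calibration_ncms_by_label, predicted):
    """test_ncm_by_label: NCM of the test sample under each candidate label.
    calibration_ncms_by_label: calibration NCM scores grouped by true label."""
    p_values = {
        label: p_value(ncm, calibration_ncms_by_label.get(label, []))
        for label, ncm in test_ncm_by_label.items()
    }
    credibility = p_values[predicted]
    # Confidence is high when no other label is a plausible alternative.
    others = [p for label, p in p_values.items() if label != predicted]
    confidence = 1.0 - max(others, default=0.0)
    return credibility, confidence
```

A sample with low credibility does not resemble the calibration samples of its predicted class, while low confidence means another class fits almost as well.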

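The dataset layout described in step 2 can be checked before launching the container. The sketch below assumes the directory is resolved as base directory, then PE_DATASET_TYPE, then TRAIN_TEST_SPLIT_TYPE, and that the four CSV files listed in step 2 must be present; how ice.py performs this lookup internally is not shown in this README, so `resolve_split_dir` and `missing_files` are hypothetical helpers.

```python
import os
from pathlib import Path

# File names taken from the directory structure shown in step 2.
EXPECTED_FILES = ("X_train.csv", "y_train.csv", "X_test.csv", "y_test.csv")


def resolve_split_dir(base_dir="splitted_dataset", environ=None):
    """Build <base_dir>/<PE_DATASET_TYPE>/<TRAIN_TEST_SPLIT_TYPE> from the environment.
    Raises KeyError if either variable is unset."""
    env = os.environ if environ is None else environ
    return Path(base_dir) / env["PE_DATASET_TYPE"] / env["TRAIN_TEST_SPLIT_TYPE"]


def missing_files(split_dir):
    """Return the expected CSV files that are absent from split_dir."""
    return [name for name in EXPECTED_FILES if not (Path(split_dir) / name).is_file()]
```

For example, with PE_DATASET_TYPE=ember and TRAIN_TEST_SPLIT_TYPE=time_based, `resolve_split_dir()` points at splitted_dataset/ember/time_based, and an empty list from `missing_files` means the split is complete.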