
Transcendent Multiclass


This repository lets users apply Transcendent-style concept drift detection to both binary and multiclass classification problems.

Modifications have been made specifically to the ICE (Inductive Conformal Evaluator) implementation; the other conformal evaluators (e.g., TCE and CCE) are out of scope. Furthermore, the thresholding phase is temporarily disabled due to time constraints, so the threshold must be derived manually after the calibration phase completes.

This project extends Transcendent by implementing a Non-Conformity Measure (NCM) based on Random Forest proximities, as introduced in the paper "Prediction with Confidence Based on a Random Forest Classifier".
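
The proximity-based idea can be sketched as follows. This is a simplified illustration using scikit-learn's `RandomForestClassifier`, not the repository's actual implementation: the function name `proximity_ncm` and the exact score (average proximity to other classes divided by average proximity to the candidate class) are assumptions; see the cited paper for the measure actually used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def proximity_ncm(rf, X_cal, y_cal, x, label):
    """Non-conformity of sample x under a candidate label.

    RF proximity between two samples is the fraction of trees in which
    they fall into the same leaf. A sample conforms to `label` when it
    is close to calibration samples of that class and far from the rest.
    """
    leaves_cal = rf.apply(X_cal)             # (n_cal, n_trees) leaf indices
    leaf_x = rf.apply(x.reshape(1, -1))[0]   # (n_trees,) leaf indices for x
    prox = (leaves_cal == leaf_x).mean(axis=1)
    same = prox[y_cal == label].mean()       # avg proximity to `label` class
    other = prox[y_cal != label].mean()      # avg proximity to other classes
    return other / (same + 1e-12)            # epsilon guards against div-by-zero

# Toy example: 3-class problem; compare the true label against a wrong one
X, y = make_classification(n_samples=200, n_classes=3, n_informative=6,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
score_true = proximity_ncm(rf, X, y, X[0], y[0])
score_wrong = proximity_ncm(rf, X, y, X[0], (y[0] + 1) % 3)
```

A sample labelled correctly should score lower (more conforming) than the same sample under a wrong label.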

Prerequisites

  • Make sure Docker is installed and running.

Usage

  1. Clone the repository and change directory:

    git clone git@github.com:w-disaster/transcendent-multiclass.git && cd transcendent-multiclass
    
  2. Set up docker-compose.yaml and the directory containing the training and testing sets:

    ice.py looks for the training and testing datasets, which must be mounted inside the Docker container. By default, docker-compose.yaml maps the local directory ./splitted_dataset/ into the container. Two environment variables must also be set: PE_DATASET_TYPE and TRAIN_TEST_SPLIT_TYPE, which identify the train/test split to use for a given dataset. In other words, the splitted_dataset/ directory should follow this structure:

    splitted_dataset/
    └── PE_DATASET_TYPE/
        └── TRAIN_TEST_SPLIT_TYPE/
            ├── X_train.csv
            ├── y_train.csv
            ├── X_test.csv
            └── y_test.csv
    

    This layout lets you configure the pipeline for different datasets and train/test splits. For example:

    splitted_dataset/
    ├── ember/
    │   ├── random_split/
    │   │   ├── X_train.csv
    │   │   ├── y_train.csv
    │   │   ├── X_test.csv
    │   │   └── y_test.csv
    │   └── time_based/
    │       ├── X_train.csv
    │       ├── y_train.csv
    │       ├── X_test.csv
    │       └── y_test.csv
    └── motif/
        └── ...
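
A minimal docker-compose.yaml wiring up this mapping might look like the sketch below. The service name and container-side paths are assumptions for illustration; adapt them to the compose file actually shipped with the repository.

```yaml
services:
  ice:
    build: .
    environment:
      PE_DATASET_TYPE: ember               # subdirectory of splitted_dataset/
      TRAIN_TEST_SPLIT_TYPE: random_split  # train/test split to use
    volumes:
      - ./splitted_dataset/:/app/splitted_dataset/  # host path -> container path (assumed)
      - ./results/:/app/results/
```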
    
  3. Deploy the Concept Drift Pipeline
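
    Assuming the provided docker-compose.yaml sits at the repository root and Docker Compose v2 is available, deployment typically amounts to:

    ```shell
    docker compose up --build
    ```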

    A results/ directory will be created locally, containing the credibility ($p$-values) and confidence scores for both the calibration and testing sets.
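
Credibility and confidence are standard conformal-prediction quantities, so they can be sketched independently of this repository's exact output format (which is not documented here and remains an assumption): the p-value of a candidate label is the fraction of calibration non-conformity scores at least as large as the test score, credibility is the largest per-class p-value, and confidence is one minus the second largest.

```python
import numpy as np

def p_value(cal_scores, test_score):
    """Conformal p-value: share of calibration scores at least as
    non-conforming as the test score (counting the test point itself)."""
    cal_scores = np.asarray(cal_scores)
    return (np.sum(cal_scores >= test_score) + 1) / (len(cal_scores) + 1)

def credibility_confidence(p_values_per_class):
    """Credibility = largest p-value; confidence = 1 - second largest."""
    ps = np.sort(np.asarray(p_values_per_class))[::-1]  # descending order
    return ps[0], 1.0 - ps[1]

cred, conf = credibility_confidence([0.9, 0.2, 0.05])
# cred = 0.9, conf = 0.8
```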

