
Feature selection using XAI

Project description

Advanced feature selection using explainable Artificial Intelligence (XAI)

Developed by Yaganteeswarudu Akkem, Data Scientist, Ph.D. Scholar, NIT Silchar

Introduction

In the rapidly evolving field of machine learning, the complexity of models is ever-increasing, necessitating sophisticated feature selection techniques to enhance predictive performance and shed light on the decision-making processes. This study presents an innovative architecture that synergizes the global explanation capabilities of SHAP (SHapley Additive exPlanations) with the local interpretability provided by LIME (Local Interpretable Model-agnostic Explanations) to advance the feature selection process.

Our proposed methodology harnesses the strengths of both SHAP and LIME, systematically identifying features that wield consistent influence across the entire dataset as well as those vital to individual predictions. By normalizing SHAP values to derive feature weights and integrating these with LIME scores, we formulate a maximum interpretation score for each feature. This hybrid framework offers a refined and nuanced approach to feature selection, adeptly balancing the pursuit of model simplicity with the demands for high predictive accuracy and interpretability. The architecture not only promises substantial enhancements in computational efficiency and model performance but also holds significant promise for applications where model transparency and decision-making understanding are critical.
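
To make the scoring idea concrete, below is a minimal sketch (not the package's internal implementation) of how normalized global SHAP importances and local LIME scores could be merged into a maximum interpretation score per feature; the function name, array shapes, and the sum-to-one normalization are assumptions for illustration only.

    import numpy as np

    def hybrid_interpretation_scores(shap_values, lime_scores):
        """Illustrative fusion of global SHAP and local LIME importances.

        shap_values : array of shape (n_samples, n_features), raw SHAP values.
        lime_scores : array of shape (n_features,), absolute LIME weights
                      aggregated over the explained instances.
        """
        # Global weight: mean absolute SHAP value per feature, normalized to sum to 1.
        shap_weights = np.abs(shap_values).mean(axis=0)
        shap_weights = shap_weights / shap_weights.sum()

        # Local weight: LIME scores normalized to the same scale.
        lime_weights = np.abs(lime_scores) / np.abs(lime_scores).sum()

        # Maximum interpretation score: a feature is kept if it matters
        # either globally (SHAP) or locally (LIME).
        return np.maximum(shap_weights, lime_weights)

    # Example usage: rank features by the hybrid score and keep the top k.
    # top_k_idx = np.argsort(hybrid_interpretation_scores(sv, lv))[::-1][:k]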

Examples of How to Use Feature Selection

Install the package using the command below:

pip install xai-feature-selection==0.6

Import the package as shown below:

from xai_feature_selection.feature_selection import FeatureSelect

from xai_feature_selection.model_prediction import Model

Currently, xai_feature_selection is built to work on classification and regression problems.

Use the following algorithms to test regression:

  1. LinearRegression

  2. RandomForestRegressor

Use the following algorithm for classification:

  1. LogisticRegression

Below is the syntax to retrieve the best features after calculating feature importance. The following parameters are required (a concrete example of setting them appears after the list):

file_path : location of the CSV file on your system

predict_columns : the column to be predicted in classification or regression

model_type_choice :

    0 - Regression
    1 - Classification

model_choice :

    For regression:
        0 - LinearRegression
        1 - RandomForestRegressor

    For classification:
        0 - LogisticRegression
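
For example, to run RandomForestRegressor on a regression dataset, the choices could be set as below; the file path and target column are placeholders, and whether predict_columns expects a single name or a list should be confirmed against the package itself:

    file_path = "data/housing.csv"   # CSV file on your system (placeholder)
    predict_columns = "price"        # column to be predicted (placeholder)
    model_type_choice = 0            # 0 - Regression
    model_choice = 1                 # 1 - RandomForestRegressor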



Once all parameters are chosen, use the syntax below to call Model, which calculates the LIME and SHAP values; the FeatureSelect methods then return the important features.


    if predict_columns and file_path:
        # Train the chosen model on the CSV data.
        model = Model(
            model_type=model_type_choice,
            model_choice=model_choice,
            data_file_path=file_path,
            predict_columns=predict_columns,
        )
        model.train()

        # Compute LIME and SHAP explanations.
        lime_data, shap_data = model.explain()

        # Combine both explanations and select the best features.
        feature_handler = FeatureSelect(shap_data=shap_data, lime_data=lime_data)
        feature_handler.prepare_weights()
        feature_handler.calculate_feature_values()
        print(feature_handler.get_best_feature_data())



Note:

It is important to pass properly pre-processed data (without null values, outliers, and so on); the cleaner the input data, the better the features the algorithm will return.
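
As an example of such pre-processing, here is a minimal sketch assuming pandas and a numeric dataset; the file names and the IQR-based clipping rule are illustrative choices, not requirements of this package:

    import pandas as pd

    df = pd.read_csv("raw_data.csv")   # illustrative input file
    df = df.dropna()                   # remove rows with null values

    # Clip extreme values with a simple IQR rule (one possible outlier treatment).
    numeric_cols = df.select_dtypes(include="number").columns
    q1, q3 = df[numeric_cols].quantile(0.25), df[numeric_cols].quantile(0.75)
    iqr = q3 - q1
    df[numeric_cols] = df[numeric_cols].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr, axis=1)

    df.to_csv("clean_data.csv", index=False)   # pass this path as file_path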

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xai_feature_selection-0.6.tar.gz (7.0 kB)

Uploaded Source

Built Distribution

xai_feature_selection-0.6-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file xai_feature_selection-0.6.tar.gz.

File metadata

  • Download URL: xai_feature_selection-0.6.tar.gz
  • Upload date:
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.11

File hashes

Hashes for xai_feature_selection-0.6.tar.gz

  • SHA256: 4ee5df2a1cb92c7f0c381ebf7c28e84af21378b8fba8ea4c5ea1390237f102bc
  • MD5: 91364fe7bc1228c088556d795d2d9afb
  • BLAKE2b-256: 7d23100495bdf698e35bd8f1d6e011e676a9b3d2d1055fa3235cfe4387c9fe8c


File details

Details for the file xai_feature_selection-0.6-py3-none-any.whl.

File metadata

File hashes

Hashes for xai_feature_selection-0.6-py3-none-any.whl

  • SHA256: d200591f44101e359aefbcf0c56fea7db6db7156847d588da739a78e77c87744
  • MD5: 4ccdfac6c16a8ee2b497ad9af7e75495
  • BLAKE2b-256: 49433c88db32d53000068f58e821ae380ca95e00545b2ad5c4c4da802a5ce1fd

