
Explainable Naive Bayes (XNB) classifier: uses Kernel Density Estimation (KDE) for class-specific feature selection and Naive Bayes for prediction.

Project description

Explainable Class–Specific Naive–Bayes Classifier


Description

The Explainable Naive Bayes (XNB) classifier includes two important features:

  1. Class-conditional probabilities are calculated by means of Kernel Density Estimation (KDE).

  2. The probability for each class does not use all variables, but only those that are relevant for each specific class.

In terms of classification performance, the XNB classifier is comparable to the standard NB classifier. However, the XNB classifier additionally provides the subset of relevant variables for each class, which contributes considerably to explaining how the predictive model reaches its predictions. In addition, the subsets of variables generated for each class are usually different from one another and remarkably small.
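To illustrate the idea, the following sketch fits one KDE per class and feature with scikit-learn and scores each class using only its relevant features, Naive-Bayes style. This is a conceptual illustration, not the xnb internals; the RELEVANT dictionary is an assumption mirroring the example output further down.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KernelDensity

iris = load_iris()
X, y = iris.data, iris.target

# One KDE per (class, feature) pair, fitted on that class's samples.
kdes = {
  c: [
    KernelDensity(kernel='gaussian', bandwidth=0.2).fit(X[y == c][:, [j]])
    for j in range(X.shape[1])
  ]
  for c in np.unique(y)
}

# Hypothetical per-class relevant features (column indices into iris data):
# petal length (2) and petal width (3), as in the example output below.
RELEVANT = {0: [2], 1: [2, 3], 2: [2, 3]}

def predict_one(x, relevant=RELEVANT):
  """Naive-Bayes-style score using only each class's relevant features."""
  scores = {}
  for c, per_feature in kdes.items():
    log_prior = np.log(np.mean(y == c))
    log_likelihood = sum(
      per_feature[j].score_samples(np.array([[x[j]]]))[0]
      for j in relevant[c]
    )
    scores[c] = log_prior + log_likelihood
  return max(scores, key=scores.get)

print(iris.target_names[predict_one(X[0])])
```

Because each class scores only its own relevant features, the classes need not share a common variable subset, which is what makes the per-class explanations possible.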

Installation

For example, if you are using pip, you can install the package with:

pip install xnb

Example of use:

from xnb import XNB
from xnb.enums import BWFunctionName, Kernel, Algorithm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris
import pandas as pd

''' 1. Read the dataset.
It is important that the dataset is a pandas DataFrame object with named columns.
This way, we can obtain the dictionary of important variables for each class.'''
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
X = df.drop('target', axis=1)
y = df['target'].replace(
  to_replace=[0, 1, 2], value=['setosa', 'versicolor', 'virginica']
)
X_train, X_test, y_train, y_test = train_test_split(
  X, y, test_size=0.20, random_state=0
)

''' 2. By calling the fit() function,
we prepare the object so it can make predictions later. '''
xnb = XNB(
  show_progress_bar=True  # optional
)
xnb.fit(
  X_train,
  y_train,
  bw_function=BWFunctionName.HSILVERMAN,  # optional
  kernel=Kernel.GAUSSIAN,  # optional
  algorithm=Algorithm.AUTO,  # optional
  n_sample=50  # optional
)

''' 3. When the fit() function finishes,
we can now access the feature selection dictionary it has calculated. '''
feature_selection = xnb.feature_selection_dict

''' 4. We predict the values of "y_test", implicitly using the calculated feature-selection dictionary. '''
y_pred = xnb.predict(X_test)

# Output
print('Relevant features for each class:\n')
for target, features in feature_selection.items():
  print(f'{target}: {features}')
print(f'\n-------------\nAccuracy: {accuracy_score(y_test, y_pred)}')

The output is:

Relevant features for each class:

setosa: {'petal length (cm)'}
virginica: {'petal length (cm)', 'petal width (cm)'}
versicolor: {'petal length (cm)', 'petal width (cm)'}

-------------
Accuracy: 1.0
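To put the accuracy above in context, a plain Gaussian Naive Bayes from scikit-learn can be run on the same train/test split. This baseline is not part of the xnb package; it is only a sanity check of the "comparable to NB" claim.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

iris = load_iris()
# Same split parameters as the XNB example above.
X_train, X_test, y_train, y_test = train_test_split(
  iris.data, iris.target, test_size=0.20, random_state=0
)
nb = GaussianNB().fit(X_train, y_train)
print(f'GaussianNB accuracy: {accuracy_score(y_test, nb.predict(X_test))}')
```

Unlike XNB, GaussianNB uses all four features for every class and offers no per-class feature subsets.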



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xnb-0.2.3.tar.gz (11.6 kB)

Uploaded Source

Built Distribution

xnb-0.2.3-py3-none-any.whl (11.3 kB)

Uploaded Python 3

File details

Details for the file xnb-0.2.3.tar.gz.

File metadata

  • Download URL: xnb-0.2.3.tar.gz
  • Upload date:
  • Size: 11.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for xnb-0.2.3.tar.gz
Algorithm Hash digest
SHA256 0dd9c792887ba907bb46175ff35263cee27facba41b548dc545df5d09d5b89a7
MD5 d455c4ea5e09bc1ed670f7f23497ba05
BLAKE2b-256 79dc5eec587a638f06cb54d97b6ed21731f5ed37bc8d585fc16000c2144136c3

See more details on using hashes here.
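The digests above can be checked locally after downloading. A minimal sketch using only the standard library (the path to the downloaded archive is an assumption; adjust it to wherever the file was saved):

```python
import hashlib

def sha256_of(path: str) -> str:
  """Return the hex SHA256 digest of a file, read in chunks."""
  h = hashlib.sha256()
  with open(path, 'rb') as f:
    for chunk in iter(lambda: f.read(8192), b''):
      h.update(chunk)
  return h.hexdigest()

expected = '0dd9c792887ba907bb46175ff35263cee27facba41b548dc545df5d09d5b89a7'
# Uncomment after downloading the archive to the current directory:
# print(sha256_of('xnb-0.2.3.tar.gz') == expected)
```

A mismatch indicates a corrupted or tampered download and the file should be discarded.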

File details

Details for the file xnb-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: xnb-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 11.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for xnb-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 76a155e6793accaa8a3990a4cfac229086da4c76412a5cf4e52098dddd4a2fee
MD5 4b103948f513bceb7211e4d0aa8326b4
BLAKE2b-256 aa736f3cbf743f9b102a53c3ed4a2195e5f4f767a263347f373f91be45d03906

See more details on using hashes here.
