
DivExplorer

Usage

Example in notebooks

>>> from divexplorer.FP_DivergenceExplorer import FP_DivergenceExplorer
>>> from divexplorer.FP_Divergence import FP_Divergence
>>>
>>> min_sup=0.1
>>> # Input: a discretized dataframe, the true class column name (str) and, optionally, the predicted class column name (str)
>>> # Extract frequent patterns (FP) and compute their divergence (default metrics of interest: False Positive Rate (FPR), False Negative Rate (FNR), Accuracy)
>>> fp_diver=FP_DivergenceExplorer(X_discretized, "class", "predicted", class_map=class_map, dataset_name=dataset_name)
>>> # Minimum support: frequency threshold for frequent pattern extraction and divergence estimation
>>> FP_fm=fp_diver.getFrequentPatternDivergence(min_support=min_sup)
>>>
>>> # If only the true class is given, the metrics of interest are the positive rate and the negative rate (d_posr and d_negr)
>>> min_sup=0.1
>>> fp_diver_1cl=FP_DivergenceExplorer(X_discretized.drop(columns="predicted"),"class", class_map=class_map, dataset_name=dataset_name)
>>> FP_fm_1cl=fp_diver_1cl.getFrequentPatternDivergence(min_support=min_sup, metrics=["d_posr", "d_negr"])

The output is a pandas dataframe. Each row is a frequent pattern with its classification info (e.g. TP/FP/FN/TN counts, FPR, FNR, etc.) and its divergence.
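
To make the notion of divergence concrete, here is a minimal sketch in plain pandas that does not use the divexplorer API. It assumes that the divergence of a pattern is the metric computed on the rows matching the pattern minus the same metric computed on the whole dataset. The toy data and the helper names (false_positive_rate, fpr_divergence) are invented for illustration and are not part of the library.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "sex": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "class": [0, 0, 0, 1, 0, 0, 0, 1],
    "predicted": [1, 1, 0, 1, 0, 0, 1, 1],
})

def false_positive_rate(frame):
    # FPR = FP / (FP + TN): share of true negatives predicted as positive.
    negatives = frame[frame["class"] == 0]
    return (negatives["predicted"] == 1).mean()

def fpr_divergence(frame, pattern):
    # Hypothetical helper: select the rows that match every attribute=value
    # pair of the pattern, then compare the subgroup FPR with the overall FPR.
    mask = pd.Series(True, index=frame.index)
    for column, value in pattern.items():
        mask &= frame[column] == value
    subgroup = frame[mask]
    support = len(subgroup) / len(frame)
    divergence = false_positive_rate(subgroup) - false_positive_rate(frame)
    return support, divergence

support, d_fpr = fpr_divergence(df, {"sex": "F"})
print(f"support={support:.2f}, FPR divergence={d_fpr:.3f}")
```

A positive value means the classifier produces false positives more often on the subgroup than on the dataset as a whole. A pattern whose support falls below the minimum support threshold would not be reported at all.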

We can then analyze the divergence of the frequent patterns with respect to a metric of interest (e.g. FPR).

>>> fp_divergence_fpr=FP_Divergence(FP_fm, "d_fpr")
>>> #Frequent patterns sorted by divergence:
>>> fp_divergence_fpr.getDivergence(th_redundancy=0)
>>> #TopK divergent patterns:
>>> #TopK FP and their divergence (dict)
>>> fp_divergence_fpr.getDivergenceTopK(K=5, th_redundancy=0)
>>> #DF format with all info
>>> fp_divergence_fpr.getDivergenceTopKDf(K=10, th_redundancy=0)
>>> #Compute Shapley values
>>> #Let itemset be a frequent pattern of interest
>>> itemset_shap=fp_divergence_fpr.computeShapleyValue(itemset)
>>> #Plot shapley values
>>> fp_divergence_fpr.plotShapleyValue(shapley_values=itemset_shap)
>>> #Alternatively, pass the itemset directly
>>> fp_divergence_fpr.plotShapleyValue(itemset=itemset)
>>>
>>> #Plot the lattice graph
>>> #Th_divergence: if specified, itemsets of the lattice with divergence greater than the specified value are highlighted in magenta/squares
>>> #getLower: if True, corrective patterns are highlighted in light blue/diamonds
>>> fig=fp_divergence_fpr.plotLatticeItemset(itemset, Th_divergence=0.15, sizeDot="small", getLower=True)
>>> #Corrective items
>>> fp_divergence_fpr.getCorrectiveItems()
>>>
>>> #Compute global shapley value
>>> u_h_fpr=fp_divergence_fpr.computeGlobalShapleyValue()
>>> fp_divergence_fpr.plotShapleyValue(shapley_values=u_h_fpr)
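
As an illustration of what the per-item Shapley values express, the sketch below applies the standard Shapley formula to the items of one pattern, treating the divergence of each sub-pattern as the value function. It is a plain Python sketch under that assumption and may differ from what computeShapleyValue does internally. The item names and divergence numbers are invented, and shapley_values is a hypothetical helper.

```python
from itertools import combinations
from math import factorial

def shapley_values(itemset, divergence):
    # divergence maps each subset of the itemset (as a frozenset) to its
    # divergence; the empty pattern is the whole dataset, so its value is 0.
    items = list(itemset)
    n = len(items)
    values = {}
    for item in items:
        others = [i for i in items if i != item]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = frozenset(subset)
                # Marginal contribution of the item when added to the subset.
                total += weight * (divergence[without | {item}] - divergence[without])
        values[item] = total
    return values

# Invented divergence values for a two-item pattern and all of its subsets.
divergence = {
    frozenset(): 0.0,
    frozenset({"sex=F"}): 0.10,
    frozenset({"age<25"}): 0.05,
    frozenset({"sex=F", "age<25"}): 0.30,
}
shap = shapley_values({"sex=F", "age<25"}, divergence)
print(shap)
```

The values sum to the divergence of the full pattern, so each one can be read as the share of the divergence attributed to that item. The computation needs the divergence of every subset, which is available here because every subset of a frequent pattern is itself frequent.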

Installation

Requirements

Compatibility

Licence

Authors

divexplorer was written by Eliana Pastor.
