A metacognitive error detection and correction framework


What is this?

PyEDCR is a Python implementation of the f-EDR (Focused Error Detection Rules) paradigm. The goal of EDR is to use a set of conditions to learn when a machine learning model makes an incorrect prediction.
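To make the idea concrete, here is a toy sketch of an error detection rule. It is not PyEDCR's API: the sample fields, the single condition (a hypothetical secondary model disagreeing with the main model), and the helper name are all assumptions made for illustration. The rule flags a prediction as a likely error when at least one condition holds, and is then scored against ground-truth error labels.

```python
from typing import Callable, Dict, List

Sample = Dict[str, object]
Condition = Callable[[Sample], bool]


def rule_flags_error(sample: Sample, conditions: List[Condition]) -> bool:
    """Flag the main model's prediction as an error if any condition fires."""
    return any(condition(sample) for condition in conditions)


# Hypothetical samples: all were predicted as "tank" by the main model.
samples: List[Sample] = [
    {"main_pred": "tank", "secondary_pred": "tank", "is_error": False},
    {"main_pred": "tank", "secondary_pred": "truck", "is_error": True},
    {"main_pred": "tank", "secondary_pred": "tank", "is_error": True},
    {"main_pred": "tank", "secondary_pred": "jeep", "is_error": False},
]

# Hypothetical condition: the secondary model disagrees with the main model.
conditions: List[Condition] = [lambda s: s["secondary_pred"] != s["main_pred"]]

flags = [rule_flags_error(s, conditions) for s in samples]

# Score the rule on the "error" class.
true_positives = sum(1 for f, s in zip(flags, samples) if f and s["is_error"])
precision = true_positives / sum(flags)
recall = true_positives / sum(1 for s in samples if s["is_error"])

print(flags, precision, recall)
```

In this toy data the rule catches one of the two real errors and raises one false alarm, which is the kind of trade-off a rule learner has to balance when choosing conditions.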

The EDCR method was first introduced in 'Rule-Based Error Detection and Correction to Operationalize Movement Trajectory Classification' (ArXiv preprint) and later extended to the f-EDR method in the conference article 'Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge' presented at CIKM 2024 (ACM Publication, ArXiv Preprint).

The package has been tested with Python >= 3.9.

If you use this work, please cite our paper:

@inproceedings{10.1145/3627673.3679918,
author = {Kricheli, Joshua Shay and Vo, Khoa and Datta, Aniruddha and Ozgur, Spencer and Shakarian, Paulo},
title = {Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge},
year = {2024},
isbn = {9798400704369},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3627673.3679918},
doi = {10.1145/3627673.3679918},
abstract = {Recent advances in Hierarchical Multi-label Classification (HMC), particularly neurosymbolic-based approaches, have demonstrated improved consistency and accuracy by enforcing constraints on a neural model during training. However, such work assumes the existence of such constraints a-priori. In this paper, we relax this strong assumption and present an approach based on Error Detection Rules (EDR) that allow for learning explainable rules about the failure modes of machine learning models. We show that these rules are not only effective in detecting when a machine learning classifier has made an error but also can be leveraged as constraints for HMC, thereby allowing the recovery of explainable constraints even if they are not provided. We show that our approach is effective in detecting machine learning errors and recovering constraints, is noise tolerant, and can function as a source of knowledge for neurosymbolic models on multiple datasets, including a newly introduced military vehicle recognition dataset.},
booktitle = {Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
pages = {3842–3846},
numpages = {5},
keywords = {hierarchical multi-label classification, learning with constraints, metacognitive ai, neurosymbolic ai, rule learning},
location = {Boise, ID, USA},
series = {CIKM '24}
}

Example

To demonstrate the use of the package, we consider a dataset with two levels of hierarchy, such that each image has both a fine-grain and a coarse-grain label. Consider, for instance, the following example from our curated Military Vehicle dataset (which can be found here):

[Figure: ImageNet100 example image]

We further consider a pretrained 'main' model whose ability to classify both levels of the hierarchy we want to analyze. For example, the model may use the small version of Meta's DINO_V2 architecture, fine-tuned on ImageNet50, a 50-class subset of the ImageNet1K dataset (which can be found here). An instance of such a model (which can be found here) has the following performance:

Fine-grain prior combined accuracy: 76.57%, fine-grain prior combined macro f1: 76.1%
Fine-grain prior combined macro precision: 76.96%, fine-grain prior combined macro recall: 76.57%

Coarse-grain prior combined accuracy: 87.14%, coarse-grain prior combined macro f1: 85.77%
Coarse-grain prior combined macro precision: 87.36%, coarse-grain prior combined macro recall: 84.64%

Total prior inconsistencies 133/2100 (6.33%)
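As a rough illustration of how such an inconsistency count can be obtained, the sketch below assumes that a prediction is inconsistent when the predicted coarse-grain label is not the parent of the predicted fine-grain label. The class names, the `fine_to_coarse` mapping, and the function name are hypothetical and are not taken from PyEDCR or from the datasets above.

```python
from typing import Dict, List


def count_inconsistencies(
    fine_preds: List[str],
    coarse_preds: List[str],
    fine_to_coarse: Dict[str, str],
) -> int:
    """Count samples whose predicted coarse label is not the parent of the predicted fine label."""
    return sum(
        fine_to_coarse[fine] != coarse
        for fine, coarse in zip(fine_preds, coarse_preds)
    )


# Hypothetical two-level hierarchy and predictions.
fine_to_coarse = {"sedan": "car", "pickup": "truck", "kayak": "boat"}
fine_preds = ["sedan", "pickup", "sedan", "kayak"]
coarse_preds = ["car", "car", "boat", "boat"]

inconsistent = count_inconsistencies(fine_preds, coarse_preds, fine_to_coarse)
toy_rate = 100 * inconsistent / len(fine_preds)
print(f"Toy inconsistencies {inconsistent}/{len(fine_preds)} ({toy_rate:.2f}%)")

# The percentage reported above follows from the same ratio: 133 out of 2100.
reported_rate = round(100 * 133 / 2100, 2)
print(f"Reported rate: {reported_rate}%")
```

The second print only reproduces the arithmetic behind the reported 6.33%; it does not recompute it from model outputs.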

We also consider a 'secondary' model (which can be found here), which uses the large version of the DINO_V2 architecture and was also fine-tuned on the ImageNet50 dataset, along with binary models trained on each class of the dataset. The following code snippet runs the run_experiment function from PyEDCR.py:

from PyEDCR.classes import experiment_config
from PyEDCR.PyEDCR import run_experiment

imagenet_config = experiment_config.ExperimentConfig(
    data_str='imagenet',
    main_model_name='dinov2_vits14',
    secondary_model_name='dinov2_vitl14',
    main_lr=1e-6,
    secondary_lr=1e-6,
    binary_lr=1e-6,
    original_num_epochs=8,
    secondary_num_epochs=2,
    binary_num_epochs=5
)

run_experiment(config=imagenet_config)

The code initiates the rule learning pipeline, uses the learned rules to mark errors in the main model's predictions, and prints the performance metrics on the error class after running the f-EDR algorithm. In this case, the output is:

error_accuracy: 89.0
error_balanced_accuracy: 84.23
error_precision: 81.65
error_recall: 74.31
error_f1: 77.81
recovered_constraints_precision: 100.0
recovered_constraints_recall: 59.36
recovered_constraints_f1_score: 74.5
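As a quick sanity check on how these numbers relate, the sketch below assumes the reported F1 values are the usual harmonic mean of precision and recall. The helper `f1_score` is defined here for illustration and is not a PyEDCR function.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the output above.
error_f1 = f1_score(81.65, 74.31)
constraints_f1 = f1_score(100.0, 59.36)

print(f"error_f1: {error_f1:.2f}")
print(f"recovered_constraints_f1_score: {constraints_f1:.2f}")
```

Under that assumption both computed values agree with the reported ones after rounding. The perfect constraint precision combined with a recall of 59.36 indicates that every recovered constraint was correct, while a sizable share of the true constraints was not recovered.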

For further details about the rule learning algorithm and the noise tolerance experiments, please refer to the paper.

Acknowledgments

This research was funded by ARO grant W911NF-24-1-0007.

PyEDCR 1.1.0
