Execution-Feature-Driven Debugging

Abstract

Fault localization is a fundamental aspect of debugging, aiming to identify code regions likely responsible for failures. Traditional techniques primarily correlate statement execution with failures, yet program behavior is influenced by diverse execution features—such as variable values, branch conditions, and definition-use pairs—that can provide richer diagnostic insights.

In an empirical study of 310 bugs across 20 projects, we analyzed 17 execution features and assessed their correlation with failure outcomes. Our findings suggest that fault localization benefits from a broader range of execution features: (1) Scalar pairs exhibit the strongest correlation with failures; (2) Beyond line executions, def-use pairs and functions executed are key indicators for fault localization; and (3) Combining multiple features enhances effectiveness compared to relying solely on individual features.
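As a minimal illustration of such a correlation, the association between a binary execution feature (observed or not observed in a run) and the failure outcome can be measured with the phi coefficient. This is an illustrative metric choice for a sketch, not necessarily the measure used in the study:

```python
import math

# Illustrative sketch: correlate one binary execution feature with
# failure outcomes over a set of runs via the phi coefficient.
def phi(feature_present, failing):
    n11 = sum(f and x for f, x in zip(failing, feature_present))
    n10 = sum(f and not x for f, x in zip(failing, feature_present))
    n01 = sum((not f) and x for f, x in zip(failing, feature_present))
    n00 = sum((not f) and not x for f, x in zip(failing, feature_present))
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

present = [1, 1, 0, 0]  # feature observed in runs 1 and 2
failing = [1, 1, 0, 0]  # runs 1 and 2 failed
print(phi(present, failing))  # 1.0 (feature perfectly tracks failure)
```

A value near 1 means the feature occurs almost exactly in the failing runs; a value near 0 means it carries no signal about the outcome.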

Building on these insights, we introduce a debugging approach to diagnose failure circumstances. The approach extracts fine-grained execution features and trains a decision tree to differentiate passing and failing runs. From this model, we derive a diagnosis that pinpoints faulty locations and explains the underlying causes of the failure.

Our evaluation demonstrates that the generated diagnoses achieve high predictive accuracy, reinforcing their reliability. These interpretable diagnoses empower developers to efficiently debug software by providing deeper insights into failure causes.

Study

Setup

We leverage SFLKit to collect the event data for the subjects. SFLKit instruments the subject programs so that each execution yields its event data, i.e., the sequence of events that occur during the run.

As subjects of our empirical study, we leverage Tests4Py.

The study is located in the study directory. Additionally, we have implemented a script, study.py, to run the experiments and analyze the results.

Installing Requirements

To install the requirements, run the following command inside the study directory:

python -m pip install -r requirements.txt

We recommend using a virtual environment to install the requirements. To create a virtual environment, run the following command:

python -m venv .venv

and to activate the virtual environment, run the following command:

. .venv/bin/activate

or

source .venv/bin/activate

Getting the Data Set

To get the data set, please download the data set from here and extract it to the study directory.

You can also reproduce the data by following the next section.

Reproducing the Data Set

Collecting The Event Data

To collect the event data, run the following command:

python study.py event -p <project_name> [-i <bug_id>]

For instance, to collect the event data for bug 1 of the project black, run the following command:

python study.py event -p black -i 1

The collected event data will be stored in the sflkit_events directory. Additionally, this script maps all possible events for the subjects and stores them in mappings/<project_name>_<bug_id>.json.

For example, the collected events and the mapping for bug 1 of the black project will be stored in sflkit_events/black/1/bug for the buggy version, sflkit_events/black/1/fix for the fixed version, and mappings/black_1.json for the mapping.
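The directory layout above can be captured in a small helper; `event_paths` is a hypothetical name for illustration, not part of the study scripts:

```python
from pathlib import Path

def event_paths(project, bug_id):
    # Layout described above: sflkit_events/<project>/<bug_id>/{bug,fix}
    # plus mappings/<project>_<bug_id>.json
    base = Path("sflkit_events") / project / str(bug_id)
    return base / "bug", base / "fix", Path("mappings") / f"{project}_{bug_id}.json"

bug, fix, mapping = event_paths("black", 1)
print(bug.as_posix())      # sflkit_events/black/1/bug
print(mapping.as_posix())  # mappings/black_1.json
```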

Remove the report_<project_name>.json file if you want to collect the event data from scratch.

Evaluating the Correlation and Fault Localization

To evaluate the correlation and fault localization, run the following command:

python study.py evaluate -p <project_name> [-i <bug_id>]

This command evaluates the correlation of the execution features with the failures and performs fault localization. As an intermediate step, it generates the features and their values in the analysis directory. The following command runs this step explicitly:

python study.py analyze -p <project_name> [-i <bug_id>]

The results of the correlation and fault localization will be stored in the results directory for each subject individually as a JSON file with the name <project_name>_<bug_id>.json.

If you want to evaluate the correlation and fault localization from scratch, you need to remove the corresponding files in the results directory.

To summarize the results of all subjects, run the following command:

python study.py summarize

The summarized results will be stored in a file called summary.json.
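A sketch of what such a summary step might look like, assuming only the per-subject file naming described above (`summarize` here is an illustrative stand-in, not the actual implementation):

```python
import tempfile
from pathlib import Path

# Hypothetical aggregation over per-subject result files named
# <project_name>_<bug_id>.json, as produced by the evaluate step.
def summarize(results_dir):
    summary = {}
    for f in sorted(Path(results_dir).glob("*.json")):
        project, bug_id = f.stem.rsplit("_", 1)
        summary.setdefault(project, []).append(int(bug_id))
    return summary

with tempfile.TemporaryDirectory() as d:
    for name in ["black_1.json", "black_2.json"]:
        Path(d, name).write_text("{}")
    print(summarize(d))  # {'black': [1, 2]}
```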

Execution-Feature-Driven Debugging

Installation

To install Execution-Feature-Driven Debugging (EFDD), run the following command:

python -m pip install .

Usage

To use EFDD, you first need to instrument your subject:

from efdd.events import instrument

instrument("middle.py", "tmp.py", "mapping.json")

Next, you need tests to execute so you can collect their event traces. We provide two collectors: one for unit tests and one for program inputs (system tests). You can also implement your own collector by inheriting from the base class EventCollector and implementing its collect() method. Use a collector like this:

import os

from efdd.events import SystemtestEventCollector

collector = SystemtestEventCollector(os.getcwd(), "middle.py", "tmp.py", mapping_path="mapping.json")
events = collector.get_events((passing, failing))

In this example, we use the system-test collector, which runs the program on concrete inputs; passing and failing are lists of passing and failing inputs.

Next, you can utilize the event handler to extract and build feature vectors from the event traces.

from sflkit.features.handler import EventHandler

handler = EventHandler()
handler.handle_files(events)

Now, we can leverage EFDD learning to infer a failure diagnosis.

from efdd.learning import DecisionTreeDiagnosis

debugger = DecisionTreeDiagnosis()
debugger.fit(
    handler.builder.get_all_features(),
    handler,
)

The trained model of the debugger then serves as a diagnosis that pinpoints faulty locations and explains the underlying causes of the failure.
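To illustrate the idea behind such a diagnosis, independently of the actual EFDD model, here is a minimal pure-Python sketch: a one-level decision "tree" (a stump) that picks the binary execution feature best separating passing from failing runs. All names and data below are made up for illustration:

```python
# Illustrative stump: predict "failing" iff a feature was observed,
# and keep the feature whose prediction best matches the labels.
def best_stump(runs, labels, feature_names):
    best = None
    for j, name in enumerate(feature_names):
        acc = sum(r[j] == l for r, l in zip(runs, labels)) / len(runs)
        if best is None or acc > best[1]:
            best = (name, acc)
    return best

features = ["line 7 executed", "branch 5 taken", "def-use (x: 7 -> 12)"]
runs = [[1, 0, 0], [1, 1, 1], [0, 1, 0], [1, 1, 1]]  # one feature vector per run
labels = [0, 1, 0, 1]  # 0 = passing, 1 = failing

print(best_stump(runs, labels, features))  # ('def-use (x: 7 -> 12)', 1.0)
```

A full decision tree generalizes this by splitting recursively; each path from the root to a "failing" leaf then reads as an explanation of the failure circumstances.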

We provide an example of this walk-through in evaluation/example.ipynb.

License

This project is licensed under the MIT License - see the LICENSE file for details.
