A package to facilitate data-wrangling for APR tools

Project description

Introduction

Pyrepair Benchmark Wrangling is a Python package that streamlines data wrangling for Automated Program Repair (APR) benchmarks. It provides an easy-to-use command-line interface to two main components:

BugsInPy - a package to run the BugsInPy benchmark (https://dl.acm.org/doi/abs/10.1145/3368089.3417943)
diff_utils - a set of utilities for handling diffs

Features

lmeasures: A command-line tool to compute and report various metrics and measures for the benchmarks.
bgp: A command-line interface to interact with the BugsInPy benchmark suite.
sample_bip: A utility to sample bugs from the BugsInPy benchmark suite.
run_custom_patch: A tool to apply custom patches to the bugs in the BugsInPy dataset.
diff_utils: A command-line utility to handle diff files and changes.

Installation

You can install Pyr Benchmark Wrangling either directly via pip or with Docker. After cloning the repository, switch to the pyr_benchmark_wrangling directory:

cd pyr_benchmark_wrangling

Direct Installation

System Requirements:

Before using Pyr Benchmark Wrangling, make sure your system meets the following requirements:

Python 3.7 and Python 3.8
Development packages for Python 3.7 and Python 3.8
libffi7 library

On a Debian-based system, you can install these requirements using apt-get:

sudo apt-get install python3.7 python3.7-dev python3.8 python3.8-dev libffi7
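
Before installing, it can help to confirm that both interpreters are reachable. The following is a minimal sketch, not part of the package: the helper name find_interpreters is hypothetical, and the check only looks for python3.7 and python3.8 on PATH. It does not verify the development packages or libffi7.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_interpreters(
    names: Iterable[str] = ("python3.7", "python3.8"),
) -> Dict[str, Optional[str]]:
    """Map each interpreter name to its location on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


# Report which of the required interpreters were found.
for name, path in find_interpreters().items():
    print(f"{name}: {path or 'not found on PATH'}")
```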

Pip Command:

You can install Pyr Benchmark Wrangling by running the following command:

pip install .

Using Docker

Pyr Benchmark Wrangling's Docker Space Requirements:

Lite image: 2.8 GB
Full image: 20 GB

The two images differ in how the virtual environments are provided: the lite image constructs them lazily, while the full image downloads them in advance.

To build the lite image (2.8 GB), use the following command:

docker build --target lite -t pyr:lite .

This executes all instructions in the Dockerfile up to and including the lite stage. The lite image automatically runs update_bug_records and clones all repositories.

To build the full image (20 GB), use the following command:

docker build --target full -t pyr:full .

This executes all instructions in the Dockerfile. The full image automatically runs update_bug_records, clones all repositories, and installs all required environments.

BugsInPy CLI

The BugsInPy CLI is a command-line tool for interacting with and running Python bugs from the BugsInPy dataset. It streamlines setting up bug repositories, cloning specific bugs or repositories, preparing the environment, running tests, and more. This tool requires Python 3.10 or above. The available commands are described below.

Usage

`setup` Command

The setup command is used to set up the BugsInPy repository. This step is essential before working with any bugs. It clones the BugsInPy repository to your local system.

bgp setup

`clone` Command

The clone command allows you to clone specific bugs or repositories based on your requirements. You can specify the bugs to clone using the --bug_list flag or repositories using the --repo_list flag.

Example:

bgp clone --bugids repo1:id1,repo2:id2,...,repo3:id3
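
When the bug list is produced by a script, the comma-delimited repo:id value can be assembled programmatically. The sketch below is illustrative only: format_bugids and parse_bugids are hypothetical helpers, not part of the package, and they assume that bug ids are integers.

```python
from typing import Iterable, List, Tuple


def format_bugids(bugs: Iterable[Tuple[str, int]]) -> str:
    """Join (repo, bug id) pairs into the repo:id,repo:id form used by --bugids."""
    return ",".join(f"{repo}:{bug_id}" for repo, bug_id in bugs)


def parse_bugids(value: str) -> List[Tuple[str, int]]:
    """Split a repo:id,repo:id string back into (repo, bug id) pairs."""
    pairs = []
    for item in value.split(","):
        repo, _, bug_id = item.strip().partition(":")
        # Assumption: ids are non-negative integers.
        if not repo or not bug_id.isdigit():
            raise ValueError(f"expected repo:id, got {item!r}")
        pairs.append((repo, int(bug_id)))
    return pairs


# Repository names and ids here are placeholders for illustration.
print(format_bugids([("black", 1), ("tqdm", 3)]))
```

The resulting string can be passed as the value of --bugids in the commands in this section.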

`checkout_buggy` and `checkout_fixed` Commands

These commands check out the buggy or fixed version of a specific bug repository. You provide the bug ID to identify the repository.

To check out the buggy version:

bgp checkout_buggy --bugids repo:<bug_id>

To check out the fixed version:

bgp checkout_fixed --bugids repo:<bug_id>

`extract_features` Command

The extract_features command extracts features of a specific bug.

Example:

bgp extract_features --bugids repo:<bug_id>

`prep` Command

The prep command prepares the environment for a specific bug. It installs the required dependencies and performs sanity checks to ensure the bug can be tested successfully. The commands setup and clone should be run before prep.

Example:

bgp prep --bugids repo:<bug_id>

`run_test` Command

The run_test command runs the tests for a specific list of bugs by executing the test commands associated with each bug. The commands setup, clone, and prep should be run before run_test.

bgp run_test --bugids repo:<bug_id>
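
The required ordering (setup, then clone, then prep, then run_test) can be scripted. The following is a minimal sketch using only the invocations shown in this section. The function names are hypothetical, and the sketch assumes that bgp exits with a non-zero status when a step fails.

```python
import subprocess
from typing import List


def workflow_commands(bugids: str) -> List[List[str]]:
    """Return the bgp invocations in the order the commands must be run."""
    return [
        ["bgp", "setup"],
        ["bgp", "clone", "--bugids", bugids],
        ["bgp", "prep", "--bugids", bugids],
        ["bgp", "run_test", "--bugids", bugids],
    ]


def run_workflow(bugids: str) -> None:
    """Run each step in turn; check=True raises at the first failing step."""
    for cmd in workflow_commands(bugids):
        print("running:", " ".join(cmd))
        # Assumption: a failed step is signalled by a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling run_workflow with a value such as repo:<bug_id> runs the four steps in sequence and stops as soon as one of them fails.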

`delete_bug_repo` Command

The delete_bug_repo command deletes a specific bug repository from your local system.

Example:

bgp delete_bug_repo <bug_id>

Additional Notes

The CLI provides options to control the verbosity of the prep step and set the log level.
Mutually exclusive flags such as --bug_list and --repo_list select the bugs or repositories on which a command should be run.
You can adjust the timeout for various system calls using the --timeout flag.

Unsupported repos

The following repositories are unsupported:

Spacy: it requires a Python version < "3.4".

diff_utils

diff_utils is a Python module for analyzing and extracting data from unified diff outputs generated by tools such as Git. The module can compute localization measures on a single file diff or across multiple file diffs, and can extract the modified line numbers and file names from diffs.

Features

Compute hunk statistics such as count, gaps, and spans from a single file diff.
Aggregate hunk information across multiple file diffs to calculate comprehensive statistics.
Extract modified files and their respective line changes from a unified diff.
Write the extracted data to CSV files for further analysis.
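
To illustrate what hunk statistics of this kind can look like, here is a minimal sketch that reads the hunk headers of a single file diff. It is not the package's implementation, and the definitions are assumptions made for the example: a hunk's range is taken from the old-file side of its @@ header (context lines included), a gap is the number of lines between consecutive hunks, and the span runs from the first line of the first hunk to the last line of the last hunk.

```python
import re
from typing import Dict, List, Tuple

# Matches headers such as "@@ -3,4 +3,4 @@"; the lengths are optional.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def hunk_ranges(diff_text: str) -> List[Tuple[int, int]]:
    """Return (first_line, last_line) on the old-file side for each hunk."""
    ranges = []
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            start = int(match.group(1))
            length = int(match.group(2)) if match.group(2) is not None else 1
            # Simplification: a zero-length hunk is treated as one line.
            ranges.append((start, start + max(length, 1) - 1))
    return ranges


def hunk_statistics(diff_text: str) -> Dict[str, object]:
    """Compute hunk count, gaps between hunks, and overall span (assumed definitions)."""
    ranges = hunk_ranges(diff_text)
    gaps = [nxt[0] - cur[1] - 1 for cur, nxt in zip(ranges, ranges[1:])]
    span = ranges[-1][1] - ranges[0][0] + 1 if ranges else 0
    return {"count": len(ranges), "gaps": gaps, "span": span}


SAMPLE = "\n".join([
    "--- a/example.py",
    "+++ b/example.py",
    "@@ -3,4 +3,4 @@",
    " ctx",
    "-old",
    "+new",
    " ctx",
    " ctx",
    "@@ -20,3 +20,4 @@",
    " ctx",
    "+added",
    " ctx",
    " ctx",
])

print(hunk_statistics(SAMPLE))  # {'count': 2, 'gaps': [13], 'span': 20}
```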

Usage

The diff_utils module provides a set of functions that can be used independently or through a command-line interface.

Command-line Interface

The module can be run as a script to perform actions based on the arguments provided:

--measure: Accepts a comma-delimited list of diff files for which to measure localization metrics.
--locations: Accepts a comma-delimited list of diff files from which to extract location sets.
--quiet: Quiet mode, which suppresses the standard output.
--output: Specifies the output CSV file name.

Example Usage:

> diff_utils --measure "diff_file1.txt,diff_file2.txt" --output "measures_output.csv"
> diff_utils --locations "diff_file1.txt,diff_file2.txt" --output "locations_output.csv"

Module Functions

You can also use the functions provided by diff_utils in a Python script:

from diffutils import measure_localisation_diff_file, locations_from_diff_file

# Measure localization metrics for a given diff file
metrics = measure_localisation_diff_file("diff_file.txt")

# Extract locations from a diff file
locations = locations_from_diff_file("diff_file.txt")
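
To make the idea of extracting locations concrete, the sketch below walks a unified diff and records, per file, which old-file lines were removed and which new-file lines were added. It is an independent illustration and does not reproduce what locations_from_diff_file returns. The function name modified_lines and the shape of the result are assumptions made for the example.

```python
import re
from typing import Dict, List

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def modified_lines(diff_text: str) -> Dict[str, Dict[str, List[int]]]:
    """Map each file to the removed (old-file) and added (new-file) line numbers."""
    result: Dict[str, Dict[str, List[int]]] = {}
    current = None
    old_no = new_no = 0
    old_left = new_left = 0  # lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: classify the line by its first character.
            if line.startswith("-"):
                result[current]["removed"].append(old_no)
                old_no += 1
                old_left -= 1
            elif line.startswith("+"):
                result[current]["added"].append(new_no)
                new_no += 1
                new_left -= 1
            elif line.startswith("\\"):
                continue  # "\ No newline at end of file" marker
            else:
                old_no += 1
                new_no += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            # New-file header; drop a trailing timestamp and Git's "b/" prefix.
            path = line[4:].split("\t")[0]
            current = path[2:] if path.startswith("b/") else path
            result.setdefault(current, {"removed": [], "added": []})
            continue
        match = HUNK_HEADER.match(line)
        if match and current is not None:
            old_no, new_no = int(match.group(1)), int(match.group(3))
            old_left = int(match.group(2)) if match.group(2) is not None else 1
            new_left = int(match.group(4)) if match.group(4) is not None else 1
    return result


SAMPLE = "\n".join([
    "--- a/example.py",
    "+++ b/example.py",
    "@@ -3,4 +3,4 @@",
    " ctx",
    "-old",
    "+new",
    " ctx",
    " ctx",
    "@@ -20,3 +20,4 @@",
    " ctx",
    "+added",
    " ctx",
    " ctx",
])

print(modified_lines(SAMPLE))
```

Tracking the remaining line counts from each hunk header, rather than matching prefixes alone, keeps a removed line whose content begins with "-- " from being mistaken for a file header.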

Project details

Release history

0.0.11 (this version): Mar 12, 2024
0.0.10: Mar 11, 2024
0.0.9: Mar 10, 2024
0.0.8: Feb 29, 2024
0.0.7: Feb 29, 2024
0.0.6: Feb 29, 2024
0.0.5: Feb 21, 2024
0.0.4: Jan 27, 2024
0.0.3: Jan 27, 2024
0.0.2: Jan 17, 2024
0.0.1: Jan 17, 2024

Download files

Source Distribution

pyr_benchmark_wrangling-0.0.11.tar.gz (49.2 kB), uploaded Mar 12, 2024, Source

Hashes for pyr_benchmark_wrangling-0.0.11.tar.gz
Algorithm	Hash digest
SHA256	`a5357af208d17d7668343c04f128d7ae91135ab9b526acbf635ceca7148b4b51`
MD5	`98eba70b9f2c3851f9ba785adb00d10c`
BLAKE2b-256	`67cf2c1240e5a8d0207fa12c28b38e0cbd5e4c551bbe9cfd935ddfb7c3071b60`

Built Distribution

pyr_benchmark_wrangling-0.0.11-py3-none-any.whl (52.2 kB), uploaded Mar 12, 2024, Python 3

Hashes for pyr_benchmark_wrangling-0.0.11-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1746ee27fea452fd7bda5cdc709825e437c0a3e0e43504d3ce677da58b43b886`
MD5	`757e60a24125787fe42161bbef70dcce`
BLAKE2b-256	`c96a1fd513e6d19c919013ca7ac6eb5ce0298da744043181f98686769a466043`
