Given a bug report, suggests code files that may need to be fixed

README

PyConCodeSe is a bug localisation tool: given a bug report, it suggests 10 code files that may contain the bug.

PyConCodeSe needs a Tree-sitter grammar to parse your code (Java, C#, PHP and Rust are currently supported) and the spaCy library to parse the bug report and any natural-language text in the code.

After installing the software as per the following instructions, you will need to edit a configuration file for each code repo. The configuration tells PyConCodeSe where your code and your bug reports are.

Installation

You need at least Python 3.8 to run PyConCodeSe and we strongly recommend you install it in a virtual environment. On Unix (incl. macOS) you would type in a terminal something like:

python3.11 -m venv venv
source venv/bin/activate

to create and activate a virtual environment in subfolder venv of your current folder. Next, type

pip install py-concodese

to install everything in the current environment and create the main executable script pyconcodese. Finally, set up spaCy with

python -m spacy download en_core_web_sm

Now you can run pyconcodese from any folder, as long as you have the virtual environment activated.

Configuration

When running PyConCodeSe, if there's no file config.toml in the current folder, you're prompted to create one with example values.

pyconcodese 

The PyConcodese config file 'config.toml' was not found in this path. 
 Do you want to copy an example file here? [y/N]y
Example config.toml file copied in the current directory.  
 Please make sure you change the fields accordingly

Open the configuration file and edit the paths of the various folders:

[py-concodese]
# the path that contains grammars (e.g. the folder that contains "tree-sitter-c-master")
# If you don't have a grammar folder, set up an empty folder here and PyConCodeSe will download the grammars from GitHub
grammar_path = "/tmp/Pyconcodese/Grammars"

# the next 3 can be the root directory of the application if you wish
# the directory that you would like to store the sqlite database(s) in
sqlite_path = "/tmp/Pyconcodese"
# the directory that you would like to store the vsm data folders in
vsm_path = "/tmp/Pyconcodese"
# where to store output files generated by the application
output_path = "/tmp/Pyconcodese"

derby_path = "/tmp/Pyconcodese/derby"

The grammar_path is the folder where the Tree-sitter grammar files will be put. When running pyconcodese, you will be asked if you want to update the grammars: this will automatically download the latest version from GitHub to the grammar_path folder. You can share the grammar folder between projects, but make sure the grammar_path you select is an empty folder when running PyConCodeSe for the first time. The grammar update will fail if the folder already contains files that are not git-versioned but have the same name as the ones on GitHub.
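Before the first run, it can help to check whether the chosen folder holds anything that might clash with the download. The following is a minimal sketch, not part of PyConCodeSe: the function name `grammar_folder_problems` is hypothetical, and it assumes that each downloaded grammar lives in its own subfolder containing a `.git` directory. Anything else in the folder is reported as a possible conflict.

```python
from pathlib import Path


def grammar_folder_problems(grammar_path):
    """Return the names of entries in grammar_path that are not git checkouts.

    Assumption (not confirmed by PyConCodeSe): every downloaded grammar is a
    subfolder that contains a .git directory. Everything else is reported.
    """
    folder = Path(grammar_path)
    if not folder.exists():
        return []  # nothing there yet, so nothing can conflict
    problems = []
    for entry in sorted(folder.iterdir()):
        is_git_checkout = entry.is_dir() and (entry / ".git").exists()
        if not is_git_checkout:
            problems.append(entry.name)
    return problems


if __name__ == "__main__":
    # Path taken from the example configuration above.
    leftovers = grammar_folder_problems("/tmp/Pyconcodese/Grammars")
    if leftovers:
        print("Not git-versioned:", ", ".join(leftovers))
    else:
        print("Grammar folder looks safe to use.")
```

An empty list means the folder is empty, missing, or contains only git checkouts, which are the cases described above as safe.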

[dataset]
bug_repository_file = "/tmp/Pyconcodese/bug_repository_file.xml"
src_path = "/tmp/Pyconcodese"

The config.toml file will be created with the current path prefilled as a starting point for all these values.
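Since the prefilled values usually need editing, a quick check of the edited file can catch typos in paths before running the tool. The sketch below is an illustration only and not part of PyConCodeSe: `missing_paths` and `PATH_KEYS` are hypothetical names, and the key list is copied from the example configuration shown above (`derby_path` is left out because its role is not described here).

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for Python 3.8 to 3.10

from pathlib import Path

# Section -> keys that hold paths, as they appear in the example config.toml.
PATH_KEYS = {
    "py-concodese": ["grammar_path", "sqlite_path", "vsm_path", "output_path"],
    "dataset": ["bug_repository_file", "src_path"],
}


def missing_paths(config_text):
    """Return "section.key" names that are unset or point to a non-existent path."""
    config = tomllib.loads(config_text)
    missing = []
    for section, keys in PATH_KEYS.items():
        values = config.get(section, {})
        for key in keys:
            value = values.get(key)
            if not value or not Path(value).exists():
                missing.append(f"{section}.{key}")
    return missing


if __name__ == "__main__":
    config_file = Path("config.toml")
    if config_file.exists():
        for name in missing_paths(config_file.read_text()):
            print("Check this setting:", name)
```

Run it from the folder that contains config.toml; it prints nothing when every listed path exists.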

Dataset format

The dataset (bug reports) that you want to use should be in a particular format. Necessary fields that should be present in the dataset are: issue_id, issue_summary, issue_description, issue_status and files_changed.

The following examples show the expected format of the dataset:

JSON version

{
    "closed_issues": {
        "1": {
            "issue_id": "#3085",
            "issue_summary": "Missing PR_SET_PTRACER_ANY",
            "issue_description": "JonathanWoollett-Light…..search=PR_SET_PTRACER_ANY).",
            "issue_status": "Closed",
            "files_changed": [
                [
                    "1",
                    "libc-test/semver/fuchsia.txt"
                ],
                [
                    "1",
                    "libc-test/semver/linux.txt"
                ],
                [ 
                    "1", 
                    "src/fuchsia/mod.rs" 
                ], 
                [ 
                    "1", 
                    "src/unix/linux_like/emscripten/mod.rs" 
                ], 
                [ 
                    "1", 
                    "src/unix/linux_like/linux/mod.rs" 
                ] 
            ] 
        }, 
        "437": { 
            "issue_id": "#100", 
            "issue_summary": "Outdated MIPS toolchain", 
            "issue_description": "Contributor…..further.", 
            "issue_status": "Closed", 
            "issue_reporting_time": "", 
            "fixed_by": "#114", 
            "files_changed": [ 
                [ 
                    "1", 
                    "src/unix/notbsd/linux/mips.rs" 
                ], 
                [ 
                    "1", 
                    "src/unix/notbsd/linux/mod.rs" 
                ], 
                [ 
                    "1", 
                    "src/unix/notbsd/linux/musl.rs" 
                ], 
                [ 
                    "1", 
                    "src/unix/notbsd/linux/other/mod.rs" 
                ] 
            ] 
        } 
    } 
} 
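A dataset in this JSON layout can be checked for the necessary fields before it is handed to the tool. The sketch below is a hedged illustration, not PyConCodeSe code: the helper names are hypothetical, the top-level key `closed_issues` is taken from the example above, and each `files_changed` entry is assumed to be a two-element list whose second element is the file path (the meaning of the first element is not documented here). The second sample issue is invented to show an incomplete entry.

```python
import json

# The fields listed as necessary in the "Dataset format" section.
REQUIRED_FIELDS = (
    "issue_id",
    "issue_summary",
    "issue_description",
    "issue_status",
    "files_changed",
)


def find_missing_fields(dataset):
    """Map each issue key to the required fields it lacks; complete issues are omitted."""
    problems = {}
    for key, issue in dataset.get("closed_issues", {}).items():
        missing = [field for field in REQUIRED_FIELDS if field not in issue]
        if missing:
            problems[key] = missing
    return problems


def changed_file_paths(issue):
    """Return the file paths of one issue (second element of each files_changed entry)."""
    return [entry[1] for entry in issue["files_changed"]]


sample = json.loads("""
{
  "closed_issues": {
    "1": {
      "issue_id": "#3085",
      "issue_summary": "Missing PR_SET_PTRACER_ANY",
      "issue_description": "placeholder text",
      "issue_status": "Closed",
      "files_changed": [["1", "src/fuchsia/mod.rs"]]
    },
    "2": {"issue_id": "#0", "issue_summary": "hypothetical incomplete entry"}
  }
}
""")

print(find_missing_fields(sample))
print(changed_file_paths(sample["closed_issues"]["1"]))
```

Optional fields such as `issue_reporting_time` and `fixed_by`, which appear in the second example issue above, are ignored by this check.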

XML version

<?xml version="1.0" encoding="ISO-8859-1"?>

<bugrepository name="SWT">
  <bug id="88829" opendate="2005-03-22 20:41:00" fixdate="2005-04-06 17:05:00">
    <buginformation>
      <summary>Table.setColumnOrder() may not fire enough Move events</summary>
      <description>- start with five columns, all different widths - do Table.setColumnOrder(new int[] {4,1,2,3,0}); - SWT.Move events are fired for columns 0 and 4 because they swapped positions -&amp;gt; but Move should have been fired for all of the columns since the width of the first displayed column changed, and therefore all of the other columns are auto-shifted accordingly</description>
    </buginformation>
    <fixedFiles>
      <file>org.eclipse.swt.widgets.Table.java</file>
    </fixedFiles>
  </bug>
  <bug id="90018" opendate="2005-04-01 14:40:00" fixdate="2005-04-05 08:14:00">
    <buginformation>
      <summary>Native tooltips left around on CTabFolder</summary>
      <description>Hover over the PartStack CTabFolder inside eclipse until some native tooltip is displayed. For example, the maximize button. When the tooltip appears, change perspectives using the keybinding. the CTabFolder gets hidden, but its tooltip is permanently displayed and never goes away. Even if that CTabFolder is disposed (I'm assuming) when the perspective is closed.</description>
    </buginformation>
    <fixedFiles>
      <file>org.eclipse.swt.custom.CTabFolder.java</file>
    </fixedFiles>
  </bug>
</bugrepository>
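To inspect a bug repository in this XML layout, the standard library's ElementTree is sufficient. The following is a minimal sketch under stated assumptions, not how PyConCodeSe itself reads the file: the function name `read_bugs` is hypothetical, and the element and attribute names are taken from the example above.

```python
import xml.etree.ElementTree as ET


def read_bugs(xml_source):
    """Parse bug repository XML (str or bytes) into a list of dicts, one per bug."""
    root = ET.fromstring(xml_source)
    bugs = []
    for bug in root.iter("bug"):
        bugs.append({
            "id": bug.get("id"),
            "summary": bug.findtext("buginformation/summary", default=""),
            "description": bug.findtext("buginformation/description", default=""),
            "fixed_files": [f.text for f in bug.findall("fixedFiles/file")],
        })
    return bugs


# Shortened copy of the first bug from the example above.
sample_xml = """
<bugrepository name="SWT">
  <bug id="88829" opendate="2005-03-22 20:41:00" fixdate="2005-04-06 17:05:00">
    <buginformation>
      <summary>Table.setColumnOrder() may not fire enough Move events</summary>
      <description>placeholder text</description>
    </buginformation>
    <fixedFiles>
      <file>org.eclipse.swt.widgets.Table.java</file>
    </fixedFiles>
  </bug>
</bugrepository>
"""

for bug in read_bugs(sample_xml):
    print(bug["id"], bug["summary"], bug["fixed_files"])
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`; it reads the bytes directly and so honours the encoding named in the XML declaration.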
