Skip to main content

Python Library for Graph Program Analysis

Project description

GPA

badge1 badge2 badge3 badge4

Graph Program Analysis Library in Python.

GPA package aims to provide better functions for researchers to analyze the code graphs.

Installation

First, install the pre-required dependencies.

pip install -r requirements.tx

Then install our graph program analysis package.

pip install GPA

Usage

For example, the following BinarySearch code can be represented as a graph using tools like joern, then we can analyze the code graph via GPA.

int BinarySearch(int x, int a[], int length) {
    int L = 0;
    int R = length - 1;
    do {
        int m = (L + R) / 2;
        if (x == a[n]){
            return m;
        } else if (x > a[m]){
            L = m + 1;
        } else {
            R = m - 1;
        }
    } while (L <= R);
    return -1;
}

STEP 1: Generate a code graph

from GPA.graphs import CodePropGraph
g = CodePropGraph(name="example")

STEP 2: Add nodes to the graph

g.addnodes("N1", "int L=0; int R=length-1;")
g.addnodes("N2", "int m=(L+R)/2;")
g.addnodes("N3", "if(x==a[n])")
g.addnodes("N4", "return m;")
g.addnodes("N5", "if (x<a[m])")
g.addnodes("N6", "L=m+1;")
g.addnodes("N7", "R=m-1;")
g.addnodes("N8", "while(L<=R)")
g.addnodes("N9", "return -1;")
print(g.nodes[0], g.num_nodes)

You will see the graph contains 9 nodes:

<GPA.modules.basic.CodeNode object at 0x7f7df8583dc0> 9

STEP 3: Add edges to the graph

g.addedges("N1", "N2")
g.addedges("N2", "N3")
g.addedges("N3", "N4", edge_attr="True")
g.addedges("N3", "N5", edge_attr="False")
g.addedges("N5", "N6", edge_attr="True")
g.addedges("N5", "N7", edge_attr="False")
g.addedges("N6", "N8")
g.addedges("N7", "N8")
g.addedges("N8", "N2", edge_attr="True")
g.addedges("N8", "N9", edge_attr="False")
print(g.edges[0], g.num_edges)

You will see the graph contains 10 edges.

<GPA.modules.basic.CodeEdge object at 0x7f7df857d5e0> 10

Example 1: control flow graph walk

  • set the initial node.
start_node = "N1"
  • start walk through the graph. Note that all the possible nodes will be marked.
start_node = g.nextstep(start_node, direction='forward')
dot = g.draw(g.nodes2mask(start_node))
display(dot)

Then you can see the next node(s).

Keep running the above code in tests/test_cpg.ipynb to see the execution paths.

Results-of-Graph-Walk

Example 2: code slicing

code slicing is a way to analyze the local block of specific statements.

In this example, we show the 2-hop neighbors of control flow relationship with statement if (x < a[m]).

slice_mask = [1 if 4 == idx else 0 for idx in range(g.num_nodes)]
mask = g.slicing(slice_mask, neighbor=2, direction='bidirect')
dot = g.draw(mask)
display(dot)

We can see all the relevant statement are marked.

We can choose to use forward, backward, or bidirect slicing methods.

Results-of-Code-Slicing

License

This package is under GNU General Public License v3 (GPLv3).

Notes:

If you find any issues on the package or have any ideas to share, please contact us.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

GPA-0.1.0-py2.py3-none-any.whl (17.1 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page