tree-climber

Program analysis tools built on tree-sitter. Currently supports only C.

Try it out

Install from pip:

pip install tree_climber

or run from source:

# install deps
pip install -r requirements.txt
# run on a test program :)
python tree_climber tests/data/example.c --draw_ast --draw_cfg --draw_duc

On Fedora 36, also install these system packages:

sudo dnf install graphviz-devel python3-tkinter

Feel free to open a PR with other platform-specific instructions.

See developers.md for developer setup instructions.

Table of contents

  1. Features
    1. Visualize AST
    2. Construct and visualize Control-flow graph (CFG)
    3. Monotonic dataflow analysis
    4. Construct and visualize Def-use chain (DUC)
    5. Construct and visualize Code Property Graph (CPG)
  2. Contribute

Features

The examples below use tests/data/example.c:

int main()
{
    int x = 0;
    x = x + 1;
    if (x > 1) {
        x += 5;
    }
    else {
        x += 50;
    }
    x = x + 2;
    for (int i = 0; i < 10; i ++) {
        x --;
    }
    x = x + 3;
    while (x < 0) {
        x ++;
        x = x + 1;
    }
    x = x + 4;
    return x;
}

Visualize AST

Visualize the AST, leaving out the concrete-syntax tokens that tree-sitter includes in its parse tree:

python main.py tests/data/example.c --draw_ast

Example: AST example
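
To illustrate what dropping concrete tokens can look like, here is a minimal sketch. It assumes the filtering keeps only nodes that tree-sitter marks as named (the `is_named` attribute on tree-sitter nodes); whether tree-climber uses exactly this rule is an assumption. The `Node` class, `abstract`, and `dump` are hypothetical stand-ins so the snippet runs without tree-sitter installed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a tree-sitter node (type, is_named, children)."""
    type: str
    is_named: bool = True
    children: List["Node"] = field(default_factory=list)


def abstract(node: Node) -> Node:
    """Return a copy of the tree that keeps only named descendants."""
    kept = [abstract(child) for child in node.children if child.is_named]
    return Node(node.type, node.is_named, kept)


def dump(node: Node, depth: int = 0) -> List[str]:
    """Render the tree as indented lines, one node type per line."""
    lines = ["  " * depth + node.type]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Hand-built concrete tree for the statement `x = x + 1;`.
# The tokens "=", "+", and ";" are anonymous (is_named=False).
stmt = Node("expression_statement", children=[
    Node("assignment_expression", children=[
        Node("identifier"),
        Node("=", is_named=False),
        Node("binary_expression", children=[
            Node("identifier"),
            Node("+", is_named=False),
            Node("number_literal"),
        ]),
    ]),
    Node(";", is_named=False),
])

print("\n".join(dump(abstract(stmt))))
```

The printed tree contains only the statement, the assignment, the binary expression, and their identifier and literal leaves; the punctuation tokens are gone.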

Construct and visualize Control-flow graph (CFG)

Converts the tree-sitter AST to a CFG for C programs. The AST-to-CFG algorithm is based on Joern's, specifically CfgCreator.scala.

Visualize CFG:

python main.py tests/data/example.c --draw_cfg

Example: CFG example
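
The general idea of turning structured statements into a CFG can be sketched as follows. This is a simplified illustration, not the algorithm in Joern's CfgCreator.scala or in tree-climber: the tuple-based statement encoding and the `build_cfg` function are hypothetical, statement text doubles as the node ID (so it must be unique), and `return` is treated like any other statement.

```python
def build_cfg(stmts):
    """Return CFG edges for a list of structured statements.

    Statement forms (hypothetical encoding):
      ("stmt", label)
      ("if", cond, then_stmts, else_stmts)
      ("while", cond, body_stmts)
    """
    edges = []

    def link(preds, node):
        # Connect every pending predecessor to `node`.
        edges.extend((p, node) for p in preds)

    def seq(stmts, preds):
        # Thread control flow through the statements in order;
        # return the nodes that flow out of the last one.
        for s in stmts:
            preds = one(s, preds)
        return preds

    def one(s, preds):
        kind = s[0]
        if kind == "stmt":
            link(preds, s[1])
            return [s[1]]
        if kind == "if":
            _, cond, then, orelse = s
            link(preds, cond)
            # Both branches start at the condition and rejoin afterwards.
            return seq(then, [cond]) + seq(orelse, [cond])
        if kind == "while":
            _, cond, body = s
            link(preds, cond)
            link(seq(body, [cond]), cond)  # back edge from the body end
            return [cond]                  # loop exits from the condition
        raise ValueError(f"unknown statement kind: {kind}")

    link(seq(stmts, ["ENTRY"]), "EXIT")
    return edges


# A shortened version of tests/data/example.c.
prog = [
    ("stmt", "x = 0"),
    ("if", "x > 1", [("stmt", "x += 5")], [("stmt", "x += 50")]),
    ("while", "x < 0", [("stmt", "x++")]),
    ("stmt", "return x"),
]

for src, dst in build_cfg(prog):
    print(f"{src} -> {dst}")
```

Each construct returns the set of nodes that control can leave from, which is what lets the two `if` branches rejoin at the next statement and the `while` body loop back to its condition.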

Monotonic dataflow analysis

See dataflow_solver.py.
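
For readers unfamiliar with this kind of analysis, here is a generic worklist sketch for a forward analysis with set union as the join, instantiated as reaching definitions. It is not the contents of dataflow_solver.py; the function name `solve_forward`, the `gen`/`kill` encoding, and the numbered toy CFG are all assumptions made for illustration.

```python
from collections import deque


def solve_forward(nodes, edges, gen, kill):
    """Iterate out[n] = gen[n] | (in[n] - kill[n]) to a fixed point.

    in[n] is the union of out[p] over all predecessors p of n.
    """
    preds = {n: [] for n in nodes}
    succs = {n: [] for n in nodes}
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)

    inn = {n: frozenset() for n in nodes}
    out = {n: frozenset() for n in nodes}
    work = deque(nodes)
    while work:
        n = work.popleft()
        inn[n] = frozenset().union(*(out[p] for p in preds[n]))
        new_out = gen[n] | (inn[n] - kill[n])
        if new_out != out[n]:
            out[n] = new_out
            work.extend(succs[n])  # successors must be re-examined
    return inn, out


# Toy CFG:  1: x = 0;  2: while (x < 0)  3: x = x + 1;  4: return x;
# A definition is identified by the node that makes it.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 2), (2, 4)]
gen = {1: frozenset({1}), 2: frozenset(), 3: frozenset({3}), 4: frozenset()}
kill = {1: frozenset({3}), 2: frozenset(), 3: frozenset({1}), 4: frozenset()}

reach_in, reach_out = solve_forward(nodes, edges, gen, kill)
print(sorted(reach_in[4]))  # definitions of x that may reach `return x`
```

Because the transfer functions are monotone and the sets can only grow within a finite universe of definitions, the loop is guaranteed to terminate.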

Construct and visualize Def-use chain (DUC)

Visualize DUC:

python main.py tests/data/example.c --draw_duc

Example: DUC example
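
A def-use chain can be derived from reaching-definitions results: a definition is linked to a node if it reaches that node and the node reads the defined variable. The sketch below assumes the reaching sets have already been computed (they are written out by hand here for a four-node toy program); `def_use_edges` and its parameter names are hypothetical and not tree-climber's API.

```python
def def_use_edges(reach_in, var_defined, vars_used):
    """Link each reaching definition to the nodes that read its variable.

    reach_in:    node -> set of definition nodes reaching the node's entry
    var_defined: definition node -> name of the variable it assigns
    vars_used:   node -> set of variable names the node reads
    """
    edges = set()
    for node, used in vars_used.items():
        for d in reach_in[node]:
            if var_defined[d] in used:
                edges.add((d, node))
    return edges


# Toy program:  1: x = 0;  2: while (x < 0)  3: x = x + 1;  4: return x;
reach_in = {1: set(), 2: {1, 3}, 3: {1, 3}, 4: {1, 3}}
var_defined = {1: "x", 3: "x"}
vars_used = {1: set(), 2: {"x"}, 3: {"x"}, 4: {"x"}}

for d, u in sorted(def_use_edges(reach_in, var_defined, vars_used)):
    print(f"def at {d} -> use at {u}")
```

Note the self-edge on node 3: `x = x + 1` inside the loop reads the value written by its own previous iteration.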

Construct and visualize Code Property Graph (CPG)

The CPG composes the AST, CFG, and DUC into one graph for combined analysis. The eventual goal is feature parity with how Joern is used in ML4SE.

Visualize the CPG (edges are color-coded: black = AST, blue = CFG, red = DUC):

python main.py tests/data/example.c --draw_cpg

Example: CPG example
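
Conceptually, composing the three graphs amounts to taking the union of their edges over a shared node set and tagging each edge with its origin. The sketch below shows this with plain tuples and emits Graphviz DOT text using the color scheme described above. The function names and the tiny edge lists are hypothetical; tree-climber's own graph representation may differ.

```python
# Edge colors as described in this README: black = AST, blue = CFG, red = DUC.
COLORS = {"AST": "black", "CFG": "blue", "DUC": "red"}


def compose_cpg(ast_edges, cfg_edges, duc_edges):
    """Merge three edge lists into one list of (src, dst, kind) triples."""
    cpg = []
    for kind, edges in (("AST", ast_edges), ("CFG", cfg_edges), ("DUC", duc_edges)):
        cpg.extend((src, dst, kind) for src, dst in edges)
    return cpg


def to_dot(cpg):
    """Render the merged graph as Graphviz DOT text with color-coded edges."""
    lines = ["digraph cpg {"]
    for src, dst, kind in cpg:
        lines.append(f'  "{src}" -> "{dst}" [color={COLORS[kind]}, label="{kind}"];')
    lines.append("}")
    return "\n".join(lines)


# Illustrative edges for the first two statements of example.c.
ast_edges = [("main", "x = 0"), ("main", "x = x + 1")]
cfg_edges = [("x = 0", "x = x + 1")]
duc_edges = [("x = 0", "x = x + 1")]

cpg = compose_cpg(ast_edges, cfg_edges, duc_edges)
print(to_dot(cpg))
```

Keeping the edge kind on every triple means the same pair of nodes can be connected by both a CFG and a DUC edge, as happens here between the two assignments.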

Contribute

Open issues on GitHub.

Stress test (Jun 16 2022, outdated)

The script tests/vs-joern/parse.sh runs Joern and tree-sitter side by side to compare performance. Use joern-install.sh to install Joern first.

Benchmark 1: a long, trivial file containing 10,000 lines of x++. Output from 2022-06-15 19:44, using Joern v1.1.891:

(tree-sitter-py38) benjis@AM:~/code/ts$ bash tests/vs-joern/parse.sh --joern tests/data/10000.c
executing /home/benjis/code/ts/tests/vs-joern/get_func_graph.scala with params=Map(filename -> tests/data/10000.c)
Compiling /home/benjis/code/ts/tests/vs-joern/get_func_graph.scala
creating workspace directory: /home/benjis/code/ts/workspace
Creating project `10000.c` for code at `tests/data/10000.c`
moving cpg.bin.zip to cpg.bin because it is already a database file
Creating working copy of CPG to be safe
Loading base CPG from: /home/benjis/code/ts/workspace/10000.c/cpg.bin.tmp
Code successfully imported. You can now query it using `cpg`.
For an overview of all imported code, type `workspace`.
Adding default overlays to base CPG
The graph has been modified. You may want to use the `save` command to persist changes to disk.  All changes will also be saved collectively on exit
script finished successfully
Some(())

real    0m14.143s
user    0m44.302s
sys     0m1.260s
(tree-sitter-py38) benjis@AM:~/code/ts$ bash tests/vs-joern/parse.sh --tree-sitter tests/data/10000.c

real    0m1.503s
user    0m1.385s
sys     0m0.111s

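Benchmark 2: Linux kernel 5.18.4. The tree-sitter run finished in under 10 minutes; the Joern run was killed and errored out after about 500 minutes. Output from 2022-06-15 21:51, using Joern v1.1.891: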

(tree-sitter-py38) benjis@AM:~/code/ts$ time python main.py linux-5.18.4 --cfg --file > output_treesitter.txt

real    9m47.570s
user    9m4.308s
sys     0m5.854s

(base) benjis@AM:~/code/ts$ time ./joern/joern-cli/joern --script ./tests/vs-joern/get_func_graph.scala --params filename=linux-5.18.4
executing /home/benjis/code/ts/tests/vs-joern/get_func_graph.scala with params=Map(filename -> linux-5.18.4)
Compiling /home/benjis/code/ts/tests/vs-joern/get_func_graph.scala
creating workspace directory: /home/benjis/code/ts/workspace
Creating project `linux-5.18.4` for code at `linux-5.18.4`
Killed
Error running shell command: List(/home/benjis/code/ts/joern/joern-cli/c2cpg.sh, linux-5.18.4, --output, /home/benjis/code/ts/workspace/linux-5.18.4/cpg.bin.zip)
Exception in thread "main" java.lang.AssertionError: script errored: 
	at io.joern.console.ScriptExecution.runScript(BridgeBase.scala:253)
	at io.joern.console.ScriptExecution.runScript$(BridgeBase.scala:229)
	at io.joern.joerncli.console.AmmoniteBridge$.runScript(AmmoniteBridge.scala:5)
	at io.joern.console.BridgeBase.runAmmonite(BridgeBase.scala:164)
	at io.joern.console.BridgeBase.runAmmonite$(BridgeBase.scala:146)
	at io.joern.joerncli.console.AmmoniteBridge$.runAmmonite(AmmoniteBridge.scala:5)
	at io.joern.joerncli.console.AmmoniteBridge$.delayedEndpoint$io$joern$joerncli$console$AmmoniteBridge$1(AmmoniteBridge.scala:7)
	at io.joern.joerncli.console.AmmoniteBridge$delayedInit$body.apply(AmmoniteBridge.scala:5)
	at scala.Function0.apply$mcV$sp(Function0.scala:39)
	at scala.Function0.apply$mcV$sp$(Function0.scala:39)
	at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
	at scala.App.$anonfun$main$1(App.scala:76)
	at scala.App.$anonfun$main$1$adapted(App.scala:76)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
	at scala.App.main(App.scala:76)
	at scala.App.main$(App.scala:74)
	at io.joern.joerncli.console.AmmoniteBridge$.main(AmmoniteBridge.scala:5)
	at io.joern.joerncli.console.AmmoniteBridge.main(AmmoniteBridge.scala)
Caused by: io.joern.console.ConsoleException: Error creating project for input path: `linux-5.18.4`

real	499m56.583s
user	1193m14.686s
sys	7m26.020s
