Python library for code analysis with CPG and Joern
Introduction
This project offers a high-level Python library for code analysis with CPG (code property graph) and the Joern server. Its API methods, including integrations with NetworkX and PyTorch Geometric, let you analyze and research complex code bases in a Pythonic manner from the CLI and from notebooks.
pip install joern-lib
# To install the optional science pack, clone this repo and use poetry > 1.5 to install the science group
poetry install --with science # cpu
poetry install --with science-cu117 # cuda 11.7
poetry install --with science-cu118 # cuda 11.8
Notebook support
The repository includes a Docker Compose configuration for querying the Joern server interactively with Polynote notebooks.
Usage
Run the Joern server and Polynote locally.
git clone https://github.com/appthreat/joern-lib.git
# Edit docker-compose.yml to set sources directory
docker compose up -d
# podman-compose up --build
Navigate to http://localhost:8192 for an interactive Polynote notebook. Open one of the sample notebooks in the contrib directory to learn about the Joern server and this library.
Common steps
Refer to the API documentation for programmatic usage. The snippets below use top-level await, so run them in the asyncio REPL (Python 3.8 or later):
python -m asyncio
Execute single query
from joern_lib import client, workspace, utils
from joern_lib.detectors import common as cpg
connection = await client.get("http://localhost:9000", "http://localhost:7072", "admin", "admin")
# connection = await client.get("http://localhost:9000")
res = await client.q(connection, "val a=1")
# {'response': 'a: Int = 1\n'}
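The response shown above is the raw REPL echo in the form name: Type = value. As a minimal sketch, assuming that shape holds for simple val definitions, a small helper can split it into its parts. The name parse_repl_response is hypothetical and is not part of joern-lib.

```python
import re

# Matches REPL echoes such as "a: Int = 1" (assumed shape, taken from the example above).
_REPL_LINE = re.compile(r"^(?P<name>\w+): (?P<type>.+?) = (?P<value>.*)$", re.DOTALL)


def parse_repl_response(res):
    """Split a {'response': 'name: Type = value\\n'} dict into its parts.

    Returns None when the response does not look like a val definition.
    """
    text = res.get("response", "").strip()
    match = _REPL_LINE.match(text)
    return match.groupdict() if match else None


parsed = parse_repl_response({"response": "a: Int = 1\n"})
print(parsed)
```

Responses that do not follow this shape (for example error messages) yield None, so callers can fall back to the raw string.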
Execute bulk query
res = await client.bulk_query(connection, ["val a=1", "val b=2", "val c=a+b"])
# [{'response': 'a: Int = 1\n'}, {'response': 'b: Int = 2\n'}, {'response': 'c: Int = 3\n'}]
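When many queries are sent at once, it can help to keep each query next to its response. The sketch below assumes, as the example output suggests, that bulk_query returns one result per query in the same order. The helper name pair_results is hypothetical.

```python
def pair_results(queries, results):
    """Map each query string to its stripped response text.

    Assumes results[i] belongs to queries[i]; this ordering is inferred
    from the example output, not from the joern-lib documentation.
    """
    return {q: r.get("response", "").strip() for q, r in zip(queries, results)}


queries = ["val a=1", "val b=2", "val c=a+b"]
# Sample data copied from the example output above.
results = [
    {"response": "a: Int = 1\n"},
    {"response": "b: Int = 2\n"},
    {"response": "c: Int = 3\n"},
]
paired = pair_results(queries, results)
print(paired["val c=a+b"])
```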
Workspace
List workspaces
res = await workspace.ls(connection)
Get workspace path
res = await workspace.get_path(connection)
# /workspace (the raw response is parsed into a plain path)
Check if cpg exists
await workspace.cpg_exists(connection, "NodeGoat")
Import code for analysis
res = await workspace.import_code(connection, "/app", "NodeGoat")
# True
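The two calls above combine naturally: import the code only when no CPG exists yet. The following is a sketch, not the library's own workflow. It assumes cpg_exists returns a truthy value when the CPG is present and import_code returns True on success (as in the example above). The function ensure_cpg and the stub class are hypothetical; the stub only stands in for joern_lib.workspace so the snippet runs without a server.

```python
import asyncio


async def ensure_cpg(workspace, connection, src_dir, project_name):
    """Import src_dir as project_name unless a CPG with that name exists."""
    if await workspace.cpg_exists(connection, project_name):
        return "existing"
    imported = await workspace.import_code(connection, src_dir, project_name)
    return "imported" if imported else "failed"


class StubWorkspace:
    """Hypothetical stand-in exposing the two calls used above."""

    def __init__(self, exists):
        self.exists = exists
        self.calls = []

    async def cpg_exists(self, connection, project_name):
        self.calls.append("cpg_exists")
        return self.exists

    async def import_code(self, connection, src_dir, project_name):
        self.calls.append("import_code")
        return True


stub = StubWorkspace(exists=False)
status = asyncio.run(ensure_cpg(stub, None, "/app", "NodeGoat"))
print(status, stub.calls)
```

With a real server, pass the joern_lib workspace module and a connection from client.get in place of the stub and None.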
Import an existing CPG for analysis
res = await workspace.import_cpg(connection, "/app/sandbox/crAPI/cpg_out/crAPI-python-cpg.bin.zip", "crAPI-python")
Create a CPG with a remote cpggen server
res = await workspace.create_cpg(connection, "/app/sandbox/crAPI", out_dir="/app/sandbox/crAPI/cpg_out", languages="python", project_name="crAPI-python")
CPG core
List files
res = await cpg.list_files(connection)
# list of files
Print call tree
res = await cpg.get_call_tree(connection, "com.example.vulnspring.WebController.issue:java.lang.String(org.springframework.ui.Model,java.lang.String)")
utils.print_tree(res)
Java specific
from joern_lib.detectors import java
List http routes
await java.list_http_routes(connection)
JavaScript specific
from joern_lib.detectors import js
List http routes
await js.list_http_routes(connection)
Name of the variable containing express()
await js.get_express_appvar(connection)
List of require statements
await js.list_requires(connection)
List of import statements
await js.list_imports(connection)
List of NoSQL DB collection names
await js.list_nosql_collections(connection)
Get HTTP sources and sinks
await js.get_http_sources(connection)
await js.get_http_sinks(connection)
AWS
Requires a TypeScript project.
await js.list_aws_modules(connection)
Troubleshooting
No response from server
If the Joern server stops responding after a while, restart the Docker containers.
docker compose down
docker compose up -d
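After a restart the containers may need a few seconds before the server accepts connections. A minimal sketch of waiting for that, using only the standard library: the helper wait_for_port is hypothetical, and port 9000 is taken from the connection example earlier in this document.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows the port is open,
            # not that the server is fully initialized.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (not run here): wait_for_port("localhost", 9000)
```

A True result is a reasonable signal to retry client.get, but a query may still fail if the server is not ready yet.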
Websockets connection closed error
Adding an await asyncio.sleep(0) call seems to fix such errors.
# Workaround to fix websockets.exceptions.ConnectionClosedError
await asyncio.sleep(0)
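What asyncio.sleep(0) does is hand control back to the event loop for one iteration, which lets other pending tasks run. Whether this is why the workaround above helps with the websockets error is an assumption; the sketch below only illustrates the yielding behavior with two hypothetical workers.

```python
import asyncio


async def worker(name, steps, log):
    """Record each step, yielding to the event loop after every one."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        await asyncio.sleep(0)  # give other tasks a chance to run


async def main():
    log = []
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


log = asyncio.run(main())
print(log)  # the two workers interleave: a:0, b:0, a:1, b:1
```

Without the sleep(0) call, each worker would finish all its steps before the other one started.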
Alternatively, use the sync API.
pygraphviz refuses to install
pygraphviz/graphviz_wrap.c:2711:10: fatal error: graphviz/cgraph.h: No such file or directory
2711 | #include "graphviz/cgraph.h"
| ^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1
Install the graphviz-devel or graphviz-dev package for your OS.