Skip to main content

Generate CPG for multiple languages for use with joern

Project description

CPG Generator

 ██████╗██████╗  ██████╗
██╔════╝██╔══██╗██╔════╝
██║     ██████╔╝██║  ███╗
██║     ██╔═══╝ ██║   ██║
╚██████╗██║     ╚██████╔╝
 ╚═════╝╚═╝      ╚═════╝

CPG Generator is a python cli tool to generate Code Property Graph for multiple languages. The generated CPG can be directly imported to Joern or uploaded to Qwiet.AI for analysis.

Installation

cpggen is available as a PyPI package or as a container image.

pip install cpggen

Bundled container image

docker pull ghcr.io/appthreat/cpggen
# podman pull ghcr.io/appthreat/cpggen

Or use the nightly to always get the latest joern and tools.

docker pull ghcr.io/appthreat/cpggen:nightly
# podman pull ghcr.io/appthreat/cpggen:nightly

Single executable binaries

Download the executable binary for your operating system from the releases page. These binary bundle the following:

  • cpggen with Python 3.10
  • cdxgen with Node.js 18
  • cdxgen binary plugins
curl -LO https://github.com/AppThreat/cpggen/releases/download/v0.8.1/cpggen-linux-amd64
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

On Windows,

curl -LO https://github.com/appthreat/cpggen/releases/download/v0.8.1/cpggen.exe
.\cpggen.exe --help

OCI Artifacts via ORAS cli

Use ORAS cli to download the cpggen binary with Python and Node.js preinstalled.

oras pull ghcr.io/appthreat/cpggen-bin:v1
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

Usage

To auto detect the language from the current directory and generate CPG.

cpggen

To specify input and output directory.

cpggen -i <src directory> -o <CPG directory or file name>

You can even pass a git url as source

cpggen -i https://github.com/HooliCorp/vulnerable-aws-koa-app -o /tmp/cpg

To specify language type.

cpggen -i <src directory> -o <CPG directory or file name> -l java

# Comma separated values are accepted for multiple languages
cpggen -i <src directory> -o <CPG directory or file name> -l java,js,python

Container based invocation

docker run --rm -it -v /tmp:/tmp -v $(pwd):/app:rw --cpus=4 --memory=16g -t ghcr.io/appthreat/cpggen cpggen -i <src directory> -o <CPG directory or file name>

Export graphs

By passing --export, cpggen can export the various graphs to many formats using joern-export

Example to export all graphs in dot format

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out

To export pdg in neo4jcsv format

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out --export-repr pdg --export-format neo4jcsv

Artifacts produced

Upon successful completion, cpggen would produce the following artifacts in the directory specified under out_dir

  • {name}-{lang}-cpg.bin.zip - Code Property Graph for the given language type
  • {name}-{lang}-cpg.bom.xml - SBoM in CycloneDX XML format
  • {name}-{lang}-cpg.bom.json - SBoM in CycloneDX json format
  • {name}-{lang}-cpg.manifest.json - A json file listing the generated artifacts and the invocation commands

Server mode

cpggen can run in server mode.

cpggen --server

You can invoke the endpoint /cpg to generate CPG.

curl "http://127.0.0.1:7072/cpg?src=/Volumes/Work/sandbox/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
curl "http://127.0.0.1:7072/cpg?url=https://github.com/HooliCorp/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"

Languages supported

Language Requires build
C No
C++ No
Java No (*)
Scala Yes
Jsp Yes
Jar/War No
JavaScript No
TypeScript No
Kotlin No (*)
Php No
Python No
C# / dotnet Yes
Go Yes

(*) - Precision could be improved with dependencies

Environment variables

Name Purpose
JOERN_HOME Joern installation directory
CPGGEN_HOST cpggen server host. Default 127.0.0.1
CPGGEN_PORT cpggen server port. Default 7072
CPGGEN_CONTAINER_CPU CPU units to use in container execution mode. Default computed
CPGGEN_CONTAINER_MEMORY Memory units to use in container execution mode. Default computed
CPGGEN_MEMORY Heap memory to use for frontends. Default computed
AT_DEBUG_MODE Set to debug to enable debug logging
CPG_EXPORT Set to true to export CPG graphs in dot format
CPG_EXPORT_REPR Graph to export. Default all
CPG_EXPORT_FORMAT Export format. Default dot
SHIFTLEFT_ACCESS_TOKEN Set to automatically submit the CPG for analysis by Qwiet AI

GitHub actions

Use the marketplace action to generate CPGs using GitHub actions. Optionally, the upload the generated CPGs as build artifacts use the below step.

- name: Upload cpg
  uses: actions/upload-artifact@v1.0.0
  with:
    name: cpg
    path: cpg_out

License

Apache-2.0

Developing / Contributing

git clone git@github.com:AppThreat/cpggen.git
cd cpggen

python -m pip install --upgrade pip
python -m pip install poetry
# Add poetry to the PATH environment variable
poetry install

poetry run cpggen -i <src directory>

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cpggen-0.8.1.tar.gz (19.6 kB view hashes)

Uploaded Source

Built Distribution

cpggen-0.8.1-py3-none-any.whl (19.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page