Generate CPGs for multiple languages for use with Joern

CPG Generator

 ██████╗██████╗  ██████╗
██╔════╝██╔══██╗██╔════╝
██║     ██████╔╝██║  ███╗
██║     ██╔═══╝ ██║   ██║
╚██████╗██║     ╚██████╔╝
 ╚═════╝╚═╝      ╚═════╝

CPG Generator is a Python CLI tool that generates Code Property Graphs (CPGs) for multiple languages. The generated CPG can be imported directly into Joern or uploaded to Qwiet.AI for analysis.

Installation

cpggen is available as a PyPI package or as a container image.

pip install cpggen

Bundled container image

docker pull ghcr.io/appthreat/cpggen
# podman pull ghcr.io/appthreat/cpggen

Or use the nightly image to always get the latest Joern and tools.

docker pull ghcr.io/appthreat/cpggen:nightly
# podman pull ghcr.io/appthreat/cpggen:nightly

Single executable binaries

Download the executable binary for your operating system from the releases page. These binaries bundle the following:

  • cpggen with Python 3.10
  • cdxgen with Node.js 18
  • cdxgen binary plugins

curl -LO https://github.com/AppThreat/cpggen/releases/download/v0.8.0/cpggen-linux-amd64
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

On Windows,

curl -LO https://github.com/appthreat/cpggen/releases/download/v0.8.0/cpggen.exe
.\cpggen.exe --help

OCI Artifacts via ORAS CLI

Use the ORAS CLI to download the cpggen binary with Python and Node.js preinstalled.

oras pull ghcr.io/appthreat/cpggen-bin:v1
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

Usage

To auto-detect the language from the current directory and generate a CPG:

cpggen

To specify the input and output directories:

cpggen -i <src directory> -o <CPG directory or file name>

You can also pass a git URL as the source:

cpggen -i https://github.com/HooliCorp/vulnerable-aws-koa-app -o /tmp/cpg

To specify the language type:

cpggen -i <src directory> -o <CPG directory or file name> -l java

# Comma separated values are accepted for multiple languages
cpggen -i <src directory> -o <CPG directory or file name> -l java,js,python
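
When cpggen is driven from a build script, the flags above can be assembled programmatically. The following is a minimal Python sketch, assuming cpggen is available on the PATH. The helper name build_cpggen_command is hypothetical and not part of cpggen; it only uses the -i, -o and -l flags shown above.

```python
import shlex


def build_cpggen_command(src, out, languages=None):
    """Assemble a cpggen argument list (hypothetical helper, not part of cpggen)."""
    cmd = ["cpggen", "-i", str(src), "-o", str(out)]
    if languages:
        # Multiple languages are passed as one comma separated value.
        cmd += ["-l", ",".join(languages)]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(shlex.join(build_cpggen_command(".", "/tmp/cpg", ["java", "js", "python"])))
```

To actually run the tool, the returned list could be handed to subprocess.run(cmd, check=True). Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.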

Container based invocation

docker run --rm -it -v /tmp:/tmp -v $(pwd):/app:rw --cpus=4 --memory=16g ghcr.io/appthreat/cpggen cpggen -i <src directory> -o <CPG directory or file name>

Export graphs

When --export is passed, cpggen exports the various graphs to one of several formats using joern-export.

Example: export all graphs in dot format.

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out

To export only the PDG in neo4jcsv format:

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out --export-repr pdg --export-format neo4jcsv
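
The two export invocations above differ only in the optional --export-repr and --export-format flags. As a rough sketch, the command line could be assembled in Python as follows. The function build_export_command is a hypothetical helper, and defaulting build to True is an assumption that simply mirrors the examples above.

```python
import shlex


def build_export_command(src, cpg_out, export_dir, repr_name=None, fmt=None, build=True):
    """Assemble a cpggen export command (hypothetical helper, not part of cpggen)."""
    cmd = ["cpggen", "-i", src, "-o", cpg_out]
    if build:
        # Assumption: mirrors the README examples, which pass --build.
        cmd.append("--build")
    cmd += ["--export", "--export-out-dir", export_dir]
    if repr_name:
        cmd += ["--export-repr", repr_name]   # e.g. "pdg"
    if fmt:
        cmd += ["--export-format", fmt]       # e.g. "neo4jcsv"
    return cmd


if __name__ == "__main__":
    cmd = build_export_command("crAPI", "crAPI/cpg_out", "crAPI/export_out", "pdg", "neo4jcsv")
    print(shlex.join(cmd))
```

Leaving repr_name and fmt unset omits both flags, which corresponds to the first example (all graphs, dot format).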

Artifacts produced

Upon successful completion, cpggen produces the following artifacts in the directory specified as out_dir:

  • {name}-{lang}-cpg.bin.zip - Code Property Graph for the given language type
  • {name}-{lang}-cpg.bom.xml - SBOM in CycloneDX XML format
  • {name}-{lang}-cpg.bom.json - SBOM in CycloneDX JSON format
  • {name}-{lang}-cpg.manifest.json - A JSON file listing the generated artifacts and the invocation commands
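
The naming pattern above makes it easy to check a run from a script. Below is a small Python sketch that derives the expected file names and reports which ones are missing. The helper names are hypothetical, and the sketch assumes {name} and {lang} are known to the caller; it does not read the manifest, whose schema is not described here.

```python
from pathlib import Path

# Suffixes taken from the artifact list above.
ARTIFACT_SUFFIXES = ("cpg.bin.zip", "cpg.bom.xml", "cpg.bom.json", "cpg.manifest.json")


def expected_artifacts(out_dir, name, lang):
    """Return the paths cpggen is expected to write for one name/lang pair."""
    base = Path(out_dir)
    return [base / f"{name}-{lang}-{suffix}" for suffix in ARTIFACT_SUFFIXES]


def missing_artifacts(out_dir, name, lang):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_artifacts(out_dir, name, lang) if not p.exists()]


if __name__ == "__main__":
    for path in missing_artifacts("/tmp/cpg_out", "vulnerable-aws-koa-app", "js"):
        print(f"missing: {path}")
```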

Server mode

cpggen can run in server mode.

cpggen --server

Invoke the /cpg endpoint to generate a CPG, passing either a local path (src) or a git URL (url) as shown below.

curl "http://127.0.0.1:7072/cpg?src=/Volumes/Work/sandbox/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
curl "http://127.0.0.1:7072/cpg?url=https://github.com/HooliCorp/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
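
The same requests can be built from Python. This is a minimal sketch that only constructs the request URL from the query parameters used in the curl examples (src, url, out_dir, lang). The function build_cpg_url is a hypothetical helper, and the response format of the endpoint is not documented here, so the sketch stops short of parsing it.

```python
from urllib.parse import urlencode


def build_cpg_url(out_dir, lang, src=None, url=None, host="127.0.0.1", port=7072):
    """Build a /cpg request URL (hypothetical helper, not part of cpggen)."""
    if (src is None) == (url is None):
        raise ValueError("pass exactly one of src (local path) or url (git URL)")
    params = {"src": src} if src is not None else {"url": url}
    params.update({"out_dir": out_dir, "lang": lang})
    # Keep "/" and ":" unescaped so the result matches the curl examples.
    return f"http://{host}:{port}/cpg?{urlencode(params, safe='/:')}"


if __name__ == "__main__":
    print(build_cpg_url("/tmp/cpg_out", "js",
                        url="https://github.com/HooliCorp/vulnerable-aws-koa-app"))
```

With a server running, the resulting URL could be fetched with urllib.request.urlopen or any HTTP client.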

Languages supported

Language       Requires build
C              No
C++            No
Java           No (*)
Scala          Yes
Jsp            Yes
Jar/War        No
JavaScript     No
TypeScript     No
Kotlin         No (*)
Php            No
Python         No
C# / dotnet    Yes
Go             Yes

(*) - Precision could be improved with dependencies

Environment variables

Name                       Purpose
JOERN_HOME                 Joern installation directory
CPGGEN_HOST                cpggen server host. Default 127.0.0.1
CPGGEN_PORT                cpggen server port. Default 7072
CPGGEN_CONTAINER_CPU       CPU units to use in container execution mode. Default computed
CPGGEN_CONTAINER_MEMORY    Memory units to use in container execution mode. Default computed
CPGGEN_MEMORY              Heap memory to use for frontends. Default computed
AT_DEBUG_MODE              Set to debug to enable debug logging
CPG_EXPORT                 Set to true to export CPG graphs in dot format
CPG_EXPORT_REPR            Graph to export. Default all
CPG_EXPORT_FORMAT          Export format. Default dot
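
A client script that talks to the server can honour the same variables. The following Python sketch resolves the server address from CPGGEN_HOST and CPGGEN_PORT, falling back to the documented defaults. The function server_address is a hypothetical client-side helper and is not how cpggen itself reads its configuration.

```python
import os


def server_address(env=None):
    """Resolve (host, port) from CPGGEN_HOST / CPGGEN_PORT with documented defaults."""
    env = os.environ if env is None else env
    host = env.get("CPGGEN_HOST", "127.0.0.1")
    port = int(env.get("CPGGEN_PORT", "7072"))
    return host, port


if __name__ == "__main__":
    host, port = server_address()
    print(f"cpggen server expected at http://{host}:{port}")
```

Accepting the environment as an argument keeps the helper easy to test without modifying the real process environment.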

GitHub Actions

Use the marketplace action to generate CPGs using GitHub Actions. Optionally, to upload the generated CPGs as build artifacts, use the step below.

- name: Upload cpg
  uses: actions/upload-artifact@v1.0.0
  with:
    name: cpg
    path: cpg_out

License

Apache-2.0

Developing / Contributing

git clone git@github.com:AppThreat/cpggen.git
cd cpggen

python -m pip install --upgrade pip
python -m pip install poetry
# Add poetry to the PATH environment variable
poetry install

poetry run cpggen -i <src directory>
