Generate CPGs for multiple languages for use with Joern

CPG Generator

 ██████╗██████╗  ██████╗
██╔════╝██╔══██╗██╔════╝
██║     ██████╔╝██║  ███╗
██║     ██╔═══╝ ██║   ██║
╚██████╗██║     ╚██████╔╝
 ╚═════╝╚═╝      ╚═════╝

CPG Generator is a Python CLI tool that generates Code Property Graphs (CPGs) for multiple languages. The generated CPG can be imported directly into Joern or uploaded to Qwiet.AI for analysis.

Installation

cpggen is available as a PyPI package or as a container image.

pip install cpggen

Bundled container image

docker pull ghcr.io/appthreat/cpggen
# podman pull ghcr.io/appthreat/cpggen

Or use the nightly image to always get the latest Joern and tools.

docker pull ghcr.io/appthreat/cpggen:nightly
# podman pull ghcr.io/appthreat/cpggen:nightly

Single executable binaries

Download the executable binary for your operating system from the releases page. These binaries bundle the following:

  • cpggen with Python 3.10
  • cdxgen with Node.js 18
  • cdxgen binary plugins

curl -LO https://github.com/AppThreat/cpggen/releases/download/v0.8.1/cpggen-linux-amd64
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

On Windows:

curl -LO https://github.com/appthreat/cpggen/releases/download/v0.8.1/cpggen.exe
.\cpggen.exe --help

OCI Artifacts via ORAS CLI

Use the ORAS CLI to download the cpggen binary with Python and Node.js preinstalled.

oras pull ghcr.io/appthreat/cpggen-bin:v1
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help

Usage

To auto-detect the language from the current directory and generate a CPG:

cpggen

To specify the input and output directories:

cpggen -i <src directory> -o <CPG directory or file name>

You can also pass a git URL as the source:

cpggen -i https://github.com/HooliCorp/vulnerable-aws-koa-app -o /tmp/cpg

To specify the language type:

cpggen -i <src directory> -o <CPG directory or file name> -l java

# Comma separated values are accepted for multiple languages
cpggen -i <src directory> -o <CPG directory or file name> -l java,js,python
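When cpggen is driven from a script, the flags above can be assembled programmatically. The following is a minimal sketch, not part of cpggen itself: the helper names build_cpggen_command and run_cpggen are hypothetical, and only the -i, -o and -l flags shown above are used. It assumes cpggen is on the PATH.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_cpggen_command(
    src: str,
    out: str,
    languages: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble a cpggen argument list (hypothetical helper, not a cpggen API)."""
    cmd = ["cpggen", "-i", src, "-o", out]
    if languages:
        # Multiple languages are passed as a single comma-separated value.
        cmd += ["-l", ",".join(languages)]
    return cmd


def run_cpggen(src: str, out: str, languages: Optional[Sequence[str]] = None) -> int:
    """Run cpggen and return its exit code. Assumes cpggen is installed."""
    cmd = build_cpggen_command(src, out, languages)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Print the command only, so the sketch can be tried without cpggen installed.
    print(shlex.join(build_cpggen_command(".", "/tmp/cpg", ["java", "js", "python"])))
```

Passing the arguments as a list avoids shell quoting problems when the source path contains spaces.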

Container based invocation

docker run --rm -it -v /tmp:/tmp -v "$(pwd)":/app:rw --cpus=4 --memory=16g ghcr.io/appthreat/cpggen cpggen -i <src directory> -o <CPG directory or file name>

Export graphs

When --export is passed, cpggen exports the graphs to one of several formats using joern-export.

Example: export all graphs in dot format

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out

Example: export the PDG in neo4jcsv format

cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/export_out --export-repr pdg --export-format neo4jcsv
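The two export examples differ only in the optional --export-repr and --export-format flags. A small sketch of that relationship, assuming a hypothetical helper named export_flags (it is not part of cpggen) and using only the flags shown above:

```python
from typing import List, Optional


def export_flags(
    export_out_dir: str,
    export_repr: Optional[str] = None,
    export_format: Optional[str] = None,
) -> List[str]:
    """Return the extra cpggen flags for exporting graphs (hypothetical helper)."""
    flags = ["--export", "--export-out-dir", export_out_dir]
    if export_repr:
        # Omitted: cpggen falls back to its default representation.
        flags += ["--export-repr", export_repr]
    if export_format:
        # Omitted: cpggen falls back to its default format.
        flags += ["--export-format", export_format]
    return flags


if __name__ == "__main__":
    base = ["cpggen", "-i", "src", "-o", "cpg_out", "--build"]
    print(" ".join(base + export_flags("export_out", "pdg", "neo4jcsv")))
```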

Artifacts produced

Upon successful completion, cpggen produces the following artifacts in the output directory (out_dir):

  • {name}-{lang}-cpg.bin.zip - Code Property Graph for the given language type
  • {name}-{lang}-cpg.bom.xml - SBOM in CycloneDX XML format
  • {name}-{lang}-cpg.bom.json - SBOM in CycloneDX JSON format
  • {name}-{lang}-cpg.manifest.json - A JSON file listing the generated artifacts and the invocation commands
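A post-run check can confirm that the files listed above exist. The sketch below is illustrative only: expected_artifacts and missing_artifacts are hypothetical helpers, and the value substituted for {name} is an assumption the caller must supply, since how cpggen derives it is not described here. The manifest's internal layout is not assumed either.

```python
from pathlib import Path
from typing import Dict, List

# File name suffixes taken from the artifact list above.
ARTIFACT_SUFFIXES = {
    "cpg": "cpg.bin.zip",
    "bom_xml": "cpg.bom.xml",
    "bom_json": "cpg.bom.json",
    "manifest": "cpg.manifest.json",
}


def expected_artifacts(out_dir: str, name: str, lang: str) -> Dict[str, Path]:
    """Map each artifact kind to the path it should have in out_dir."""
    base = Path(out_dir)
    return {
        kind: base / f"{name}-{lang}-{suffix}"
        for kind, suffix in ARTIFACT_SUFFIXES.items()
    }


def missing_artifacts(out_dir: str, name: str, lang: str) -> List[Path]:
    """Return the expected artifact paths that are not present on disk."""
    return [p for p in expected_artifacts(out_dir, name, lang).values() if not p.is_file()]


if __name__ == "__main__":
    # "app" stands in for {name}; replace it with the name cpggen actually used.
    for path in missing_artifacts("/tmp/cpg_out", "app", "js"):
        print("missing:", path)
```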

Server mode

cpggen can run in server mode.

cpggen --server

Invoke the /cpg endpoint to generate a CPG, passing either a local path (src) or a git URL (url).

curl "http://127.0.0.1:7072/cpg?src=/Volumes/Work/sandbox/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
curl "http://127.0.0.1:7072/cpg?url=https://github.com/HooliCorp/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
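The same requests can be issued from Python using only the standard library. This is a sketch under stated assumptions: cpg_request_url and request_cpg are hypothetical helper names, the query parameters (src, url, out_dir, lang) and the default host and port are the ones shown in this README, and the format of the server's response body is not assumed.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def cpg_request_url(
    out_dir: str,
    lang: str,
    src: Optional[str] = None,
    url: Optional[str] = None,
    host: str = "127.0.0.1",
    port: int = 7072,
) -> str:
    """Build a /cpg request URL from either a local path (src) or a git URL (url)."""
    if (src is None) == (url is None):
        raise ValueError("pass exactly one of src or url")
    params = {"src": src} if src is not None else {"url": url}
    params.update({"out_dir": out_dir, "lang": lang})
    # urlencode percent-encodes the values, which keeps a nested git URL unambiguous.
    return f"http://{host}:{port}/cpg?{urlencode(params)}"


def request_cpg(request_url: str, timeout: float = 600.0) -> bytes:
    """Send the request and return the raw response body (format not assumed)."""
    with urlopen(request_url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(cpg_request_url("/tmp/cpg_out", "js", src="/work/app"))
```

CPG generation may take a while on large projects, so the sketch uses a generous timeout; the value is an arbitrary choice.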

Languages supported

Language      Requires build
C             No
C++           No
Java          No (*)
Scala         Yes
Jsp           Yes
Jar/War       No
JavaScript    No
TypeScript    No
Kotlin        No (*)
Php           No
Python        No
C# / dotnet   Yes
Go            Yes

(*) - Precision could be improved with dependencies

Environment variables

Name                      Purpose
JOERN_HOME                Joern installation directory
CPGGEN_HOST               cpggen server host. Default 127.0.0.1
CPGGEN_PORT               cpggen server port. Default 7072
CPGGEN_CONTAINER_CPU      CPU units to use in container execution mode. Default computed
CPGGEN_CONTAINER_MEMORY   Memory units to use in container execution mode. Default computed
CPGGEN_MEMORY             Heap memory to use for frontends. Default computed
AT_DEBUG_MODE             Set to debug to enable debug logging
CPG_EXPORT                Set to true to export CPG graphs in dot format
CPG_EXPORT_REPR           Graph to export. Default all
CPG_EXPORT_FORMAT         Export format. Default dot
SHIFTLEFT_ACCESS_TOKEN    Set to automatically submit the CPG for analysis by Qwiet AI
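These variables can be set per invocation instead of globally. The following sketch prepares an environment mapping for a child process; cpggen_env is a hypothetical helper, and the variable names and values ("debug", "true") are taken from the table above.

```python
import os
from typing import Dict, Mapping, Optional


def cpggen_env(
    base: Optional[Mapping[str, str]] = None,
    *,
    joern_home: Optional[str] = None,
    debug: bool = False,
    export: bool = False,
    export_repr: Optional[str] = None,
    export_format: Optional[str] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with cpggen variables applied."""
    env = dict(os.environ if base is None else base)
    if joern_home:
        env["JOERN_HOME"] = joern_home
    if debug:
        env["AT_DEBUG_MODE"] = "debug"
    if export:
        env["CPG_EXPORT"] = "true"
        # Leave these unset to keep the documented defaults (all graphs, dot format).
        if export_repr:
            env["CPG_EXPORT_REPR"] = export_repr
        if export_format:
            env["CPG_EXPORT_FORMAT"] = export_format
    return env


if __name__ == "__main__":
    env = cpggen_env({}, debug=True, export=True, export_repr="pdg")
    for key in sorted(env):
        print(f"{key}={env[key]}")
```

The result can be passed as the env argument of subprocess.run when launching cpggen, which leaves the parent process environment untouched.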

GitHub actions

Use the marketplace action to generate CPGs using GitHub Actions. Optionally, to upload the generated CPGs as build artifacts, use the step below.

- name: Upload cpg
  uses: actions/upload-artifact@v4
  with:
    name: cpg
    path: cpg_out

License

Apache-2.0

Developing / Contributing

git clone git@github.com:AppThreat/cpggen.git
cd cpggen

python -m pip install --upgrade pip
python -m pip install poetry
# Add poetry to the PATH environment variable
poetry install

poetry run cpggen -i <src directory>
