Generate CPG for multiple languages for use with joern
CPG Generator
██████╗██████╗ ██████╗
██╔════╝██╔══██╗██╔════╝
██║ ██████╔╝██║ ███╗
██║ ██╔═══╝ ██║ ██║
╚██████╗██║ ╚██████╔╝
╚═════╝╚═╝ ╚═════╝
CPG Generator is a Python CLI tool that generates Code Property Graphs (CPG) for multiple languages. The generated CPG can be imported directly into Joern or uploaded to Qwiet.AI for analysis.
Pre-requisites
- JDK 11 or above
- Python 3.10
- Docker or podman (Windows, Linux or Mac) or
- Joern natively installed (Linux only)
Installation
cpggen is available as a single executable binary, PyPI package or as a container image.
Single executable binaries
Download the executable binary for your operating system from the releases page. These binaries bundle the following:
- Joern with all the CPG frontends
- cpggen with Python 3.10
- cdxgen with Node.js 18 - Generates SBoM
curl -LO https://github.com/AppThreat/cpggen/releases/download/v1.0.0/cpggen-linux-amd64
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help
On Windows,
curl -LO https://github.com/appthreat/cpggen/releases/download/v1.0.0/cpggen.exe
.\cpggen.exe --help
NOTE: On Windows, antivirus and antimalware could prevent this single executable from functioning properly. Depending on the system, administrative privileges might also be required. Use container-based execution as a fallback.
OCI Artifacts via ORAS cli
Use ORAS cli to download the cpggen binary.
oras pull ghcr.io/appthreat/cpggen-bin:v1
chmod +x cpggen-linux-amd64
./cpggen-linux-amd64 --help
On Windows
Invoke-WebRequest -Uri https://github.com/oras-project/oras/releases/download/v1.0.0/oras_1.0.0_windows_amd64.zip -UseBasicParsing -OutFile oras_1.0.0_windows_amd64.zip
Expand-Archive -Path oras_1.0.0_windows_amd64.zip -DestinationPath .
oras.exe pull ghcr.io/appthreat/cpggen-windows-bin:v1
PyPI package
This installs only the Python CLI tool, without any CPG language frontends. Joern must be installed separately for the CLI to work.
pip install cpggen
Bundled container image
docker pull ghcr.io/appthreat/cpggen
# podman pull ghcr.io/appthreat/cpggen
AlmaLinux 9 requires the CPU to support SSE4.2. For kvm64 VMs, use the AlmaLinux 8 version instead.
docker pull ghcr.io/appthreat/cpggen-alma8
# podman pull ghcr.io/appthreat/cpggen-alma8
Or use the nightly to always get the latest joern and tools.
docker pull ghcr.io/appthreat/cpggen:nightly
# podman pull ghcr.io/appthreat/cpggen:nightly
Usage
To auto detect the language from the current directory and generate CPG.
cpggen
To specify input and output directory.
cpggen -i <src directory> -o <CPG directory or file name>
You can also pass a git URL as the source.
cpggen -i https://github.com/HooliCorp/vulnerable-aws-koa-app -o /tmp/cpg
To specify language type.
cpggen -i <src directory> -o <CPG directory or file name> -l java
# Comma separated values are accepted for multiple languages
cpggen -i <src directory> -o <CPG directory or file name> -l java,js,python
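When cpggen is driven from a script, the flags shown above can be assembled programmatically. The following is a minimal sketch, not part of cpggen: `build_cpggen_args` is a hypothetical helper, and it relies only on the `-i`, `-o`, and `-l` flags documented in this section.

```python
from typing import List, Optional, Sequence


def build_cpggen_args(
    src: str,
    out: str,
    languages: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build an argv list for the cpggen CLI (hypothetical helper)."""
    args = ["cpggen", "-i", src, "-o", out]
    if languages:
        # Multiple languages are passed as a single comma-separated value.
        args += ["-l", ",".join(languages)]
    return args


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # even on a machine where cpggen is not installed.
    print(" ".join(build_cpggen_args("src", "/tmp/cpg", ["java", "js", "python"])))
```

The returned list can be handed to `subprocess.run(args, check=True)` once cpggen is on the PATH. When `languages` is omitted, the `-l` flag is left out and cpggen auto-detects the language as described above.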
Container based invocation
docker run --rm -it -v /tmp:/tmp -v $(pwd):/app:rw --cpus=4 --memory=16g -t ghcr.io/appthreat/cpggen cpggen -i <src directory> -o <CPG directory or file name>
Export graphs
By passing --export, cpggen can export the various graphs to many formats using joern-export.
Example to export cpg14 graphs in dot format:
cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/cpg_export
To export pdg in neo4jcsv format:
cpggen -i ~/work/sandbox/crAPI -o ~/work/sandbox/crAPI/cpg_out --build --export --export-out-dir ~/work/sandbox/crAPI/cpg_export --export-repr pdg --export-format neo4jcsv
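The export options in the two commands above differ only in the optional representation and format flags. As a rough sketch, assuming a hypothetical helper named `export_flags` (not provided by cpggen), the flag list could be composed like this:

```python
from typing import List, Optional


def export_flags(
    export_out_dir: str,
    repr_name: Optional[str] = None,
    fmt: Optional[str] = None,
) -> List[str]:
    """Return the export-related cpggen flags (hypothetical helper)."""
    flags = ["--export", "--export-out-dir", export_out_dir]
    if repr_name:
        # For example "pdg"; omitted to keep cpggen's default.
        flags += ["--export-repr", repr_name]
    if fmt:
        # For example "neo4jcsv"; omitted to keep cpggen's default.
        flags += ["--export-format", fmt]
    return flags
```

Leaving `repr_name` and `fmt` unset reproduces the first command's export flags; setting them to `"pdg"` and `"neo4jcsv"` reproduces the second. The `--build` flag used in both commands is separate from export and is not handled here.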
Artifacts produced
Upon successful completion, cpggen produces the following artifacts in the directory specified by out_dir:
- {name}-{lang}-cpg.bin.zip - Code Property Graph for the given language type
- {name}-{lang}-cpg.bom.xml - SBoM in CycloneDX XML format
- {name}-{lang}-cpg.bom.json - SBoM in CycloneDX json format
- {name}-{lang}-cpg.manifest.json - A json file listing the generated artifacts and the invocation commands
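A script that consumes these artifacts may want to compute the expected file names up front. The sketch below follows the `{name}-{lang}-cpg.*` pattern listed above; `expected_artifacts` is a hypothetical helper, and how cpggen derives `{name}` is not stated here, so the caller supplies it.

```python
from pathlib import Path
from typing import Dict


def expected_artifacts(out_dir: str, name: str, lang: str) -> Dict[str, Path]:
    """Map artifact kinds to their expected paths (hypothetical helper)."""
    base = Path(out_dir)
    prefix = f"{name}-{lang}-cpg"
    return {
        "cpg": base / f"{prefix}.bin.zip",
        "sbom_xml": base / f"{prefix}.bom.xml",
        "sbom_json": base / f"{prefix}.bom.json",
        "manifest": base / f"{prefix}.manifest.json",
    }


def missing_artifacts(out_dir: str, name: str, lang: str) -> list:
    """Return the kinds of artifacts that are not present on disk."""
    paths = expected_artifacts(out_dir, name, lang)
    return [kind for kind, path in paths.items() if not path.exists()]
```

Calling `missing_artifacts` after a run gives a quick check that all four files were written before passing them to a later step.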
Server mode
cpggen can run in server mode.
cpggen --server
You can invoke the endpoint /cpg to generate CPG.
curl "http://127.0.0.1:7072/cpg?src=/Volumes/Work/sandbox/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
curl "http://127.0.0.1:7072/cpg?url=https://github.com/HooliCorp/vulnerable-aws-koa-app&out_dir=/tmp/cpg_out&lang=js"
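The same requests can be prepared from Python. This is a minimal sketch using only the query parameters that appear in the curl examples (`src`, `url`, `out_dir`, `lang`); `cpg_request_url` is a hypothetical helper. Unlike the curl examples, it percent-encodes the parameter values, on the assumption that the server decodes query strings in the usual way.

```python
from typing import Optional
from urllib.parse import urlencode


def cpg_request_url(
    out_dir: str,
    lang: str,
    src: Optional[str] = None,
    url: Optional[str] = None,
    host: str = "127.0.0.1",
    port: int = 7072,
) -> str:
    """Build a request URL for the /cpg endpoint (hypothetical helper)."""
    # The examples pass either a local path (src) or a git URL (url).
    if (src is None) == (url is None):
        raise ValueError("pass exactly one of src or url")
    params = {"src": src} if src is not None else {"url": url}
    params.update({"out_dir": out_dir, "lang": lang})
    return f"http://{host}:{port}/cpg?{urlencode(params)}"
```

The resulting string can be fetched with `urllib.request.urlopen` or any HTTP client while `cpggen --server` is running.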
Languages supported
Language | Requires build |
---|---|
C | No |
C++ | No |
Java | No (*) |
Scala | Yes |
Jsp | Yes |
Jar/War | No |
JavaScript | No |
TypeScript | No |
Kotlin | No (*) |
Php | No |
Python | No |
C# / dotnet | Yes |
Go | Yes |
(*) - Precision could be improved with dependencies
Environment variables
Name | Purpose |
---|---|
JOERN_HOME | Joern installation directory |
CPGGEN_HOST | cpggen server host. Default 127.0.0.1 |
CPGGEN_PORT | cpggen server port. Default 7072 |
CPGGEN_CONTAINER_CPU | CPU units to use in container execution mode. Default computed |
CPGGEN_CONTAINER_MEMORY | Memory units to use in container execution mode. Default computed |
CPGGEN_MEMORY | Heap memory to use for frontends. Default computed |
AT_DEBUG_MODE | Set to debug to enable debug logging |
CPG_EXPORT | Set to true to export CPG graphs in dot format |
CPG_EXPORT_REPR | Graph to export. Default all |
CPG_EXPORT_FORMAT | Export format. Default dot |
SHIFTLEFT_ACCESS_TOKEN | Set to automatically submit the CPG for analysis by Qwiet AI |
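A client script can mirror the documented server defaults by reading the same variables. This sketch is not how cpggen itself reads its configuration; `server_address` is a hypothetical helper that only applies the defaults listed in the table for CPGGEN_HOST and CPGGEN_PORT.

```python
import os
from typing import Mapping, Tuple


def server_address(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Resolve the cpggen server host and port (hypothetical helper)."""
    # Fall back to the defaults documented in the table above.
    host = env.get("CPGGEN_HOST", "127.0.0.1")
    port = int(env.get("CPGGEN_PORT", "7072"))
    return host, port


if __name__ == "__main__":
    host, port = server_address()
    print(f"cpggen server expected at http://{host}:{port}")
```

Passing a plain dictionary instead of `os.environ` makes the function easy to test without touching the real environment.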
GitHub actions
Use the marketplace action to generate CPGs using GitHub Actions. Optionally, to upload the generated CPGs as build artifacts, use the step below.
- name: Upload cpg
  uses: actions/upload-artifact@v1.0.0
  with:
    name: cpg
    path: cpg_out
License
Apache-2.0
Developing / Contributing
git clone git@github.com:AppThreat/cpggen.git
cd cpggen
python -m pip install --upgrade pip
python -m pip install poetry
# Add poetry to the PATH environment variable
poetry install
poetry run cpggen -i <src directory>