Repository Scanner - Version Control System - Scanner
Repository Scanner Version Control System Scanner (RESC-VCS-SCANNER)
Note: This component is part of Repository Scanner - resc
About the component
The RESC-VCS-Scanner component uses the Gitleaks binary file to scan the source code for secrets.
Getting started
These instructions will help you to get a copy of the project up and running on your local machine for development and testing purposes.
Prerequisites
Run locally from source
Prerequisites:
- RabbitMQ and RESC web service must be up and running locally. If you have already deployed RESC through helm in Kubernetes, then rabbitmq and resc webservice are already running for you.
- Install Gitleaks v8.18.0 on your system.
- Download the rule config toml file to /tmp/temp_resc_rule.toml by running the command below from a Git Bash terminal.
curl https://raw.githubusercontent.com/zricethezav/gitleaks/master/config/gitleaks.toml > /tmp/temp_resc_rule.toml
- Send some repositories to the 'repositories' topic of the RabbitMQ server by following the README of the RESC-VCS-SCRAPER component.
Clone the repository, open a Git Bash terminal in the /components/resc-vcs-scanner folder, and run the commands below.
1. Create virtual environment:
cd components/resc-vcs-scanner
pip install virtualenv
virtualenv venv
source venv/Scripts/activate # Git Bash on Windows; on Linux or macOS use: source venv/bin/activate
2. Install resc_vcs_scanner package:
pip install -e .
3. Set below environment variables:
export RESC_RABBITMQ_SERVICE_HOST=127.0.0.1 # The hostname/IP address of the rabbitmq server
export RESC_RABBITMQ_SERVICE_PORT_AMQP=30902 # The amqp port of the rabbitmq server
export RABBITMQ_DEFAULT_VHOST=resc-rabbitmq # The virtual host name of the rabbitmq server
export RABBITMQ_USERNAME=queue_user # The username used to connect to the rabbitmq projects and repositories topics
export RABBITMQ_PASSWORD="" # The password used to connect to the rabbitmq projects and repositories topics; it is the value of the queues_password field in the /deployment/kubernetes/example-values.yaml file
export RABBITMQ_QUEUE=repositories # The name of the queue from which secret scanner will read repositories
export RESC_API_NO_AUTH_SERVICE_HOST=127.0.0.1 # The hostname/IP address where RESC web service is running
export RESC_API_NO_AUTH_SERVICE_PORT=30900 # The port number where RESC web service is running
export VCS_INSTANCES_FILE_PATH="" # The absolute path to vcs_instances_config.json file containing the vcs instances definitions
export GITHUB_PUBLIC_USERNAME="" # Your GitHub username
export GITHUB_PUBLIC_TOKEN="" # Your GitHub personal access token
export GITLEAKS_PATH="" # The absolute path to gitleaks binary executable
You need to replace the following values with your custom values: RABBITMQ_PASSWORD, VCS_INSTANCES_FILE_PATH, GITHUB_PUBLIC_USERNAME, GITHUB_PUBLIC_TOKEN and GITLEAKS_PATH.
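Before starting the worker, it can help to confirm that none of the variables from step 3 is unset or still empty. The following is a minimal sketch that uses only the variable names listed above; the helper name missing_env_vars is hypothetical and is not part of the resc_vcs_scanner package.

```python
import os

# Variable names copied from step 3 above.
REQUIRED_VARS = [
    "RESC_RABBITMQ_SERVICE_HOST",
    "RESC_RABBITMQ_SERVICE_PORT_AMQP",
    "RABBITMQ_DEFAULT_VHOST",
    "RABBITMQ_USERNAME",
    "RABBITMQ_PASSWORD",
    "RABBITMQ_QUEUE",
    "RESC_API_NO_AUTH_SERVICE_HOST",
    "RESC_API_NO_AUTH_SERVICE_PORT",
    "VCS_INSTANCES_FILE_PATH",
    "GITHUB_PUBLIC_USERNAME",
    "GITHUB_PUBLIC_TOKEN",
    "GITLEAKS_PATH",
]


def missing_env_vars(environ=None):
    """Return the required names that are unset or blank (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All required variables are set.")
```

Running the script in the same shell where the export commands were issued lists any variable that still needs a value.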
Structure of vcs instances config json
The vcs_instances_config.json file must have the following format.
Note: You can add multiple vcs instances.
Example:
{
  "vcs_instance_1": {
    "name": "GITHUB_PUBLIC",
    "scope": ["kubernetes"],
    "exceptions": [],
    "provider_type": "GITHUB_PUBLIC",
    "hostname": "github.com",
    "port": "443",
    "scheme": "https",
    "username": "GITHUB_PUBLIC_USERNAME",
    "token": "GITHUB_PUBLIC_TOKEN",
    "organization": ""
  }
}
- scope: List of GitHub accounts you want to scan. For example, let's say you want to scan all the repositories for the following GitHub accounts:
https://github.com/kubernetes
https://github.com/docker
Then you need to add those accounts to scope like: ["kubernetes", "docker"]. All the repositories from those accounts will be scanned.
- exceptions (optional): If you want to exclude any account from the scan, add it to exceptions. The default is an empty list.
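A quick structural check of the file can catch typos before the scanner reads it. The sketch below is not the validation performed by RESC itself. It assumes that every key shown in the example except exceptions is required, and that scope and exceptions are lists; the function name check_vcs_instances is hypothetical.

```python
import json

# Keys taken from the example above. Treating them all as required is an
# assumption; "exceptions" is left out because the text marks it optional.
EXPECTED_KEYS = {
    "name", "scope", "provider_type", "hostname", "port",
    "scheme", "username", "token", "organization",
}


def check_vcs_instances(config_text):
    """Return a list of problems found in a vcs_instances_config.json document."""
    problems = []
    instances = json.loads(config_text)
    for key, instance in instances.items():
        missing = sorted(EXPECTED_KEYS - set(instance))
        if missing:
            problems.append(f"{key}: missing keys {missing}")
        # scope and exceptions hold account names, so both should be lists.
        if not isinstance(instance.get("scope", []), list):
            problems.append(f"{key}: scope must be a list of account names")
        if not isinstance(instance.get("exceptions", []), list):
            problems.append(f"{key}: exceptions must be a list of account names")
    return problems
```

An empty result means the file has the same shape as the example; it says nothing about whether the credentials or hostnames are valid.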
The output messages of the collect_projects command have the following format:
{
  "project_key": "kubernetes",
  "vcs_instance_name": "GITHUB_PUBLIC"
}
4. Run the secret scan task:
This task reads the repositories from a RabbitMQ queue called 'repositories', runs a scan using Gitleaks, and saves the findings' metadata to the database.
This can be done via the following command:
celery -A vcs_scanner.secret_scanners.celery_worker worker --loglevel=INFO -E -Q repositories --concurrency=1 --prefetch-multiplier=1
Run locally using docker
Run the RESC VCS Scanner docker image locally by running the following commands:
- Pull the docker image from registry:
docker pull rescabnamro/resc-vcs-scanner:latest
- Alternatively, build the docker image locally by running:
docker build -t rescabnamro/resc-vcs-scanner:latest .
- Run the vcs-scanner by using below command:
docker run \
  -v <path to vcs_instances_config.json in your local system>:/tmp/vcs_instances_config.json \
  -e RESC_RABBITMQ_SERVICE_HOST="host.docker.internal" \
  -e RESC_RABBITMQ_SERVICE_PORT_AMQP=30902 \
  -e RABBITMQ_DEFAULT_VHOST=resc-rabbitmq \
  -e RABBITMQ_USERNAME=queue_user \
  -e RABBITMQ_PASSWORD="<the password of queue_user>" \
  -e RABBITMQ_QUEUE="repositories" \
  -e RESC_API_NO_AUTH_SERVICE_HOST="host.docker.internal" \
  -e RESC_API_NO_AUTH_SERVICE_PORT=30900 \
  -e VCS_INSTANCES_FILE_PATH="/tmp/vcs_instances_config.json" \
  -e GITHUB_PUBLIC_USERNAME="<your github username>" \
  -e GITHUB_PUBLIC_TOKEN="<your github personal access token>" \
  -e GITLEAKS_PATH="/vcs_scanner/gitleaks_config/seco-gitleaks-linux-amd64" \
  --name resc-vcs-scanner rescabnamro/resc-vcs-scanner:latest \
  celery -A vcs_scanner.secret_scanners.celery_worker worker --loglevel=INFO -E -Q repositories --concurrency=1 --prefetch-multiplier=1
To create the vcs_instances_config.json file, please refer to: Structure of vcs instances config json
Run locally as a CLI tool (Still in development)
It is also possible to run the component as a CLI tool to scan VCS repositories.
1. Create virtual environment:
cd components/resc-vcs-scanner
pip install virtualenv
virtualenv venv
source venv/bin/activate
2. Install resc_vcs_scanner package:
pip install -e .
3. Run CLI scanner:
The CLI has 3 modes of operation. Use the --help argument to see all the options for each mode:
- Scanning a non-git directory:
secret_scanner dir --help
secret_scanner dir --gitleaks-rules-path=<path to gitleaks toml rule> --gitleaks-path=<path to gitleaks binary> --ignored-blocker-path=<path to resc-ignore.dsv file> --dir=<directory to scan>
- Scanning an already cloned git repository:
secret_scanner repo local --help
secret_scanner repo local --gitleaks-rules-path=<path to gitleaks toml rule> --gitleaks-path=<path to gitleaks binary> --ignored-blocker-path=<path to resc-ignore.dsv file> --dir=<directory of repository to scan>
- Scanning a remote git repository:
secret_scanner repo remote --help
secret_scanner repo remote --gitleaks-rules-path=<path to gitleaks toml rule> --gitleaks-path=<path to gitleaks binary> --ignored-blocker-path=<path to resc-ignore.dsv file> --repo-url=<url of repository to scan>
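When the scanner is driven from a script, the three invocations above differ only in their sub-command and in the flag that names the target. The sketch below assembles the argument list for each mode using only the flags shown above. The function name build_scan_command and the example paths are hypothetical, and treating --ignored-blocker-path as optional is an assumption; the --help output remains the authority on which arguments are required.

```python
import shlex


def build_scan_command(mode, target, gitleaks_rules_path, gitleaks_path,
                       ignored_blocker_path=None):
    """Build the argv list for one of the three secret_scanner modes."""
    # Sub-commands and target flags exactly as they appear in the list above.
    modes = {
        "dir": (["dir"], "--dir"),
        "repo local": (["repo", "local"], "--dir"),
        "repo remote": (["repo", "remote"], "--repo-url"),
    }
    if mode not in modes:
        raise ValueError(f"unknown mode: {mode!r}")
    subcommand, target_flag = modes[mode]
    argv = [
        "secret_scanner",
        *subcommand,
        f"--gitleaks-rules-path={gitleaks_rules_path}",
        f"--gitleaks-path={gitleaks_path}",
    ]
    if ignored_blocker_path:  # assumption: the ignore file may be omitted
        argv.append(f"--ignored-blocker-path={ignored_blocker_path}")
    argv.append(f"{target_flag}={target}")
    return argv


# Example paths are placeholders, not locations used by RESC.
command = build_scan_command(
    "dir", "/tmp/project", "/tmp/temp_resc_rule.toml", "/usr/local/bin/gitleaks"
)
print(shlex.join(command))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.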
Most CLI arguments can also be provided by setting the corresponding environment variable. See the --help output for the arguments that can be provided this way and for the expected environment variable names, which are always prefixed with RESC_.
Example: the argument --gitleaks-path can be provided using the environment variable RESC_GITLEAKS_PATH.
Ignoring findings
It is possible to ignore some blocker findings (e.g. false positives) by providing a resc-ignore.dsv file. The blockers will be downgraded to warning level and marked as ignored. The file has the following structure:
# This is a comment
finding_path|finding_rule|finding_line_number|expiration_date
finding_path_2|finding_rule_2|finding_line_number_2
- finding_path contains the path to the file with the blocking finding.
- finding_rule contains the name of the blocking rule.
- finding_line_number contains the line number of the finding.
- expiration_date is optional, and contains the date in ISO 8601 format until which this ignore rule should be considered valid.
For example, to ignore the finding in file /etc/passwd for rule root_value_found on line 1 until April 1st 2024 at 23:59, the following line should be used:
/etc/passwd|root_value_found|1|2024-04-01T23:59:00
To ignore this finding ad vitam aeternam:
/etc/passwd|root_value_found|1
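The format described above is simple enough to parse by hand, which is useful for checking a resc-ignore.dsv file before handing it to the scanner. The following is a minimal sketch of such a parser, not the scanner's own implementation. The names IgnoreRule and parse_ignore_file are hypothetical, and three behaviours are assumptions: blank lines are skipped, a line with the wrong number of fields raises an error, and the expiration instant itself still counts as valid.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class IgnoreRule:
    path: str
    rule: str
    line_number: int
    expiration: Optional[datetime] = None

    def is_active(self, now: datetime) -> bool:
        """A rule without an expiration date never expires."""
        return self.expiration is None or now <= self.expiration


def parse_ignore_file(text: str) -> List[IgnoreRule]:
    """Parse pipe-separated ignore rules, skipping comments and blank lines."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        if len(fields) not in (3, 4):
            raise ValueError(f"expected 3 or 4 fields, got {len(fields)}: {line!r}")
        # The optional fourth field is an ISO 8601 timestamp.
        expiration = datetime.fromisoformat(fields[3]) if len(fields) == 4 else None
        rules.append(IgnoreRule(fields[0], fields[1], int(fields[2]), expiration))
    return rules
```

Calling is_active with the current time on each parsed rule shows which entries would have lapsed, so expired ignores can be reviewed or removed.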
Testing
Run the commands below to make sure that the unit tests pass and that the code meets the quality standards:
Note: To run these tests you need to install tox. This can be done on Linux and Windows with Git Bash.
pip install tox # install tox locally
tox -v -e sort # Run this command to validate the import sorting
tox -v -e lint # Run this command to lint the code according to this repository's standard
tox -v -e pytest # Run this command to run the unit tests
tox -v # Run this command to run all of the above tests
File details
Details for the file resc_vcs_scanner-3.6.1.tar.gz.
File metadata
- Download URL: resc_vcs_scanner-3.6.1.tar.gz
- Upload date:
- Size: 28.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | c9b1c774c0ab0ddc42bcfbc32c804d729b8bf0c9d1c1b851b776a5e9803549c6
MD5 | 2b4d63251f98690d0a20fdef10fab799
BLAKE2b-256 | afe9fa7675f249936ad304265166753d4f52890417973923675323a046cc382d
File details
Details for the file resc_vcs_scanner-3.6.1-py3-none-any.whl.
File metadata
- Download URL: resc_vcs_scanner-3.6.1-py3-none-any.whl
- Upload date:
- Size: 37.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | df0849d409a340becf23430b6470e3541809eadbbd3dfb28535d5b9e87e798ce
MD5 | 62385b0cc020103501af363f6fc94d75
BLAKE2b-256 | e97c7566c91d41335d7a45d812b50a4f487c33fc0d27342f79251867fb88fd97
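A downloaded file can be compared against the SHA256 digests listed above. The sketch below uses only Python's standard hashlib module; the function names sha256_of and matches_published_sha256 are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Return True if the file's digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, calling matches_published_sha256 on the saved resc_vcs_scanner-3.6.1.tar.gz with the SHA256 value from its table should return True for an intact download.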