Automatically detect software supply chain smells and issues

dirty-waters

Dirty-waters automatically finds software supply chain issues in software projects by analyzing the available metadata of all dependencies, transitively.

Reference: Dirty-Waters: Detecting Software Supply Chain Smells, Technical report 2410.16049, arXiv, 2024.

By using dirty-waters, you identify the shady areas of your supply chain, which are natural targets for attackers to exploit.

Kinds of problems identified by dirty-waters:

  • Dependencies with no link to source code repositories (high severity)
  • Dependencies with no tag / commit SHA for their release, which makes reproducible builds impossible (high severity)
  • Deprecated dependencies (medium severity)
  • Dependencies that are forks (medium severity)
  • Dependencies with no build attestation (low severity)
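
As an illustration of how the severities above could be used to triage a report, here is a minimal sketch. The smell keys, the `sort_by_severity` helper, and the `pkg-*` names are hypothetical and not part of dirty-waters; only the severity levels come from the list above.

```python
# Severity levels as listed above; the key names are illustrative assumptions.
SEVERITY = {
    "no_source_code_repository": "high",
    "no_release_tag": "high",
    "deprecated": "medium",
    "fork": "medium",
    "no_build_attestation": "low",
}

# Lower rank sorts first.
RANK = {"high": 0, "medium": 1, "low": 2}


def sort_by_severity(findings):
    """Order (dependency, smell) pairs from most to least severe.

    sorted() is stable, so findings of equal severity keep their input order.
    """
    return sorted(findings, key=lambda finding: RANK[SEVERITY[finding[1]]])


if __name__ == "__main__":
    # Placeholder dependency names, for illustration only.
    example = [
        ("pkg-a", "no_build_attestation"),
        ("pkg-b", "deprecated"),
        ("pkg-c", "no_release_tag"),
    ]
    for dependency, smell in sort_by_severity(example):
        print(f"{SEVERITY[smell]:<6} {dependency}: {smell}")
```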

Additionally, dirty-waters gives a supplier view of the dependency tree (who owns the different dependencies?).

dirty-waters is developed as part of the Chains research project.

Installation

To set up dirty-waters, follow these steps:

  1. Clone the repository:
git clone https://github.com/chains-project/dirty-waters.git
cd dirty-waters
  2. Set up a virtual environment and install dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd tool

As an alternative to a virtual environment, you may use the Nix flake present in this repository.

  3. Set up the GitHub API token (ideally, in a .env file):
export GITHUB_API_TOKEN=<your_token>
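
Before running the tool, it can help to confirm that the token is actually visible to Python. The following is a minimal sketch, assuming a simple KEY=VALUE `.env` file; `parse_env_file` and `get_github_token` are hypothetical helpers, not part of dirty-waters, and the tool itself may load the token differently.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines from a .env-style file.

    Blank lines and '#' comments are skipped, and a leading 'export ' is
    tolerated. Returns an empty dict if the file does not exist.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, separator, value = line.partition("=")
        if separator:
            # Drop surrounding quotes, if any.
            values[key.strip()] = value.strip().strip("'\"")
    return values


def get_github_token(env_file=".env"):
    """Return GITHUB_API_TOKEN from the environment, else from the .env file."""
    token = os.environ.get("GITHUB_API_TOKEN")
    if not token:
        token = parse_env_file(env_file).get("GITHUB_API_TOKEN")
    if not token:
        raise RuntimeError(
            "GITHUB_API_TOKEN is not set; export it or add it to a .env file"
        )
    return token
```

Calling `get_github_token()` from the directory that holds the `.env` file either returns the token or fails early with a clear message, instead of failing later on the first GitHub API request.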

Usage

Run the tool using the following command structure:

python main.py -p <project_repo_name> -v <release_version_old> -s -pm <package_manager> [-vn <release_version_new>] [-d]

Arguments:

usage: main.py [-h] -p PROJECT_REPO_NAME -v RELEASE_VERSION_OLD [-vn RELEASE_VERSION_NEW] -s [-d] [-n] -pm {yarn-classic,yarn-berry,pnpm,npm,maven} [--pnpm-scope]

options:
  -h, --help            show this help message and exit
  -p PROJECT_REPO_NAME, --project-repo-name PROJECT_REPO_NAME
                        Specify the project repository name. Example: MetaMask/metamask-extension
  -v RELEASE_VERSION_OLD, --release-version-old RELEASE_VERSION_OLD
                        The old release tag of the project repository. Example: v10.0.0
  -vn RELEASE_VERSION_NEW, --release-version-new RELEASE_VERSION_NEW
                        The new release version of the project repository.
  -s, --static-analysis
                        Run static analysis and generate a markdown report of the project
  -d, --differential-analysis
                        Run differential analysis and generate a markdown report of the project
  -n, --name-match      Compare the package names with the name in the package.json file. This option will slow down the execution time due to the API rate limit of
                        code search.
  -pm {yarn-classic,yarn-berry,pnpm,npm,maven}, --package-manager {yarn-classic,yarn-berry,pnpm,npm,maven}
                        The package manager used in the project.
  --pnpm-scope          Extract dependencies from pnpm with a specific scope using 'pnpm list --filter <scope> --depth Infinity' command. Configure the scope in tool_config.py
                        file.
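
To make the interface above concrete, here is a minimal `argparse` sketch that accepts the same flags as the help text. It is a reconstruction from the documented usage only, not the tool's actual source; `build_parser` is a hypothetical name and the real implementation may differ.

```python
import argparse

# Choices copied from the usage line above.
PACKAGE_MANAGERS = ["yarn-classic", "yarn-berry", "pnpm", "npm", "maven"]


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-p", "--project-repo-name", required=True,
                        help="e.g. MetaMask/metamask-extension")
    parser.add_argument("-v", "--release-version-old", required=True,
                        help="old release tag, e.g. v10.0.0")
    parser.add_argument("-vn", "--release-version-new",
                        help="new release version")
    # The usage line shows -s without brackets, so it is modeled as required.
    parser.add_argument("-s", "--static-analysis", action="store_true",
                        required=True)
    parser.add_argument("-d", "--differential-analysis", action="store_true")
    parser.add_argument("-n", "--name-match", action="store_true")
    parser.add_argument("-pm", "--package-manager", required=True,
                        choices=PACKAGE_MANAGERS)
    parser.add_argument("--pnpm-scope", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```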

Example usage:

  1. Static analysis:
python3 main.py -p MetaMask/metamask-extension -v v11.11.0 -s -pm yarn-berry
  2. Differential analysis:
python3 main.py -p MetaMask/metamask-extension -v v11.11.0 -vn v11.12.0 -s -d -pm yarn-berry

Notes:

  • -v should be the GitHub release tag exactly as it appears on GitHub, e.g. v11.11.0, not Version 11.11.0 or 11.11.0.
  • The -s flag is required for all analyses.
  • When using -d for differential analysis, both -v and -vn must be specified.
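
The rules in these notes can be checked before the tool is launched. Below is a minimal sketch of a wrapper that assembles the command line and enforces them; `build_command` is a hypothetical helper, not part of dirty-waters, and it assumes the command is run from the `tool` directory with `python3` on the PATH.

```python
def build_command(project, version_old, package_manager,
                  version_new=None, differential=False):
    """Assemble a main.py invocation that follows the notes above.

    -s is always included because it is required for all analyses.
    -d is only allowed when a new version (-vn) is also given.
    Whether version_old matches a real GitHub release tag is not checked here.
    """
    if differential and not version_new:
        raise ValueError("differential analysis (-d) needs both -v and -vn")
    command = [
        "python3", "main.py",
        "-p", project,
        "-v", version_old,
        "-s",
        "-pm", package_manager,
    ]
    if version_new:
        command += ["-vn", version_new]
    if differential:
        command.append("-d")
    return command


if __name__ == "__main__":
    # Prints the differential-analysis example from this README.
    print(" ".join(build_command(
        "MetaMask/metamask-extension", "v11.11.0", "yarn-berry",
        version_new="v11.12.0", differential=True,
    )))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues.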

Software Supply Chain Smell Support

dirty-waters currently supports package managers within the JavaScript and Java ecosystems. However, due to some constraints associated with the nature of the package managers, the tool may not be able to detect all the smells in the project. The following table shows the supported package managers and their associated smells:

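| Package Manager | No Source Code Repository | Invalid Source Code Repository URL | No Release Tag | Deprecated Dependency | Depends on a Fork | No Build Attestation |
|-----------------|---------------------------|------------------------------------|----------------|-----------------------|-------------------|----------------------|
| Yarn Classic    | Yes                       | Yes                                | Yes            | Yes                   | Yes               | Yes                  |
| Yarn Berry      | Yes                       | Yes                                | Yes            | Yes                   | Yes               | Yes                  |
| Pnpm            | Yes                       | Yes                                | Yes            | Yes                   | Yes               | Yes                  |
| Npm             | Yes                       | Yes                                | Yes            | Yes                   | Yes               | Yes                  |
| Maven           | Yes                       | Yes                                | Yes            | No                    | Yes               | No                   |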
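
The support table can also be expressed as a lookup, which is handy when interpreting a report for a given package manager. This is a minimal sketch that only encodes the table above; the `SUPPORT` structure and the `unsupported_smells` helper are hypothetical and not part of dirty-waters.

```python
SMELLS = [
    "No Source Code Repository",
    "Invalid Source Code Repository URL",
    "No Release Tag",
    "Deprecated Dependency",
    "Depends on a Fork",
    "No Build Attestation",
]

# Per the table, only Maven lacks support for some smells.
MAVEN_UNSUPPORTED = {"Deprecated Dependency", "No Build Attestation"}

# Keys use the -pm values accepted by the command line.
SUPPORT = {
    pm: {smell: True for smell in SMELLS}
    for pm in ("yarn-classic", "yarn-berry", "pnpm", "npm")
}
SUPPORT["maven"] = {smell: smell not in MAVEN_UNSUPPORTED for smell in SMELLS}


def unsupported_smells(package_manager):
    """Return the smells that cannot be detected for a package manager."""
    return [smell for smell in SMELLS if not SUPPORT[package_manager][smell]]


if __name__ == "__main__":
    for pm in SUPPORT:
        missing = unsupported_smells(pm)
        print(f"{pm}: {', '.join(missing) if missing else 'all smells supported'}")
```

For a Maven project, an absent "Deprecated Dependency" or "No Build Attestation" finding therefore means "not checked", not "no issue".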

Academic Work

Other issues not handled by dirty-waters

  • Missing dependencies: simply run mvn/pip/... install :)
  • Bloated dependencies: we recommend DepClean for Java, depcheck for npm
  • Version constraint inconsistencies: we recommend pipdeptree for Python

License

MIT License.
