
Choppr is a plugin that reduces the size of a software product's Software Bill of Materials (SBOM).

Choppr

A Hoppr plugin to filter unused components out of the delivered SBOM using strace results.

Choppr refines the components in a Software Bill of Materials (SBOM). It does not replace SBOM generation tools. Choppr analyzes a build or runtime to determine which components are actually used, then removes the unused components from the SBOM. It starts from file accesses and works backwards relative to a typical SBOM generation tool. For example, SBOM generators use the yum database to determine which packages yum installed, whereas Choppr looks at all the files accessed and queries sources like yum to determine the originating package.

Other intended results include:

  • Reducing installed components. This optimizes size and reduces the number of vulnerabilities; the fewer tools available to an attacker, the better.
  • Creating a runtime container from the build container
  • Detecting files without corresponding SBOM components

Configuration

manifest.yml

You must list the RPM repositories used on your system in the manifest.yml file, for example:

repositories:
  rpm:
    - url: http://mirrorlist.rockylinux.org/?arch=x86_64&repo=BaseOS-8
    - url: http://mirrorlist.rockylinux.org/?arch=x86_64&repo=AppStream-8
    - url: https://mirrors.rockylinux.org/powertools/rocky/8/
    - url: https://mirrors.rockylinux.org/extra/rocky/8/
    - url: https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

To obtain this list, use the following command:

# For RHEL 8 and later
dnf repolist --verbose

# For RHEL 7 and earlier
yum repolist --verbose

With the output from one of these commands, you should be able to find the URLs to the repositories used on your system.
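
As a rough illustration of that step, the sketch below scans repolist output for Repo-baseurl lines and prints them in the manifest.yml layout shown above. The field name and line shape are assumptions about the verbose output format, so check them against your own system. The helpers repo_urls and manifest_snippet are hypothetical and not part of Choppr.

```python
import re

# Assumed line shape: "Repo-baseurl : <url>[, <url>...] [(N more)]"
_BASEURL = re.compile(r"^Repo-baseurl\s*:\s*(\S+)")


def repo_urls(repolist_output: str) -> list[str]:
    """Collect the first base URL listed for each repository."""
    urls = []
    for line in repolist_output.splitlines():
        match = _BASEURL.match(line.strip())
        if match:
            # Drop a trailing comma left over from a multi-URL line.
            urls.append(match.group(1).rstrip(","))
    return urls


def manifest_snippet(urls: list[str]) -> str:
    """Render the URLs using the repositories/rpm layout from manifest.yml."""
    lines = ["repositories:", "  rpm:"]
    lines.extend(f"    - url: {url}" for url in urls)
    return "\n".join(lines)
```

Repositories that only report a mirror list or metalink would need to be handled separately.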

transfer.yml

You must add choppr as a plugin and configure it in the transfer.yml file, for example:

Filter:
  plugins:
    - name: choppr.plugin
      config:
        strace_results: strace_output.txt
        certificates:
          - url: my.privaterepo.com
            certificate: /certs/combined.pem
        strace_regex_excludes:
          - ^.*<project-name>.*$
          - ^.*\.(c|cpp|cxx|h|hpp|o|py|s)$
          - ^/usr/share/pkgconfig$
          - ^/tmp$
          - ^bin$
          - ^.*\.git.*$
          - ^.*(\.\.)+.*$
          - ^.*(CMakeFiles.*|\.cmake)$

Configuration Variables

mode

The operation mode to use for Choppr.

Options:

  • run - Standard operation mode to filter components of an SBOM
  • cache - Only create the cache and output the archive for it

Default: run

Type: OperatingMode

Example Usage:

mode: cache

Conditional Configuration Variables

strace_results

The path to the output file created when running strace on your build or runtime executable.

This must be provided when mode is set to run.

This file can be created using the following command to wrap your build script or runtime executable. The strace tool must be installed on your system separately from Choppr.

strace -f -e trace=file -o "strace_output.txt" <build script/runtime executable>

Type: Path | None

Default: None

Example Usage:

strace_results: strace_output.txt
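
To give a feel for what the strace output contains, the sketch below extracts the paths of successfully accessed files from lines in the usual strace text format. It is an illustration only and not Choppr's parser. The regular expression and the accessed_files helper are assumptions, and calls split across "unfinished" and "resumed" lines are simply skipped.

```python
import re

# Matches an optional PID, the syscall name, the first quoted argument,
# and the integer return value, for example:
#   1234 openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
_SYSCALL = re.compile(
    r'^(?:\d+\s+)?\w+\([^"]*"((?:[^"\\]|\\.)*)".*\)\s+=\s+(-?\d+)'
)


def accessed_files(strace_text: str) -> set[str]:
    """Return the first path argument of every call that did not fail."""
    paths = set()
    for line in strace_text.splitlines():
        match = _SYSCALL.match(line)
        # A negative return value (such as -1 ENOENT) marks a failed call.
        if match and int(match.group(2)) >= 0:
            paths.add(match.group(1))
    return paths
```

Failed lookups are dropped here because a file that could not be opened was not actually used by the build or runtime.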

Optional Configuration Variables

allow_partial_filename_match

Allow partial matching for filenames when comparing strace files to files provided by remote repository packages.

This may be useful when symlinks are used for libraries. This is currently only implemented for RPMs.

Type: bool

Default: false

Example Usage:

allow_partial_filename_match: true

allow_version_mismatch

Allow version numbers to be mismatched when comparing SBOM packages to remote repository packages.

Type: bool

Default: false

Example Usage:

allow_version_mismatch: true

allowlist

A dictionary with packages to always keep in the SBOM.

The keys are purl types, and the values are lists of packages. A package has two members, name and version, both of which are regex patterns.

Type:

allowlist: # dict[PurlType, list[PackagePattern]]
  _purl_type_: # str (deb, npm, rpm, ...)
    - name: regex
      version: regex
    ...
  ...

Default: {}

Example Usage:

allowlist:
  deb:
    - name: ".*"
      version: ".*"
  generic:
    - name: "^python$"
      version: "^3.10"

archive_cache

Enable archive_cache to archive the cache directory when Choppr finishes running in run mode.

This has no effect in cache mode, as the archive will always be created in that mode.

Type: bool

Default: false

Example Usage:

archive_cache: true

cache_dir

The path for the cache directory where Choppr will output temporary and downloaded files.

Type: Path

Default: ./.cache/choppr

Example Usage:

cache_dir: /tmp/choppr

cache_input

The path to a previously created cache archive for Choppr to use as input, such as the archive produced in cache mode.

Type: Path | None

Default: None

Example Usage:

cache_input: /backup/choppr-cache.tar.gz

cache_timeout

The timeout for local cache files that, unlike RPM packages, are not traced to a checksum (for example, DEB packages).

Expects a number followed by a unit (d = days, h = hours, m = minutes, s = seconds).

Type: str | bool

Default: 7d

Example Usage:

cache_timeout: 24h
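
A minimal sketch of how a value in this format could be converted to a duration follows. It handles only the string form described above, not the bool form, and parse_timeout is a hypothetical helper rather than Choppr's own code.

```python
import re
from datetime import timedelta

# Unit letters from the description above mapped to timedelta keywords.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_timeout(value: str) -> timedelta:
    """Convert a string such as '24h' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", value.strip())
    if match is None:
        raise ValueError(f"invalid timeout: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})
```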

certificates

A list of objects, each with a url key and a certificate key. The certificate is used to access the given URL when a self-signed certificate is required.

Type: list[dict[str, str]]

Default: []

Example Usage:

certificates:
  - url: my.privaterepo.com
    certificate: /certs/combined.pem
  - ...

clear_cache

Enable clear_cache to delete the cache directory when Choppr finishes running.

Type: bool

Default: false

Example Usage:

clear_cache: true

deb_repositories

A list of DEB repositories with the URL, distributions, and components to include.

Type: list[DebianRepository]

Default: []

When defining repositories, the components list defaults to ["main", "restricted", "universe", "multiverse"].

Example Usage:

deb_repositories:
  - url: http://archive.ubuntu.com/ubuntu/
    distributions:
      - name: jammy
      - name: jammy-updates
      - name: jammy-backports
  - url: http://apt.llvm.org/jammy/
    distributions:
      - name: llvm-toolchain-jammy-16
        components:
          - main
  - ...

delete_excluded

Disable delete_excluded to keep RPMs that are discovered to be unnecessary and marked as excluded.

Type: bool

Default: true

Example Usage:

delete_excluded: false

denylist

A dictionary with packages to always remove from the SBOM.

The keys are purl types, and the values are lists of packages. A package has two members, name and version, both of which are regex patterns.

Type:

denylist: # dict[PurlType, list[PackagePattern]]
  _purl_type_: # str (deb, npm, rpm, ...)
    - name: regex
      version: regex
    ...
  ...

Default: {}

Example Usage:

denylist:
  deb:
    - name: "cmake"
      version: "3.22"
  npm:
    - name: ".*"
      version: ".*"

http_limits

Limits to enforce when performing HTTP requests within Choppr.

  • retries - The number of times to retry the request if it fails
  • retry_interval - The number of seconds to wait before retrying the request
  • timeout - The number of seconds to wait for a request to complete before timing out

Type:

http_limits:  # HttpLimits
  retries: PositiveInt
  retry_interval: PositiveFloat
  timeout: PositiveFloat

Default:

http_limits:
  retries: 3
  retry_interval: 5
  timeout: 60

Example Usage:

http_limits:
  retries: 10
  retry_interval: 30
  timeout: 300
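
The sketch below illustrates how the retries and retry_interval settings could interact. It assumes that retries counts attempts made after the first failure, which follows the wording above but is not confirmed against Choppr's code. The with_retries helper is hypothetical.

```python
import time


def with_retries(request, retries=3, retry_interval=5.0, sleep=time.sleep):
    """Call request(); after a failure, wait and try again up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return request()
        except OSError:
            # Re-raise once the retry budget is used up.
            if attempt == retries:
                raise
            sleep(retry_interval)
```

The timeout setting would apply to each individual request, for example as the timeout argument of urllib.request.urlopen, whose URLError and timeout errors are OSError subclasses and would be retried here.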

keep_essential_os_components

Keep components that are essential to the operating system, including the operating system component itself.

Type: bool

Default: false

Example Usage:

keep_essential_os_components: true

output_files

Specify the paths for output files.

Type:

output_files:
  cache_archive: Path
  excluded_components: # dict[PurlType, ExcludedPackageFile]
    _purl_type_: # str (deb, npm, rpm, ...)
      file: Path
      component_format: str # optional
    ...

Defaults:

output_files:  # OutputFiles
  cache_archive: choppr-cache.tar.gz
  excluded_components:
    <purl_type>:
      file: choppr-excluded-components-<purl_type>.txt
      component_format: <excluded_component_format>
    ...

For excluded_component_format, the default value is {name}={version}, except for NPM and RPM. Those are as follows:

NPM: "{name}@{version}"
RPM: "{name}-{version}"

Example Usage:

output_files:
  cache_archive: output/choppr_cache.tar.gz
  excluded_components:
    generic:
      file: output/excluded_generic.csv
      component_format: "{name},{version}"
    npm:
      file: output/excluded_npm.txt
    rpm:
      file: output/excluded_rpm.txt
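
The component_format placeholders behave like Python format fields. The sketch below applies the defaults listed above and an optional override. It is an illustration, and format_excluded is a hypothetical helper, not Choppr's API.

```python
# Defaults taken from the description above.
DEFAULT_FORMATS = {"npm": "{name}@{version}", "rpm": "{name}-{version}"}
FALLBACK_FORMAT = "{name}={version}"


def format_excluded(purl_type, name, version, component_format=None):
    """Render one excluded component line for the given purl type."""
    template = component_format or DEFAULT_FORMATS.get(purl_type, FALLBACK_FORMAT)
    return template.format(name=name, version=version)
```

Passing component_format="{name},{version}" as in the generic example produces one CSV row per excluded component.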

recursion_limit

A positive integer that limits the number of recursive calls made when checking for nested package dependencies.

Type: PositiveInt

Default: 10

Example Usage:

recursion_limit: 20
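
As a generic illustration of a depth-limited dependency walk, the sketch below stops descending once the limit is reached. The dependency map, the collect_dependencies helper, and the exact way depth is counted are all assumptions; Choppr's own traversal may work differently.

```python
def collect_dependencies(package, dependencies, limit=10):
    """Gather nested dependencies of a package, descending at most `limit` levels."""
    found = set()

    def visit(name, depth):
        # Stop once the configured nesting depth is exceeded.
        if depth > limit:
            return
        for dep in dependencies.get(name, []):
            # Skipping known packages also guards against cycles.
            if dep not in found:
                found.add(dep)
                visit(dep, depth + 1)

    visit(package, 1)
    return found
```

With a chain a -> b -> c -> d, a limit of 2 collects only b and c, while the default of 10 collects all three.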

strace_regex_excludes

An array of regex strings, used to filter the strace input. The example below shows some of the recommended regular expressions.

Type: list[str]

Default: []

Example Usage:

strace_regex_excludes:
  - '^.*project-name.*$'              # Ignore all files containing the project name to exclude source files
  - '^.*\.(c|cpp|cxx|h|hpp|o|py|s)$'  # Ignore source, header, object, and script files
  - '^/usr/share/pkgconfig$'          # Ignore pkgconfig, which is included/modified by several RPMs
  - '^/tmp$'                          # Ignore the tmp directory
  - '^bin$'                           # Ignore overly simple files, that will be matched by most RPMs
  - '^.*\.git.*$'                     # Ignore all hidden git directories and files
  - '^.*(\.\.)+.*$'                   # Ignore all relative paths containing '..'
  - '^.*(CMakeFiles.*|\.cmake)$'      # Ignore all CMake files
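
To preview the effect of a set of patterns before running Choppr, a short sketch like the one below can be applied to a list of paths. It assumes a path is dropped when any pattern matches it, which is the natural reading of "excludes" but is not confirmed against Choppr's code. The filter_paths helper is hypothetical.

```python
import re

# A subset of the recommended patterns above.
excludes = [
    r"^.*\.(c|cpp|cxx|h|hpp|o|py|s)$",
    r"^/tmp$",
    r"^.*\.git.*$",
    r"^.*(\.\.)+.*$",
]


def filter_paths(paths, patterns):
    """Keep only the paths that match none of the exclude patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [p for p in paths if not any(c.search(p) for c in compiled)]
```

Note that ^/tmp$ excludes only the directory entry itself, so a file such as /tmp/build.log is still kept.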

Approaches

How to use Choppr depends on your project and needs. Consider the following use cases and their recommended approaches. Note that this section references the SBOM types defined by CISA.

Build SBOM of software product

The user provides the required content. Choppr determines which components were used during the build. The exclude list tells Choppr to remove components like CMake, because the user is certain no CMake software was built into their product. An uninstall script is generated. Building again after removing these components verifies that no required components were lost.

Create runtime image and Runtime SBOM from build image

Choppr uses a multistage build to ADD the files used. Optionally metadata such as the yum database can be kept. The additional include list can be used to specify dynamically linked libraries, necessary services, or any other necessary components that were not exercised during build. This will also be reflected in the SBOM components.

Create Runtime SBOM from runtime image

Similar to analyzing a build, Choppr can analyze a runtime. Note that if this is used to describe a delivery, it should be merged with the Build SBOM.

Specifications for developers
