Choppr is a plugin that reduces the size of a software product's Software Bill of Materials (SBOM).
Choppr
A Hoppr plugin to filter unused components out of the delivered SBOM using strace results.
Choppr refines the components in a Software Bill of Materials (SBOM). It does not replace SBOM generation tools. Its main job is to analyze a build or runtime, verify which components are actually used, and remove the unused components from the SBOM. Starting with file accesses, it works backwards from how an SBOM generation tool typically would. For example, SBOM generators use the yum database to determine which packages yum installed. Choppr instead looks at all the files accessed and queries sources like yum to determine the package each file came from.
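The reverse lookup described above can be illustrated with a minimal sketch. It is not Choppr's implementation: the `PACKAGE_FILES` index, the package names, and both helper functions are hypothetical, and a real run would query a package source such as yum rather than a hard-coded dictionary.

```python
# Hypothetical index of which files each package provides.
# In practice this information would come from a source such as yum.
PACKAGE_FILES = {
    "openssl-libs": {"/usr/lib64/libssl.so.1.1", "/usr/lib64/libcrypto.so.1.1"},
    "cmake": {"/usr/bin/cmake"},
    "zlib": {"/usr/lib64/libz.so.1"},
}


def used_packages(accessed_files, package_files):
    """Return the names of packages that provide at least one accessed file."""
    accessed = set(accessed_files)
    return {name for name, files in package_files.items() if files & accessed}


def split_sbom(sbom_components, accessed_files, package_files):
    """Partition SBOM component names into (kept, removed) lists."""
    used = used_packages(accessed_files, package_files)
    kept = [name for name in sbom_components if name in used]
    removed = [name for name in sbom_components if name not in used]
    return kept, removed


kept, removed = split_sbom(
    ["openssl-libs", "cmake", "zlib"],
    ["/usr/lib64/libssl.so.1.1", "/usr/lib64/libz.so.1"],
    PACKAGE_FILES,
)
print(kept)     # components with at least one accessed file
print(removed)  # components that were never touched
```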
Other intended results include:
- Reducing installed components. This optimizes size and reduces the number of vulnerabilities. The fewer tools available to an attacker, the better.
- Creating a runtime container from the build container
- Detecting files without corresponding SBOM components
Configuration
manifest.yml
You must list the RPM repositories used on your system in the manifest.yml file, for example:
repositories:
  rpm:
    - url: http://mirrorlist.rockylinux.org/?arch=x86_64&repo=BaseOS-8
    - url: http://mirrorlist.rockylinux.org/?arch=x86_64&repo=AppStream-8
    - url: https://mirrors.rockylinux.org/powertools/rocky/8/
    - url: https://mirrors.rockylinux.org/extra/rocky/8/
    - url: https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
To obtain this list, use the following command:
# For RHEL 8 and later
dnf repolist --verbose
# For RHEL 7 and earlier
yum repolist --verbose
With the output from one of these commands, you should be able to find the URLs to the repositories used on your system.
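As a rough aid for that step, the sketch below pulls URL-bearing fields out of saved `repolist --verbose` output. The field names (`Repo-baseurl`, `Repo-mirrors`, `Repo-metalink`) are assumed from typical dnf/yum output and may differ on your release; the sample text and the `repo_urls` helper are illustrative only.

```python
import re

# Illustrative sample only; real output contains more fields per repository.
SAMPLE = """\
Repo-id            : baseos
Repo-name          : Rocky Linux 8 - BaseOS
Repo-mirrors       : https://mirrors.rockylinux.org/mirrorlist?arch=x86_64&repo=BaseOS-8
Repo-baseurl       : https://example.org/rocky/8/BaseOS/x86_64/os/

Repo-id            : appstream
Repo-baseurl       : https://example.org/rocky/8/AppStream/x86_64/os/
"""

# Match only the fields that carry a URL and capture the first URL on the line.
URL_FIELD = re.compile(
    r"^Repo-(?:baseurl|mirrors|metalink)[ \t]*:[ \t]*(\S+)", re.MULTILINE
)


def repo_urls(repolist_output):
    """Return every URL found in the URL-bearing fields, in order."""
    return URL_FIELD.findall(repolist_output)


for url in repo_urls(SAMPLE):
    print(f"- url: {url}")
```

Review the printed entries by hand before copying them into manifest.yml, since a repository may list both a mirror list and a base URL.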
transfer.yml
You must add choppr as a plugin and configure it in the transfer.yml file, for example:
Filter:
  plugins:
    - name: choppr.plugin
      config:
        strace_results: strace-output.txt
        certificates:
          - url: my.privaterepo.com
            certificate: /certs/combined.pem
        strace_regex_excludes:
          - ^.*<project-name>.*$
          - ^.*\.(c|cpp|cxx|h|hpp|o|py|s)$
          - ^/usr/share/pkgconfig$
          - ^/tmp$
          - ^bin$
          - ^.*\.git.*$
          - ^.*(\.\.)+.*$
          - ^.*(CMakeFiles.*|\.cmake)$
Configuration Variables
mode
The operation mode to use for Choppr.
Options:
- run - Standard operation mode to filter components of an SBOM
- cache - Only create the cache and output the archive for it
Default: run
Type: OperatingMode
Example Usage:
mode: cache
Conditional Configuration Variables
strace_results
The path to the output file created when running strace on your build or runtime executable.
This must be provided when mode is set to run.
This file can be created using the following command to wrap your build script or runtime executable. The strace tool must be installed on your system separately from Choppr.
strace -f -e trace=file -o "strace_output.txt" <build script/runtime executable>
Type: Path | None
Default: None
Example Usage:
strace_results: strace_output.txt
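To give a feel for what the strace output contains, here is a minimal sketch that extracts the paths of successful file-related calls. It is not Choppr's parser: the regular expression and the `accessed_paths` helper are assumptions, and the sketch ignores details such as escaped quotes in paths and `openat` calls made relative to a directory descriptor.

```python
import re

# Matches lines such as:
#   1234  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
# Group 1 is the first quoted path, group 2 is the numeric return value.
CALL = re.compile(r'^(?:\d+\s+)?\w+\((?:AT_FDCWD, )?"([^"]+)".*\)\s+=\s+(-?\d+)')


def accessed_paths(strace_lines):
    """Return the set of paths from calls that did not fail."""
    paths = set()
    for line in strace_lines:
        match = CALL.match(line)
        # A negative return value (for example -1 ENOENT) means the call failed.
        if match and int(match.group(2)) >= 0:
            paths.add(match.group(1))
    return paths


SAMPLE_LINES = [
    '1234  execve("/usr/bin/make", ["make"], 0x7ffc /* 3 vars */) = 0',
    '1234  openat(AT_FDCWD, "/usr/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3',
    '1235  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
    "1235  +++ exited with 0 +++",
]
print(sorted(accessed_paths(SAMPLE_LINES)))
```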
Optional Configuration Variables
allow_partial_filename_match
Allow partial matching for filenames when comparing strace files to files provided by remote repository packages.
This may be useful when symlinks are used for libraries. This is currently only implemented for RPMs.
Type: bool
Default: false
Example Usage:
allow_partial_filename_match: true
allow_version_mismatch
Allow version numbers to be mismatched when comparing SBOM packages to remote repository packages.
Type: bool
Default: false
Example Usage:
allow_version_mismatch: true
allowlist
A dictionary with packages to always keep in the SBOM.
The keys are purl types, and the values are lists of packages. A package has two members, name and version, and both are regex patterns.
Type:
allowlist: # dict[PurlType, list[PackagePattern]]
  _purl_type_: # str (deb, npm, rpm, ...)
    - name: regex
      version: regex
    ...
  ...
Default: {}
Example Usage:
allowlist:
  deb:
    - name: ".*"
      version: ".*"
  generic:
    - name: "^python$"
      version: "^3.10"
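The sketch below shows one way such name and version patterns could be evaluated against a package. It assumes `re.search` semantics, which may not be what Choppr uses internally; the `is_listed` helper is hypothetical. The same idea applies to the denylist.

```python
import re

# Mirrors the example above as a Python dictionary.
ALLOWLIST = {
    "deb": [{"name": ".*", "version": ".*"}],
    "generic": [{"name": "^python$", "version": "^3.10"}],
}


def is_listed(purl_type, name, version, package_list):
    """Return True if any pattern for this purl type matches name and version."""
    for pattern in package_list.get(purl_type, []):
        # Assumption: both patterns must match, using re.search semantics.
        if re.search(pattern["name"], name) and re.search(pattern["version"], version):
            return True
    return False


print(is_listed("generic", "python", "3.10.12", ALLOWLIST))
print(is_listed("generic", "python3", "3.10.12", ALLOWLIST))
```

Because the patterns are regular expressions, an unanchored name such as `python` would also match `python3` under these semantics, which is why the example anchors it with `^` and `$`.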
archive_cache
Enable archive_cache to archive the cache directory when Choppr finishes running in run mode.
This has no effect in cache mode, as the archive will always be created in that mode.
Type: bool
Default: false
Example Usage:
archive_cache: true
cache_dir
The path for the cache directory where Choppr will output temporary and downloaded files.
Type: Path
Default: ./.cache/choppr
Example Usage:
cache_dir: /tmp/choppr
cache_input
The path to an existing cache archive, such as one created in cache mode or with archive_cache, for Choppr to use as input.
Type: Path | None
Default: None
Example Usage:
cache_input: /backup/choppr-cache.tar.gz
cache_timeout
The timeout for local cache files that cannot be traced to a checksum, such as DEB packages. RPM packages, by contrast, are traced to a checksum.
Expects a number followed by a unit (d = days, h = hours, m = minutes, s = seconds).
Type: str | bool
Default: 7d
Example Usage:
cache_timeout: 24h
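A value in this format can be converted to a duration as in the sketch below. The `parse_timeout` helper is hypothetical and only covers the string form; how Choppr treats a boolean value is not described here, so the sketch does not handle it.

```python
import re
from datetime import timedelta

# Unit suffixes from the description above, mapped to timedelta keyword names.
UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_timeout(value):
    """Convert a string such as '7d' or '24h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", value.strip())
    if match is None:
        raise ValueError(f"expected a number followed by d, h, m, or s: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{UNITS[unit]: int(amount)})


print(parse_timeout("7d"))
print(parse_timeout("24h").total_seconds())
```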
certificates
A list of objects, each with a url key and a certificate key. The certificate is used to access the given URL when a self-signed certificate is required.
Type: list[dict[str, str]]
Default: []
Example Usage:
certificates:
  - url: my.privaterepo.com
    certificate: /certs/combined.pem
  - ...
clear_cache
Enable clear_cache to delete the cache directory when Choppr finishes running.
Type: bool
Default: false
Example Usage:
clear_cache: true
deb_repositories
A list of DEB repositories with the URL, distributions, and components to include.
Type: list[DebianRepository]
Default: []
When defining repositories, the components list defaults to ["main", "restricted", "universe", "multiverse"].
Example Usage:
deb_repositories:
  - url: http://archive.ubuntu.com/ubuntu/
    distributions:
      - name: jammy
      - name: jammy-updates
      - name: jammy-backports
  - url: http://apt.llvm.org/jammy/
    distributions:
      - name: llvm-toolchain-jammy-16
        components:
          - main
  - ...
delete_excluded
Disable delete_excluded to keep RPMs that are discovered to be unnecessary and marked as excluded.
Type: bool
Default: true
Example Usage:
delete_excluded: false
denylist
A dictionary with packages to always remove from the SBOM.
The keys are purl types, and the values are lists of packages. A package has two members, name and version, and both are regex patterns.
Type:
denylist: # dict[PurlType, list[PackagePattern]]
  _purl_type_: # str (deb, npm, rpm, ...)
    - name: regex
      version: regex
    ...
  ...
Default: {}
Example Usage:
denylist:
  deb:
    - name: "cmake"
      version: "3.22"
  npm:
    - name: ".*"
      version: ".*"
http_limits
Limits to enforce when performing HTTP requests within Choppr.
- retries - The number of times to retry the request if it fails
- retry_interval - The number of seconds to wait before retrying the request
- timeout - The number of seconds to wait for a request to complete before timing out
Type:
http_limits: # HttpLimits
  retries: PositiveInt
  retry_interval: PositiveFloat
  timeout: PositiveFloat
Default:
http_limits:
  retries: 3
  retry_interval: 5
  timeout: 60
Example Usage:
http_limits:
  retries: 10
  retry_interval: 30
  timeout: 300
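How these three limits typically interact can be sketched as a small retry loop. This is an illustration, not Choppr's HTTP code: the `call_with_retries` helper is hypothetical, and it assumes `retries` counts additional attempts after the first one. The `timeout` value would be passed to the request itself, for example `urllib.request.urlopen(url, timeout=60)`.

```python
import time


def call_with_retries(request, retries=3, retry_interval=5.0, sleep=time.sleep):
    """Call request() and retry up to `retries` more times if it raises OSError."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return request()
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            last_error = error
            if attempt < retries:
                # Wait before the next attempt, but not after the final one.
                sleep(retry_interval)
    raise last_error


# Demonstration with a stand-in request that fails twice before succeeding.
attempts = []


def flaky_request():
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError("temporary failure")
    return "ok"


print(call_with_retries(flaky_request, retries=3, retry_interval=0.01))
```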
keep_essential_os_components
Keep components that are essential to the operating system, including the operating system component itself.
Type: bool
Default: false
Example Usage:
keep_essential_os_components: true
output_files
Specify the paths for output files.
Type:
output_files:
  cache_archive: Path
  excluded_components: # dict[PurlType, ExcludedPackageFile]
    _purl_type_: # str (deb, npm, rpm, ...)
      file: Path
      component_format: str # optional
    ...
Defaults:
output_files: # OutputFiles
  cache_archive: choppr-cache.tar.gz
  excluded_components:
    <purl_type>:
      file: choppr-excluded-components-<purl_type>.txt
      component_format: <excluded_component_format>
    ...
For excluded_component_format, the default value is {name}={version}, except for NPM and RPM, which use the following:
NPM: "{name}@{version}"
RPM: "{name}-{version}"
Example Usage:
output_files:
  cache_archive: output/choppr_cache.tar.gz
  excluded_components:
    generic:
      file: output/excluded_generic.csv
      component_format: "{name},{version}"
    npm:
      file: output/excluded_npm.txt
    rpm:
      file: output/excluded_rpm.txt
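The component_format placeholders behave like Python format fields. The sketch below shows how the documented defaults and a custom format would render; the `format_component` helper and the sample package names are hypothetical and not part of Choppr's API.

```python
# Defaults as documented above: NPM and RPM have their own formats.
DEFAULT_FORMATS = {"npm": "{name}@{version}", "rpm": "{name}-{version}"}
FALLBACK_FORMAT = "{name}={version}"


def format_component(purl_type, name, version, overrides=None):
    """Render one excluded component using an override or the default format."""
    template = (overrides or {}).get(purl_type)
    if template is None:
        template = DEFAULT_FORMATS.get(purl_type, FALLBACK_FORMAT)
    return template.format(name=name, version=version)


print(format_component("deb", "cmake", "3.22"))
print(format_component("npm", "lodash", "4.17.21"))
print(format_component("generic", "python", "3.10", {"generic": "{name},{version}"}))
```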
recursion_limit
A positive integer that limits the number of recursive calls made when checking for nested package dependencies.
Type: PositiveInt
Default: 10
Example Usage:
recursion_limit: 20
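The effect of such a limit can be illustrated with a depth-bounded dependency walk. This sketch interprets the limit as a maximum nesting depth, which is an assumption; the `collect_dependencies` helper and the dependency map are hypothetical.

```python
def collect_dependencies(package, dependency_map, recursion_limit=10, _depth=0, _seen=None):
    """Collect nested dependencies of a package, stopping at recursion_limit levels."""
    seen = set() if _seen is None else _seen
    if _depth >= recursion_limit:
        return seen
    for dependency in dependency_map.get(package, []):
        if dependency not in seen:
            seen.add(dependency)
            # Descend one level deeper for each newly discovered dependency.
            collect_dependencies(dependency, dependency_map, recursion_limit, _depth + 1, seen)
    return seen


# Hypothetical chain: a -> b -> c -> d
DEPENDENCIES = {"a": ["b"], "b": ["c"], "c": ["d"]}
print(sorted(collect_dependencies("a", DEPENDENCIES)))
print(sorted(collect_dependencies("a", DEPENDENCIES, recursion_limit=2)))
```

With a lower limit, deeper dependencies are simply not visited, so a limit that is too small may miss packages that are needed indirectly.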
strace_regex_excludes
An array of regex strings, used to filter the strace input. The example below shows some of the recommended regular expressions.
Type: list[str]
Default: []
Example Usage:
strace_regex_excludes:
  - '^.*project-name.*$' # Ignore all files containing the project name to exclude source files
  - '^.*\.(c|cpp|cxx|h|hpp|o|py|s)$' # Ignore source, header, object, and script files
  - '^/usr/share/pkgconfig$' # Ignore pkgconfig, which is included/modified by several RPMs
  - '^/tmp$' # Ignore the tmp directory
  - '^bin$' # Ignore overly simple files, that will be matched by most RPMs
  - '^.*\.git.*$' # Ignore all hidden git directories and files
  - '^.*(\.\.)+.*$' # Ignore all relative paths containing '..'
  - '^.*(CMakeFiles.*|\.cmake)$' # Ignore all CMake files
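To check what a set of exclude patterns would drop before running Choppr, a quick sketch like the following can help. It assumes `re.search` semantics (the patterns above are anchored, so the choice makes little difference here); the `filter_paths` helper and the sample paths are hypothetical.

```python
import re

# A subset of the recommended patterns, written as raw strings.
EXCLUDES = [
    r"^.*\.(c|cpp|cxx|h|hpp|o|py|s)$",
    r"^/tmp$",
    r"^.*\.git.*$",
    r"^.*(\.\.)+.*$",
]


def filter_paths(paths, patterns):
    """Return the paths that match none of the exclude patterns."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [path for path in paths if not any(rx.search(path) for rx in compiled)]


PATHS = [
    "/usr/lib64/libz.so.1",
    "/src/app/main.c",
    "/tmp",
    "/tmp/build.log",
    "/src/app/.git/config",
    "/usr/lib/../lib64/libm.so.6",
]
print(filter_paths(PATHS, EXCLUDES))
```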
Approaches
How to use Choppr depends on your project and needs. Consider the following use cases and their recommended approaches. Note that this section references the CISA-defined SBOM types.
Build SBOM of software product
The user provides the required content. Choppr determines which components were used during the build. The exclude list tells Choppr to remove components like CMake, because the user is certain no CMake software was built into their product. An uninstall script is generated. Building again after removing these components verifies that no required components were lost.
Create runtime image and Runtime SBOM from build image
Choppr uses a multistage build to ADD the files used. Optionally, metadata such as the yum database can be kept. The additional include list can be used to specify dynamically linked libraries, necessary services, or any other necessary components that were not exercised during the build. This will also be reflected in the SBOM components.
Create Runtime SBOM from runtime image
Similar to analyzing a build, Choppr can analyze a runtime. Note that if this is used to describe a delivery, it should be merged with the Build SBOM.
Specifications for developers