A tool to generate a custom Semgrep ruleset from multiple sources
Semgrep-search
Have you ever wanted to search the semgrep registry for rules and then test your codebase against all of the rules in the search results?
semgrep-search lets you filter rules by language, category and severity, and outputs a single YAML file that you can use with semgrep.
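The filtering step can be pictured roughly as follows. This is only a minimal sketch, not the tool's actual implementation: `SAMPLE_RULES`, `filter_rules` and `to_ruleset` are hypothetical names, the sample rules are stripped down to the fields used for filtering (real rules also need a message and a pattern), and the real tool reads its rules from the database rather than from an in-memory list.

```python
import json

# Hypothetical, abbreviated sample data. Only the fields used for filtering
# are present, so these are not complete Semgrep rules.
SAMPLE_RULES = [
    {"id": "csharp-sqli", "languages": ["csharp"], "severity": "ERROR",
     "metadata": {"category": "security"}},
    {"id": "py-unused", "languages": ["python"], "severity": "INFO",
     "metadata": {"category": "maintainability"}},
    {"id": "csharp-style", "languages": ["csharp"], "severity": "WARNING",
     "metadata": {"category": "best-practice"}},
]


def filter_rules(rules, languages=None, categories=None, severities=None):
    """Return the rules that match every filter that was given (None = no filter)."""
    selected = []
    for rule in rules:
        if languages and not set(rule.get("languages", [])) & set(languages):
            continue
        if categories and rule.get("metadata", {}).get("category") not in categories:
            continue
        if severities and rule.get("severity") not in severities:
            continue
        selected.append(rule)
    return selected


def to_ruleset(rules):
    """Wrap the rules in the top-level `rules` key used by Semgrep rule files."""
    return {"rules": rules}


ruleset = to_ruleset(
    filter_rules(SAMPLE_RULES, languages=["csharp"], categories=["security"])
)
# YAML 1.2 is designed as a superset of JSON, so JSON output avoids a YAML dependency here.
print(json.dumps(ruleset, indent=2))
```

Combining the filters with a logical AND is an assumption of this sketch; the text above does not spell out how multiple filters interact.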
The database powering semgrep-search is rebuilt automatically and continuously, and is published to ghcr.io through
oras via this project.
Since version v1.1.2 there is also a web UI to inspect rules, create custom collections and generate run configurations: https://semgrep.cybaer.ninja/
Installation
The easiest way to install semgrep-search is with pip or pipx.
For example, with pip, run pip install semgrep-search
Usage
An alias for semgrep-search is automatically installed as sgs.
Running semgrep
Since version v1.1.2, semgrep-search can run semgrep directly in a single step.
semgrep-search run takes the same arguments as search:
    usage: semgrep-search run [-h] [--language LANGUAGE]
                              [--category {best-practice,correctness,maintainability,performance,portability,security}]
                              [--severity {ERROR,INFO,WARNING}] [--origin ORIGIN] [--include-empty]
                              [-R [RULES]] [-C [CONFIG]] [--binary BINARY] [--keep-rules-file]
                              [--update] [-v] [--database DATABASE] [--text | --no-text] [--json]
                              [--sarif] [--all] [--output OUTPUT] [--force]
                              [TARGET]

    positional arguments:
      TARGET                The target to run semgrep against (if none is provided uses working directory

    options:
      -h, --help            show this help message and exit
      --language LANGUAGE, -l LANGUAGE
                            The language(s) to filter for. Separate multiple languages with comma or providing this argument multiple times
      --category {best-practice,correctness,maintainability,performance,portability,security}, -c {best-practice,correctness,maintainability,performance,portability,security}
                            The category(/ies) to filter for. Specify multiple categories by providing this argument multiple times
      --severity {ERROR,INFO,WARNING}, -s {ERROR,INFO,WARNING}
                            The severity(/ies) to filter for. Specify multiple severities by providing this argument multiple times
      --origin ORIGIN, -o ORIGIN
                            The origin(s) to select rules from. Specify multiple origins by providing this argument multiple times or by separating them by comma
      --include-empty, -e   Include rules that do not specify a selected filter at all
      -R [RULES], --rules [RULES]
                            Pre-generated set of rules to run semgrep with
      -C [CONFIG], --config [CONFIG]
                            The run configuration string
      --binary BINARY, -b BINARY
                            Specify the path to the semgrep binary (defaults to searching for "semgrep" in PATH)
      --keep-rules-file     If set, the temporary file containing the rules will not be deleted
      --update, -u          Force an update of the database
      -v, --verbose         Enable verbose logging
      --database DATABASE   Use a different location for the database
      --text, --no-text     Output a text file
      --json                Output a JSON file
      --sarif               Output a Sarif file
      --all                 Output all available file formats
      --output OUTPUT, -O OUTPUT
                            Output base filename (use - for stdout)
      --force, -f           If set, existing output file(s) will be overwritten
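When semgrep-search run is driven from a script, it can help to assemble the command line programmatically. The sketch below only uses flags listed in the help text above; `build_run_command` is a hypothetical helper, not part of semgrep-search, and the `src` target and `report` output name are made-up example values.

```python
import shlex


def build_run_command(target=".", languages=(), categories=(), severities=(),
                      sarif=False, output=None, force=False):
    """Assemble an argv list for `semgrep-search run` from documented flags."""
    argv = ["semgrep-search", "run"]
    for language in languages:
        argv += ["--language", language]
    for category in categories:
        argv += ["--category", category]
    for severity in severities:
        argv += ["--severity", severity]
    if sarif:
        argv.append("--sarif")
    if output:
        argv += ["--output", output]
    if force:
        argv.append("--force")
    argv.append(target)  # positional TARGET goes last
    return argv


cmd = build_run_command("src", languages=["csharp"], categories=["security"],
                        sarif=True, output="report")
# Print a shell-safe rendering; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
print(shlex.join(cmd))
```

Building a list rather than a single string keeps arguments with spaces intact and avoids shell quoting problems.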
Inspecting the database
To view details about the database, run sgs inspect.
Creating rulesets
To search for all rules that test csharp code and are categorized as security-relevant, run:
sgs search -l csharp -c security
By default, semgrep-search creates a file named rules.yaml in your current working directory.
Use -O to specify a different path instead.
If the provided filename is -, semgrep-search writes to STDOUT.
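The output handling described here (default file, custom path, or - for STDOUT) follows a common CLI pattern that can be sketched as below. `open_output` is a hypothetical helper rather than semgrep-search code, and raising an error when the file exists is an assumption made for illustration; the help text only says that --force overwrites existing output files.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def open_output(filename="rules.yaml", force=False):
    """Yield a writable text stream: STDOUT for "-", otherwise a file."""
    if filename == "-":
        # Do not close sys.stdout when the caller is done with it.
        yield sys.stdout
        return
    if os.path.exists(filename) and not force:
        # Assumed behavior: refuse to overwrite unless force is set.
        raise FileExistsError(f"{filename} exists; pass force=True to overwrite")
    with open(filename, "w", encoding="utf-8") as handle:
        yield handle


# Writing to "-" sends the ruleset to STDOUT, which is handy for piping.
with open_output("-") as out:
    out.write("rules: []\n")
```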
Updating rules
If semgrep-search does not find the database locally, it downloads the database automatically when the tool runs.
However, new rules are added to the registry from time to time.
To update the rules, run semgrep-search with --update (shorthand -u),
and the current state of the registry will be downloaded before searching for any rules.
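The download decision described above boils down to "fetch when the database is missing or when an update is forced". A minimal sketch of that logic follows; `should_download`, `ensure_database` and the injected `download` callable are hypothetical names, and the real tool pulls its database from ghcr.io rather than writing a placeholder file.

```python
import tempfile
from pathlib import Path


def should_download(database: Path, update: bool = False) -> bool:
    """Download when the local database is missing or an update is forced."""
    return update or not database.exists()


def ensure_database(database: Path, update: bool = False, download=None) -> bool:
    """Call `download(database)` if needed; return True when a download happened."""
    if should_download(database, update):
        download(database)
        return True
    return False


def fake_download(path: Path) -> None:
    # Stand-in for the real registry download (assumption for this demo).
    path.write_text("{}", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "database.json"
    print(ensure_database(db, download=fake_download))               # missing: downloads
    print(ensure_database(db, download=fake_download))               # present: skipped
    print(ensure_database(db, update=True, download=fake_download))  # forced: downloads
```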
Known issues
The tool found more rules than the website
It appears that semgrep.dev renamed cs (C# in the YAML files) to csharp.
However, some old rules still seem to exist as duplicates prefixed with cs, and semgrep's web search filters these out.
I'm not quite sure why the JSON export still contains them or which other languages have been renamed in the past.
During database generation, languages are normalized according to the table of languages in the Semgrep documentation.
The registry shows more rules when filtering for a language
At least one language (C#) appears to be used under two different names.
Therefore, semgrep-search contains a list of the programming language aliases that the semgrep registry allows.
If a rule seems to be missing, please check the language specified in the rule, or open a ticket with details about the missing rule.
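Alias normalization of this kind can be sketched as a small lookup table applied to each rule's language list. The table below is an assumption that contains only the C# names mentioned in the text above; the real list of aliases lives in semgrep-search and the Semgrep documentation, and `normalize_language` and `normalize_rule_languages` are hypothetical helper names.

```python
# Assumed alias table: only the C# names mentioned in the text are included.
LANGUAGE_ALIASES = {
    "cs": "csharp",
    "c#": "csharp",
}


def normalize_language(name: str) -> str:
    """Map a language name or alias to its canonical lowercase form."""
    key = name.strip().lower()
    return LANGUAGE_ALIASES.get(key, key)


def normalize_rule_languages(rule: dict) -> dict:
    """Return a copy of the rule with canonical, de-duplicated languages."""
    canonical = []
    for language in rule.get("languages", []):
        normalized = normalize_language(language)
        if normalized not in canonical:
            canonical.append(normalized)
    return {**rule, "languages": canonical}


rule = {"id": "example-rule", "languages": ["cs", "csharp"]}
print(normalize_rule_languages(rule)["languages"])
```

With names normalized this way, a search for csharp also matches rules that were written with the older cs name.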
File details
Details for the file semgrep_search-1.1.3.tar.gz.
File metadata
- Download URL: semgrep_search-1.1.3.tar.gz
- Upload date:
- Size: 28.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba8fb928e23320f6ff5a50cf83571938255c90d8b21246f9cdcba6d305c1211a |
| MD5 | 83cd826c5c6fa8fcc2752a4dcf432d66 |
| BLAKE2b-256 | 0871a0c038e246278ee68e801d1724b8046f8dc193edc5ef1636d87261d20e68 |
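A downloaded file can be checked against a published SHA256 digest with the Python standard library. This is a generic sketch rather than anything shipped with semgrep-search: `sha256_of` and `matches_published_digest` are hypothetical helper names, and the demo hashes a small temporary file instead of the real archive.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; for a real check, pass the downloaded archive
# and the SHA256 value from the table above.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
print(sha256_of(tmp.name))
os.remove(tmp.name)
```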
Provenance
The following attestation bundles were made for semgrep_search-1.1.3.tar.gz:
Publisher: release.yml on hnzlmnn/semgrep-search
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: semgrep_search-1.1.3.tar.gz
- Subject digest: ba8fb928e23320f6ff5a50cf83571938255c90d8b21246f9cdcba6d305c1211a
- Sigstore transparency entry: 331821070
- Sigstore integration time:
- Permalink: hnzlmnn/semgrep-search@50df8639a0ab1b1a756ea3bd6e936c100d7be6fa
- Branch / Tag: refs/tags/v1.1.3
- Owner: https://github.com/hnzlmnn
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@50df8639a0ab1b1a756ea3bd6e936c100d7be6fa
- Trigger Event: push
File details
Details for the file semgrep_search-1.1.3-py3-none-any.whl.
File metadata
- Download URL: semgrep_search-1.1.3-py3-none-any.whl
- Upload date:
- Size: 32.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 63744dd24eb13d3cd36987bb33df765703d23741023a9f3f01a1e84c7686ec7c |
| MD5 | f054978b12e2393f9cc8957172d7c740 |
| BLAKE2b-256 | 43b91b0b0c3525ae900540700d2eb664ea3ee9a8186f02602cf2d9cc9e02eeaa |
Provenance
The following attestation bundles were made for semgrep_search-1.1.3-py3-none-any.whl:
Publisher: release.yml on hnzlmnn/semgrep-search
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: semgrep_search-1.1.3-py3-none-any.whl
- Subject digest: 63744dd24eb13d3cd36987bb33df765703d23741023a9f3f01a1e84c7686ec7c
- Sigstore transparency entry: 331821095
- Sigstore integration time:
- Permalink: hnzlmnn/semgrep-search@50df8639a0ab1b1a756ea3bd6e936c100d7be6fa
- Branch / Tag: refs/tags/v1.1.3
- Owner: https://github.com/hnzlmnn
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@50df8639a0ab1b1a756ea3bd6e936c100d7be6fa
- Trigger Event: push