Query GitLab API for useful project and pipeline information.
gl-query is a tool to query the GitLab API for useful project and pipeline information.
Installation / Setup
- To install:
pip install gl-query
- Generate a GitLab Personal Access Token
  - Make sure you store this token in a safe place. Note that any results from the gl-query tool depend on both the permissions you give the token when you generate it and the access rights you have on the GitLab instance.
- Create the ~/.gl-query/config.cfg file:
[config]
# Must include "/api/v4/"
url = https://<GITLAB_HOST>/api/v4/
token = YOUR_GITLAB_PERSONAL_ACCESS_TOKEN
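A file of this shape can be read with Python's standard configparser module. The following is a minimal sketch only: the function name load_config is hypothetical, and this is not necessarily how gl-query itself parses the file.

```python
import configparser


def load_config(text):
    """Parse config text and return (url, token) from the [config] section.

    Hypothetical helper for illustration; not part of gl-query.
    """
    # interpolation=None so a '%' inside a token is not treated as a placeholder.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["config"]
    url = section["url"]
    # The config comment above says the URL must include "/api/v4/".
    if "/api/v4" not in url:
        raise ValueError('url must include "/api/v4/"')
    return url, section["token"]
```

To read the real file, pass in the contents of ~/.gl-query/config.cfg, for example via pathlib.Path.home() / ".gl-query" / "config.cfg".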
Usage
Positional Arguments
Positional arguments are the verb/subject commands for the CLI utility. The format for positional arguments is:
gl-query <VERB> <SUBJECT> --<OPTIONAL_FLAGS>
An example would be:
gl-query get projects
Verbs
Verbs are the first positional argument.
Currently, the only supported verb is get, as the current purpose of this program is to retrieve information from the GitLab API and output it. In the future, it may be worth adding a set action in order to modify the state of the GitLab instance. Some proposed actions would be to:
- Trigger a pipeline
- Modify user / group membership (access controls)
- Open / modify / close Merge Requests
- Manage Tags / Releases
Subjects
Subjects are the second positional argument.
Optional Arguments (Flags)
There are multiple command-line flags compatible with gl-query.
--csv / -c
<OPTIONAL>
USAGE
: gl-query get projects --csv
This flag takes no arguments. It simply tells gl-query to produce a CSV output of the query. When used without the --filename flag, it will write the CSV file to a default filename in this format: GL_PROJECT_ACTIVITY_SINCE_${DATE}.csv
--filename / -f
<OPTIONAL>
USAGE
: gl-query get projects --csv --filename all_projects.csv
The --filename flag is the name of the CSV file to write to, so a .csv extension is recommended. If you include a path, avoid relative paths. The --filename flag must be used together with the --csv flag.
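The rule that --filename requires --csv can be enforced when arguments are parsed. Below is a sketch using argparse that covers only the positional arguments and these two flags; the function names are hypothetical and this is not gl-query's actual parser.

```python
import argparse


def build_parser():
    """Build an illustrative parser: <VERB> <SUBJECT> plus --csv and --filename."""
    parser = argparse.ArgumentParser(prog="gl-query")
    parser.add_argument("verb", choices=["get"])
    parser.add_argument("subject")
    parser.add_argument("--csv", "-c", action="store_true")
    parser.add_argument("--filename", "-f")
    return parser


def parse_args(argv):
    """Parse argv and reject --filename when --csv is missing."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.filename and not args.csv:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--filename requires --csv")
    return args
```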
--language / -l
<OPTIONAL>
USAGE
: gl-query get projects --language Scala
Specify a programming language if you would like to filter GitLab projects by language. The first letter of the language must be capitalized. If the flag is not used, all languages will be queried by default.
TODO
: Exception Handling.
--date-before
<OPTIONAL>
USAGE
: gl-query get projects --date-before 2019-03-04
Displays results before the specified date. The argument must follow the YYYY-MM-DD format.
--date-after
<OPTIONAL>
USAGE
: gl-query get projects --date-after 2019-03-04
Displays results after the specified date. The argument must follow the YYYY-MM-DD format.
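A YYYY-MM-DD argument can be checked before any API call is made. This is a small sketch, assuming strict zero-padded dates are wanted; parse_date_arg is a hypothetical helper, not a gl-query function.

```python
import re
from datetime import date, datetime

# Exactly four digits, two digits, two digits, separated by hyphens.
_DATE_RE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_date_arg(value: str) -> date:
    """Return a date for a strict YYYY-MM-DD string, or raise ValueError."""
    if not _DATE_RE.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # strptime also rejects impossible dates such as month 13.
    return datetime.strptime(value, "%Y-%m-%d").date()
```

The regular expression is there because strptime alone also accepts non-padded values such as 2019-3-4.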
--has-pipeline / -p
<OPTIONAL>
USAGE
: gl-query get projects -p
The flag takes no arguments and causes output to only contain results that have at least one associated pipeline. This is useful for viewing projects that have pipeline functionality.
--silent
<OPTIONAL>
USAGE
: gl-query get projects --lang java -c -f java.csv --silent
The flag suppresses output to the console (standard out). This is useful for security purposes (for example, when the tool is used in GitLab CI/CD) or when the output is being written to CSV and console output is not needed.
--scan-type
<OPTIONAL>
USAGE
:
gl-query get scans --scan-type sast --date-after 2021-08-02 -l java
gl-query get scans --date-after 2021-08-02 --scan-type "sast,srcclear,dependency-scan,srcclr" -l scala
This flag must be used in conjunction with the get scans action and outputs job information pertaining to the scan type(s). Currently there are three scan types supported for this query:
- sast (JS, Python, Go, Java, C++, sbt, Rust)
- dependency-scan (JS, Python, Go, Java, sbt, Rust)
- srcclear or srcclr (JS, Python, Go, Java, C++, sbt, Rust)
If you select multiple scan types, remember to comma-delimit them and surround them in double quotes, as in the example above.
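A comma-delimited --scan-type value can be split and validated along these lines. This is a sketch only: parse_scan_types is a hypothetical helper, the supported names are taken from the list above, and stripping whitespace around each name is an assumption (whether gl-query itself tolerates spaces is not confirmed here).

```python
# Scan type names as listed in this section.
SUPPORTED_SCAN_TYPES = {"sast", "dependency-scan", "srcclear", "srcclr"}


def parse_scan_types(raw):
    """Split a comma-delimited string into a list of known scan type names."""
    # Strip whitespace and drop empty pieces such as a trailing comma.
    names = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [name for name in names if name not in SUPPORTED_SCAN_TYPES]
    if unknown:
        raise ValueError(f"unsupported scan type(s): {', '.join(unknown)}")
    return names
```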
--version
Prints the version of the gl-query utility. This should correspond to a Git tag.
Usage Examples
gl-query get project --project-id 3430
gl-query get projects -l scala --date-filter 2021-07-30 --csv --filename test.csv -p
gl-query get pipelines --project-id 4748
gl-query get pipeline --project-id 3430 --pipeline-id 247249
Dev Guide
Useful curl commands for troubleshooting
curl --header "Authorization: Bearer <your_access_token>" "https://gitlab.gaikai.com/api/v4/projects"
To get x-total-pages:
curl -s -I --header "Authorization: Bearer <your_access_token>" "https://gitlab.gaikai.org/api/v4/projects/"
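The same header can be read from Python using only the standard library. This is a minimal sketch; the helper names (build_head_request, total_pages, fetch_total_pages) are hypothetical and not part of gl-query. GitLab may omit the header for very large result sets, so the sketch returns None in that case.

```python
import urllib.request


def build_head_request(api_url, token, per_page=100):
    """Build a HEAD request for the projects listing (mirrors `curl -I`)."""
    url = f"{api_url.rstrip('/')}/projects?per_page={per_page}"
    return urllib.request.Request(
        url, method="HEAD", headers={"Authorization": f"Bearer {token}"}
    )


def total_pages(headers):
    """Return X-Total-Pages as an int, or None when the header is absent."""
    value = headers.get("X-Total-Pages")
    return int(value) if value else None


def fetch_total_pages(api_url, token):
    """Send the HEAD request and read the page count (performs network I/O)."""
    with urllib.request.urlopen(build_head_request(api_url, token)) as response:
        return total_pages(response.headers)
```

Here api_url is the url value from the config file, including /api/v4/.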
Future Work
- Testing (CI): Integrate gl-query_test.sh script into pipeline or pre-commit hook
- Implement the conditional query param (date_before versus date_after dilemma) here
- Implement GitLab Birthday in config file
- For get scans action, implement:
  - Lookup by project-id / project-name
- Implement custom exclude_projects functionality
- API throttling -- backoff after a certain number of API requests (need to find out GitLab API limits)
- Minimize / consolidate number of API calls
- Investigate more robust / centralized exception handling
- Flags
  --top-level-group
  --cgei-subgroup
  --include-user-repositories
- Implement search functionality:
  https://gitlab.gaikai.org/api/v4/projects?search=nexus
- Create a global query options function and child functions for each type of object (projects, pipelines, etc.)
- Integrate --csv output for all actions
Developer Notes
Manually Deploying to PyPI
bumpversion --current-version <CURRENT_VERSION> minor setup.py gl-query/__init__.py --allow-dirty
python3 setup.py sdist bdist_wheel
twine upload dist/gl-query-<VERSION_NUMBER>*
twine upload dist/gl_query-<VERSION_NUMBER>*
Query Options / Parameters
- Query parameters are generally of concern when executing listing operations (e.g. list projects, groups, users, pipelines, jobs, merge requests, etc.)
- Query parameters are generally not used when dealing with a specific project or pipeline.
- GLOBAL Query Parameters
  per_page
  sort
  order_by
- Projects Query Parameters
  archived=false
  last_activity_after
  last_activity_before
  with_programming_language
  order_by=last_activity_at
- Pipelines
  order_by=updated_at
  yaml_errors=false
  scope=finished
  updated_after
  updated_before
- To implement pipeline_id filter on get scans:
  - Iterate through pipeline pages for given project
  - Piggyback off the get_pipeline method for job info
  - Maintain running sum of target scans
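As an illustration of how the projects parameters listed above combine into a query string, here is a sketch using urllib.parse.urlencode. The function projects_query is hypothetical, date values are passed through unchanged, and whether gl-query assembles its requests this way is an assumption.

```python
from urllib.parse import urlencode


def projects_query(date_after=None, date_before=None, language=None, per_page=100):
    """Build a query string for a projects listing from the parameters above."""
    params = {
        "archived": "false",
        "order_by": "last_activity_at",
        "per_page": per_page,
    }
    # Optional filters are only added when a value was supplied.
    if date_after:
        params["last_activity_after"] = date_after
    if date_before:
        params["last_activity_before"] = date_before
    if language:
        params["with_programming_language"] = language
    return urlencode(params)
```

The resulting string would be appended after "projects?" on the API URL from the config file.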
Using Date Filters
TL;DR: Do not use both date_before and date_after in projects queries.
Scenario:
- Today's date is August 3, 2021
- Use Case: You are searching for sast scans between July 15, 2021 and July 30, 2021
- Project X has a pipeline which ran on July 15, 2021 with a sast scan.
- If you execute a gl-query with --date-before 2021-07-30 and --date-after 2021-07-15, this will only include projects that were last active between July 15, 2021 and July 30, 2021, as it is using the &last_activity query parameters.
- If Project X was updated after 2021-07-30 for any reason, it will not show up in the query results since its last activity is not within the specified date range, therefore excluding a perfectly legitimate sast scan from the results.
- The solution to this is to not use both the &last_activity_before and &updated_before query parameters when making requests to projects API endpoints respectively. Instead of using the date_before param, we will use just the date_after param. This will ensure that all projects and pipelines will be included in your query.
- Although this avoids the false negative problem, it significantly increases query time since it will query ALL projects/pipelines AFTER the given date. It is possible to speed this up by implementing the following logic:
  - Conditionally use either the date_before or the date_after for the query params.
  - You will need to introduce a variable called GL_BDAY (basically the first day a project was created)
    if (DATE_BEFORE - GL_BDAY) > (TODAY - DATE_AFTER):
        Use date_after for `&last_activity_after` and `&updated_after` AND do not use date_before
    else:
        Use the date_before for `&last_activity_before` and `&updated_before`
  - This approach has its own risks, as GL project/pipeline activity may be skewed such that activity was low for the first couple of months / year, so take this into account if you choose to implement this.
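The conditional logic described in this section could be sketched in Python as follows. choose_date_params is a hypothetical helper, GL_BDAY is passed in as gl_bday, and the returned keys mirror the query parameter names listed under Query Options / Parameters. It is an illustration of the proposal, not existing gl-query behavior.

```python
from datetime import date


def choose_date_params(date_before, date_after, gl_bday, today):
    """Pick either the *_after or the *_before parameters, never both.

    Follows the comparison proposed above:
    (DATE_BEFORE - GL_BDAY) > (TODAY - DATE_AFTER).
    """
    if (date_before - gl_bday) > (today - date_after):
        # Less history lies after date_after, so query forward from it.
        value = date_after.isoformat()
        return {"last_activity_after": value, "updated_after": value}
    # Otherwise less history lies before date_before, so query backward from it.
    value = date_before.isoformat()
    return {"last_activity_before": value, "updated_before": value}
```

With the scenario dates above (date_before 2021-07-30, date_after 2021-07-15, today 2021-08-03) and a GL_BDAY of January 1, 2019, the comparison selects the date_after branch.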
Useful Queries
All Scans & All Languages since a certain date
gl-query get scans --all-languages --scan-type "sast, dependency-scan, srcclr, srcclear" --date-after 2021-07-04 --csv -f all_scans_since_07_04_21.csv
Note, we query for both srcclear and srcclr since they are both used as job names in GitLab CI.
Java SAST Scans (w/ --date-after)
# Java SAST Scans (July 3 - August 3)
gl-query get scans --scan-type sast -l Java --date-after 2021-07-03
RESULT:
Total SAST Scan Jobs after 2021-07-03 : 576
ALL sast Scans (w/ --date-after)
gl-query get scans --scan-type sast --all-languages --date-after 2021-07-03
ALL dependency-scan (w/ --date-after)
gl-query get scans --scan-type dependency-scan --all-languages --date-after 2021-07-03
ALL srcclr or srcclear (w/ --date-after)
gl-query get scans --scan-type srcclr --all-languages --date-after 2021-07-03
gl-query get scans --scan-type srcclear --all-languages --date-after 2021-07-03
[DANGEROUS] ALL sast scans EVER
# This will look through every project since GL_BDAY (JANUARY 1, 2019) which could generate tremendous load against the GitLab instance
TODAY=$(date '+%Y-%m-%d')
gl-query get scans --scan-type sast --all-languages --date-before $TODAY