Extract provenance information (W3C PROV) from GitLab projects.

:seedling: gitlab2prov: Extract Provenance from GitLab Projects

License: MIT

gitlab2prov is a Python library and command line tool for extracting provenance information from GitLab projects.

The data model employed by gitlab2prov is modelled according to the W3C PROV specification.
A representation of the model can be found in the docs directory.

Note: Work in progress. Expect breaking changes until v1.0.

Installation :wrench:

Clone the project and use the provided setup.py to install gitlab2prov.

python setup.py install --user

Usage :computer:

gitlab2prov can be used either as a command line script or as a Python library.

To extract provenance from a project, follow these steps:

Instructions                                               Config Option
1. Obtain an API token for the GitLab API (Token Guide)    --token
2. Set the URL of the GitLab project                       --project-urls
3. Set a rate limit for API requests                       --rate-limit
4. Choose a PROV serialization format                      --format
5. Choose whether to suppress output to stdout             --quiet
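
The five steps above map onto command line flags. As a minimal sketch, the helper below assembles such an invocation as an argument list. The function name `build_command`, the executable name `gitlab2prov`, and the example URL are assumptions made for illustration (the help output reports the program name as GitLab2PROV); the flag spellings follow the help text in the Command Line Flags section.

```python
import shlex
from typing import List, Sequence

# Output formats listed in the tool's help text.
FORMATS = ("provn", "json", "rdf", "xml", "dot")


def build_command(
    project_urls: Sequence[str],
    token: str,
    rate_limit: int = 10,
    fmt: str = "json",
    quiet: bool = False,
    executable: str = "gitlab2prov",  # assumption: name of the installed script
) -> List[str]:
    """Build an argument list covering steps 1 to 5 (hypothetical helper)."""
    if not project_urls:
        raise ValueError("at least one project URL is required")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    cmd = [
        executable,
        "--project-urls", *project_urls,  # step 2
        "--token", token,                 # step 1
        "--rate-limit", str(rate_limit),  # step 3
        "--format", fmt,                  # step 4
    ]
    if quiet:
        cmd.append("--quiet")             # step 5
    return cmd


cmd = build_command(
    ["https://gitlab.example.com/group/project"],  # placeholder URL
    token="<your-token>",
    fmt="provn",
)
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` once the tool is installed and a real token is supplied.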

As a Command Line Script

gitlab2prov can be configured either by command line flags or by using a config file.

Config File :clipboard:

An example configuration file can be found in config/examples.

[GITLAB2PROV]
token = token
quiet = False
format = json
rate_limit = 10

[PROJECTS]
project_a = project_a_url
project_b = project_b_url
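
A file in this layout can be read with Python's standard `configparser` module. The sketch below only illustrates how the two sections and their value types fit together; it is not necessarily how gitlab2prov itself loads its configuration, and the function name `load_settings` is hypothetical.

```python
import configparser

# The example configuration from above, inlined so the sketch is self-contained.
EXAMPLE = """\
[GITLAB2PROV]
token = token
quiet = False
format = json
rate_limit = 10

[PROJECTS]
project_a = project_a_url
project_b = project_b_url
"""


def load_settings(text: str) -> dict:
    """Parse the INI text into typed settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    main = parser["GITLAB2PROV"]
    return {
        "token": main.get("token"),
        "quiet": main.getboolean("quiet"),      # "False" becomes False
        "format": main.get("format"),
        "rate_limit": main.getint("rate_limit"),
        # Every value in [PROJECTS] is treated as one project URL.
        "project_urls": list(parser["PROJECTS"].values()),
    }


settings = load_settings(EXAMPLE)
print(settings)
```
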

Command Line Flags :flags:

usage: GitLab2PROV [-h] [-p <string> [<string> ...]] [-t <string>] [-r <int>] [-c <string>]
                   [-f {provn,json,rdf,xml,dot}] [-q] [--aliases <string>] [--pseudonymize]

Extract provenance information from GitLab projects.

optional arguments:
  -h, --help            show this help message and exit
  -p <string> [<string> ...], --project-urls <string> [<string> ...]
                        gitlab project urls
  -t <string>, --token <string>
                        gitlab api access token
  -r <int>, --rate-limit <int>
                        api client rate limit (in req/s)
  -c <string>, --config-file <string>
                        config file path
  -f {provn,json,rdf,xml,dot}, --format {provn,json,rdf,xml,dot}
                        provenance output format
  -q, --quiet           suppress output to stdout
  --aliases <string>    path to agent alias mapping json file
  --pseudonymize        pseudonymize agents
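
The `--rate-limit` value is given in requests per second. To illustrate what such a limit means for an asynchronous API client, here is a minimal sketch that spaces calls evenly so that at most `rate` of them start per second. It is an illustration only, not gitlab2prov's actual implementation; the names `RateLimiter` and `fetch_all` are hypothetical, and the "request" is a stand-in that performs no network I/O.

```python
import asyncio
import time


class RateLimiter:
    """Space out calls so that at most `rate` of them start per second."""

    def __init__(self, rate: int) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self._interval = 1.0 / rate
        self._next_slot = 0.0
        self._lock = asyncio.Lock()

    async def wait(self) -> None:
        # The lock serializes callers; each one sleeps until its slot opens.
        async with self._lock:
            delay = self._next_slot - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)
            self._next_slot = max(time.monotonic(), self._next_slot) + self._interval


async def fetch_all(n_requests: int, rate: int) -> float:
    """Run n placeholder requests under the limiter; return elapsed seconds."""
    limiter = RateLimiter(rate)  # created inside the running event loop
    start = time.monotonic()

    async def fake_request(i: int) -> int:
        await limiter.wait()
        return i  # a real client would call the GitLab API here

    await asyncio.gather(*(fake_request(i) for i in range(n_requests)))
    return time.monotonic() - start


elapsed = asyncio.run(fetch_all(n_requests=5, rate=50))
print(f"5 requests at 50 req/s took {elapsed:.3f} s")
```

With a rate of 10, as in the example config file, calls would be spaced roughly 100 ms apart.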

Provenance Output Formats

gitlab2prov supports the output formats provided by the prov library. The values accepted by --format are provn, json, rdf, xml and dot.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

References

Influential Software for gitlab2prov

  • Martin Stoffers: "Gitlab2Graph", v1.0.0, October 13, 2019, GitHub Link, DOI 10.5281/zenodo.3469385

  • Quentin Pradet: "How do you rate limit calls with aiohttp?", GitHub Gist, MIT LICENSE

Influential Papers for gitlab2prov:

Papers that refer to gitlab2prov:
