Extract provenance information (W3C PROV) from GitLab projects.
:seedling: gitlab2prov: Extract Provenance from GitLab Projects
gitlab2prov is a Python library and command line tool for extracting provenance information from GitLab projects.
The data model employed by gitlab2prov follows the W3C PROV specification.
A representation of the model can be found in \docs.
Note: Work in progress. Expect breaking changes until v1.0.
Installation :wrench:
Clone the project and use the provided setup.py to install gitlab2prov.
python setup.py install --user
Usage :computer:
gitlab2prov can be used either as a command line script or as a Python library.
To extract provenance from a project, follow these steps:
Instructions | Config Option |
---|---|
1. Obtain an API token for the GitLab API (Token Guide) | --token |
2. Set the URLs of the GitLab projects | --project-urls |
3. Set a rate limit for API requests (in requests per second) | --rate-limit |
4. Choose a PROV serialization format | --format |
5. Choose whether to suppress output to stdout | --quiet |
As a Command Line Script
gitlab2prov can be configured either by command line flags or by using a config file.
Config File :clipboard:
An example of a configuration file can be found in config\examples.
[GITLAB2PROV]
token = token
quiet = False
format = json
rate_limit = 10
[PROJECTS]
project_a = project_a_url
project_b = project_b_url
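Because the file uses INI syntax, it can be read with Python's standard configparser module. The following is a minimal sketch that assumes only the section and key names shown above. The helper name load_settings is hypothetical, and this is not necessarily how gitlab2prov itself parses the file.

```python
import configparser

# Example config text, copied from the sample above.
EXAMPLE = """\
[GITLAB2PROV]
token = token
quiet = False
format = json
rate_limit = 10
[PROJECTS]
project_a = project_a_url
project_b = project_b_url
"""


def load_settings(text):
    """Parse config text into a plain dict (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["GITLAB2PROV"]
    return {
        "token": general.get("token"),
        # getboolean understands values such as "False", "no", "0".
        "quiet": general.getboolean("quiet"),
        "format": general.get("format"),
        "rate_limit": general.getint("rate_limit"),
        # Every value in [PROJECTS] is treated as one project URL.
        "project_urls": list(parser["PROJECTS"].values()),
    }


settings = load_settings(EXAMPLE)
print(settings["project_urls"])
```

The keys in the [PROJECTS] section act only as labels here; the sketch collects their values into a list of project URLs.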
Command Line Flags :flags:
usage: GitLab2PROV [-h] [-p <string> [<string> ...]] [-t <string>] [-r <int>] [-c <string>]
                   [-f {provn,json,rdf,xml,dot}] [-q] [--aliases <string>] [--pseudonymize]

Extract provenance information from GitLab projects.

optional arguments:
  -h, --help            show this help message and exit
  -p <string> [<string> ...], --project-urls <string> [<string> ...]
                        gitlab project urls
  -t <string>, --token <string>
                        gitlab api access token
  -r <int>, --rate-limit <int>
                        api client rate limit (in req/s)
  -c <string>, --config-file <string>
                        config file path
  -f {provn,json,rdf,xml,dot}, --format {provn,json,rdf,xml,dot}
                        provenance output format
  -q, --quiet           suppress output to stdout
  --aliases <string>    path to agent alias mapping json file
  --pseudonymize        pseudonymize agents
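For readers who want to script around the tool, the flags above can be mirrored with argparse. This sketch is reconstructed from the help text only and is not gitlab2prov's actual source. The function name build_parser, the example URL, and the example token are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="GitLab2PROV",
        description="Extract provenance information from GitLab projects.",
    )
    parser.add_argument("-p", "--project-urls", nargs="+", metavar="<string>",
                        help="gitlab project urls")
    parser.add_argument("-t", "--token", metavar="<string>",
                        help="gitlab api access token")
    parser.add_argument("-r", "--rate-limit", type=int, metavar="<int>",
                        help="api client rate limit (in req/s)")
    parser.add_argument("-c", "--config-file", metavar="<string>",
                        help="config file path")
    parser.add_argument("-f", "--format",
                        choices=["provn", "json", "rdf", "xml", "dot"],
                        help="provenance output format")
    parser.add_argument("-q", "--quiet", action="store_true",
                        help="suppress output to stdout")
    parser.add_argument("--aliases", metavar="<string>",
                        help="path to agent alias mapping json file")
    parser.add_argument("--pseudonymize", action="store_true",
                        help="pseudonymize agents")
    return parser


# Parse a sample command line; the URL and token are placeholders.
args = build_parser().parse_args([
    "-p", "https://gitlab.example.com/group/project",
    "-t", "my-token",
    "-r", "10",
    "-f", "json",
    "-q",
])
print(args.project_urls, args.format, args.rate_limit, args.quiet)
```

Note that argparse exposes hyphenated flags as underscored attributes, so --project-urls becomes args.project_urls and --rate-limit becomes args.rate_limit.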
Provenance Output Formats
gitlab2prov supports the output formats that the prov library provides. They are selected with --format: provn, json, rdf, xml, and dot.
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
References
Influential Software for gitlab2prov:
- Martin Stoffers: "Gitlab2Graph", v1.0.0, October 13, 2019, GitHub Link, DOI 10.5281/zenodo.3469385
- Quentin Pradet: "How do you rate limit calls with aiohttp?", GitHub Gist, MIT LICENSE
Influential Papers for gitlab2prov:
- De Nies, T., Magliacane, S., Verborgh, R., Coppens, S., Groth, P., Mannens, E., & Van de Walle, R. (2013). Git2PROV: Exposing Version Control System Content as W3C PROV. In Poster and Demo Proceedings of the 12th International Semantic Web Conference (Vol. 1035, pp. 125–128).
- Packer, H. S., Chapman, A., & Carr, L. (2019). GitHub2PROV: provenance for supporting software project management. In 11th International Workshop on Theory and Practice of Provenance (TaPP 2019).
Papers that refer to gitlab2prov:
- Andreas Schreiber, Claas de Boer (2020). Modelling Knowledge about Software Processes using Provenance Graphs and its Application to Git-based Version Control Systems. In ICSEW'20: Proceedings of the IEEE/ACM 42nd Conference on Software Engineering Workshops (pp. 358–359).
- Tim Sonnekalb, Thomas S. Heinze, Lynn von Kurnatowski, Andreas Schreiber, Jesus M. Gonzalez-Barahona, and Heather Packer (2020). Towards automated, provenance-driven security audit for git-based repositories: applied to Germany's Corona-Warn-App: vision paper. In Proceedings of the 3rd ACM SIGSOFT International Workshop on Software Security from Design to Deployment (pp. 15–18).
- Andreas Schreiber (2020). Visualization of contributions to open-source projects. In Proceedings of the 13th International Symposium on Visual Information Communication and Interaction. ACM, USA.