
A lightweight, decentralized provenance tracking framework using the W3C PROV-O vocabulary


PROVIT is a lightweight, decentralized data provenance and documentation tool. It lets users track workflows and modifications of data files.

PROVIT works in a completely decentralized way: all information is stored in .prov files (as JSON-LD RDF graphs) alongside the corresponding data file in the file system. No additional database or server setup is needed.
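As a rough illustration of the sidecar-file idea, the following sketch locates and loads the .prov file next to a data file using only the standard library. It assumes the provenance file is named by appending ".prov" to the full data file name; the helper names `prov_path_for` and `load_prov` are hypothetical and not part of PROVIT.

```python
import json
from pathlib import Path


def prov_path_for(data_file):
    """Return the assumed sidecar path: '<filename>.prov' next to the data file."""
    path = Path(data_file)
    return path.with_name(path.name + ".prov")


def load_prov(data_file):
    """Load the JSON-LD document stored beside data_file, or None if there is none."""
    prov_file = prov_path_for(data_file)
    if not prov_file.exists():
        return None
    with prov_file.open(encoding="utf-8") as handle:
        # JSON-LD is plain JSON, so the standard parser is enough to read it.
        return json.load(handle)
```

Because the provenance travels with the data file, copying or archiving a directory keeps both together without any database export.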

A small subset of the W3C PROV-O vocabulary is implemented.
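For readers new to PROV-O, here is a minimal sketch of what a provenance graph can look like as JSON-LD. It uses a few core PROV-O terms (Entity, Activity, Agent, wasGeneratedBy, wasAssociatedWith, wasDerivedFrom). Which terms PROVIT actually writes is not stated here, so treat the structure, the function name `minimal_prov_graph`, and the example identifiers as assumptions for illustration only.

```python
import json

# Namespace of the W3C PROV-O vocabulary.
PROV = "http://www.w3.org/ns/prov#"


def minimal_prov_graph(entity_id, activity_id, agent_id, source_id=None):
    """Build a small JSON-LD document linking an entity, an activity and an agent."""
    entity = {
        "@id": entity_id,
        "@type": "prov:Entity",
        "prov:wasGeneratedBy": {"@id": activity_id},
    }
    if source_id is not None:
        # Optional link to the file this entity was derived from.
        entity["prov:wasDerivedFrom"] = {"@id": source_id}
    graph = [
        {"@id": agent_id, "@type": "prov:Agent"},
        {
            "@id": activity_id,
            "@type": "prov:Activity",
            "prov:wasAssociatedWith": {"@id": agent_id},
        },
        entity,
    ]
    return {"@context": {"prov": PROV}, "@graph": graph}


# Hypothetical identifiers, chosen only for this example.
doc = minimal_prov_graph(
    "http://example.org/data/results.csv",
    "http://example.org/activity/cleaning",
    "http://example.org/agent/alice",
    source_id="http://example.org/data/raw.csv",
)
print(json.dumps(doc, indent=2))
```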

PROVIT aims to provide an easy-to-use interface for users who have never worked with provenance tracking before. If you feel limited by PROVIT, have a look at more extensive implementations, e.g. prov.

Full documentation is available at provit.readthedocs.io.

Requirements

This software was tested on Linux with Python 3.5 and 3.6.

Installation

Installation via pip is recommended for end users. We strongly encourage end users to make use of a virtualenv.

pip

Create a virtual environment (optional) and install PROVIT into it with pip.

$ mkvirtualenv provit
$ pip install provit

git / Development

Clone the repository and create a virtualenv.

$ git clone https://github.com/diggr/provit
$ mkvirtualenv provit

Install it with pip in editable mode:

$ pip install -e ./provit

Usage

PROVIT provides a command line client which can be used to enrich any file-based data with provenance information.

PROVIT also includes an (experimental) web-based interface (PROVIT Browser).

Command Line Client

Usage:

Open PROVIT Browser:

$ provit browser

Add a provenance event to a file:

$ provit add FILEPATH [OPTIONS]

Options:

-a AGENT, --agent AGENT

Provenance information: agent (multiple=True)

--activity ACTIVITY

Provenance information: activity

-d DESCRIPTION, --desc DESCRIPTION

Provenance information: Description of the data manipulation process

-o ORIGIN, --origin ORIGIN

Provenance information: Data origin

-s SOURCES, --sources SOURCES

Provenance information: Source files (multiple=True)

--help

Show this message and exit.
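When the command line client is driven from a script, the options above can be assembled programmatically. The sketch below builds an argument list for `provit add` using only the flags listed above. It assumes that options marked "multiple=True" are passed by repeating the flag once per value; the helper name `build_provit_add` and the example values are hypothetical.

```python
import shlex


def build_provit_add(filepath, agents=(), activity=None, description=None,
                     origin=None, sources=()):
    """Return the argument list for a 'provit add' call (nothing is executed)."""
    cmd = ["provit", "add", str(filepath)]
    for agent in agents:
        # Assumption: repeat --agent once per agent.
        cmd += ["--agent", agent]
    if activity is not None:
        cmd += ["--activity", activity]
    if description is not None:
        cmd += ["--desc", description]
    if origin is not None:
        cmd += ["--origin", origin]
    for source in sources:
        # Assumption: repeat --sources once per source file.
        cmd += ["--sources", source]
    return cmd


cmd = build_provit_add(
    "results.csv",
    agents=["alice"],
    activity="cleaning",
    description="Removed duplicate rows",
    sources=["raw.csv"],
)
# Print a shell-quoted version for inspection before running it.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once PROVIT is installed; passing a list rather than a single string avoids quoting problems with descriptions that contain spaces.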

Provenance Class

from provit import Provenance

# load prov data for a file, or create new prov for the file
filepath = "path/to/data_file"  # placeholder: path to the data file to track
prov = Provenance(filepath)

# add provenance metadata
prov.add(agents=[ "agent" ], activity="activity", description="...")
prov.add_primary_source("primary_source")
prov.add_sources([ "filepath1", "filepath2" ])

# return provenance as json tree
prov_dict = prov.tree()

# save provenance metadata into "<filename>.prov" file
prov.save()

Roadmap

General roadmap of the next steps in development

  • Tests

  • Tutorials

  • Windows support

  • Agent management in PROVIT Browser

Overview

Authors:

P. Mühleder muehleder@ub.uni-leipzig.de, F. Rämisch raemisch@ub.uni-leipzig.de

License:

MIT

Copyright:

2018, Peter Mühleder and Universitätsbibliothek Leipzig

