Construct PROV-O compliant provenance graphs.
The package supports the creation of PROV-O compliant provenance graphs.
The package requires Python 3.11.
Installation
You can install the package from the Python Package Index (PyPI):
pip install provo
Or by downloading this repo:
- Download and unzip the package.
- Open a shell and cd to the unzipped package.
- Run pip install -e . (in the folder that contains setup.py).
Contents
The package implements the PROV-O starting point classes Entity, Activity and Agent as Python classes with methods to establish starting point properties between instances of these classes.
Features
Compliance
- The PROV-O classes Entity, Activity, and Agent are implemented as Python classes.
- The PROV-O properties wasGeneratedBy, wasDerivedFrom, wasAttributedTo, startedAtTime, used, wasInformedBy, endedAtTime, wasAssociatedWith, and actedOnBehalfOf are implemented as instance methods of their respective classes.
- Arguments passed to these methods are type-checked to enforce compliance with PROV-O.
- Node Ids are checked for validity.
- Accidental use of the same ID for different objects throws an error.
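The type checking and duplicate-ID rules listed above can be illustrated with a small, self-contained sketch. It does not use provo's code: the names MiniGraph, MiniEntity and MiniActivity, as well as the exception types raised, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MiniEntity:
    """Hypothetical stand-in for a PROV-O Entity (not provo's class)."""
    node_id: str


@dataclass
class MiniActivity:
    """Hypothetical stand-in for a PROV-O Activity (not provo's class)."""
    node_id: str
    used_entities: list = field(default_factory=list)

    def used(self, entity: MiniEntity) -> None:
        # prov:used links an Activity to an Entity, so reject anything else.
        if not isinstance(entity, MiniEntity):
            raise TypeError(f"used() expects an entity, got {type(entity).__name__}")
        self.used_entities.append(entity)


class MiniGraph:
    """Hypothetical graph that refuses to hand out the same node ID twice."""

    def __init__(self) -> None:
        self._ids = set()

    def _register(self, node_id: str) -> str:
        if node_id in self._ids:
            raise ValueError(f"ID already in use: {node_id}")
        self._ids.add(node_id)
        return node_id

    def add_entity(self, node_id: str) -> MiniEntity:
        return MiniEntity(self._register(node_id))

    def add_activity(self, node_id: str) -> MiniActivity:
        return MiniActivity(self._register(node_id))


graph = MiniGraph()
data = graph.add_entity("http://example.org#data")
run = graph.add_activity("http://example.org#run")
run.used(data)  # valid: an Activity uses an Entity
```

Passing an activity to used(), or adding a second node with the ID http://example.org#data, raises an error in this sketch instead of silently producing a graph that violates PROV-O.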
Ease of Use
- The package implements full type hint support, thus enabling rich support from the IDE.
- The classes ProvOntologyGraph, Entity, Activity, and Agent can be printed to the terminal in a user-friendly, readable way with the default print() command.
- For quick testing, objects of the classes Entity, Activity, and Agent can be instantiated with auto-generated IDs (although this is not recommended for production use).
Interface to RDF via the rdflib package
- The graph's contents can be converted to an rdflib.Graph object.
- The graph can be exported in various RDF serializations.
Manual
The package is centered around the class ProvOntologyGraph. Entities, Activities, and Agents are added to this graph by using the corresponding add methods. Relations between instances of the starting point classes are constructed by calling the respective methods of those instances.
Create a Provenance Ontology Graph
The graph can be initialized with default or user-defined attributes. The graph can be printed to the terminal with print(graph).
# ex1 - create a provenance graph
from provo import ProvOntologyGraph
# __defaults__
# namespace: str = "https://provo-example.org/",
# namespace_abbreviation: str = "",
# lang: str = "en"
provenance_graph = ProvOntologyGraph()
prov_ontology_graph = ProvOntologyGraph(
    namespace='http://example.org#',
    namespace_abbreviation="ex",
    lang="en"
)
namespace=
- Default is "https://provo-example.org/".
- Has to be a valid URL; validation is currently performed with the validators package.
- Has to end with / or #.
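The namespace rules above can be approximated with the standard library alone. The following is only a sketch: check_namespace is a hypothetical helper, and its rough URL test (http or https scheme plus a host) is an assumption that stands in for the validators package the library actually uses.

```python
from urllib.parse import urlparse


def check_namespace(namespace: str) -> str:
    """Return the namespace unchanged if it passes two simple checks.

    Hypothetical helper, not part of provo.
    """
    parsed = urlparse(namespace)
    # Rough URL check (assumption): require an http(s) scheme and a host part.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid URL: {namespace!r}")
    # The namespace is prefixed to node IDs, so it must end with / or #.
    if not namespace.endswith(("/", "#")):
        raise ValueError(f"namespace must end with '/' or '#': {namespace!r}")
    return namespace


checked = check_namespace("http://example.org#")
```

Running such a check before constructing the graph gives an early, readable error for a namespace like http://example.org, which lacks the trailing / or #.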
namespace_abbreviation=
- Default is "".
- Used when converting to other models, such as RDF (-> prefixes).
- Only characters from the Latin alphabet are allowed.
- RDF core prefixes (owl, rdf, rdfs, xsd and xml) are prohibited from use.
Note: Although not prohibited, the following prefixes are commonly used and should therefore be avoided: brick, csvw, dc, dcat, dcmitype, dcterms, dcam, doap, foaf, geo, odrl, org, prof, prov, qb, sdo, sh, skos, sosa, ssn, time, vann and void.
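The abbreviation rules can be expressed in a few lines. This is a sketch under stated assumptions, not provo's implementation: check_abbreviation is a hypothetical helper, "Latin alphabet" is interpreted as the ASCII letters a-z and A-Z, and the comparison against the reserved prefixes is made case-insensitive.

```python
import re

# RDF core prefixes named in the rules above.
RESERVED_PREFIXES = {"owl", "rdf", "rdfs", "xsd", "xml"}


def check_abbreviation(abbreviation: str) -> str:
    """Hypothetical check mirroring the rules for namespace_abbreviation."""
    # The empty string is the documented default and is therefore accepted.
    if abbreviation == "":
        return abbreviation
    # Assumption: only ASCII letters count as "Latin alphabet".
    if not re.fullmatch(r"[A-Za-z]+", abbreviation):
        raise ValueError(f"only Latin letters are allowed: {abbreviation!r}")
    # Assumption: reserved prefixes are rejected regardless of case.
    if abbreviation.lower() in RESERVED_PREFIXES:
        raise ValueError(f"reserved RDF core prefix: {abbreviation!r}")
    return abbreviation


prefix = check_abbreviation("ex")
```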
lang=
- Default is "en".
- Used when converting to other models that support a lang tag.
- Has to be compliant with RFC 5646 (Phillips, A., Ed., and M. Davis, Ed., "Tags for Identifying Languages", BCP 47, RFC 5646, September 2009). Compliance is not validated!
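Because the lang value is not validated, a caller may want a basic sanity check of their own. The sketch below is hypothetical and deliberately simplified: it only recognizes tags of the shape language[-script][-region] and ignores the rest of the RFC 5646 grammar (variants, extensions, private use, grandfathered tags), so it should not be read as a compliance test.

```python
import re

# Simplified shape: 2-3 letter language, optional 4-letter script,
# optional region (2 letters or 3 digits). This is NOT the full RFC 5646 grammar.
SIMPLE_TAG = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-(?:[A-Za-z]{2}|[0-9]{3}))?")


def looks_like_language_tag(tag: str) -> bool:
    """Return True if the tag has the simplified language[-script][-region] shape."""
    return SIMPLE_TAG.fullmatch(tag) is not None


for candidate in ("en", "de-CH", "zh-Hant-TW", "en_US"):
    print(candidate, looks_like_language_tag(candidate))
```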
Create Entities, Activities and Agents and define relation between them
The creation of instances of the three starting point classes follows the same pattern. The classes differ only in their methods. PROV-O classes are instantiated by using the add methods of the provenance graph class. Below is an extensively commented version of the add_entity() method.
def add_entity(self, id_string: str = "", label: str = "", description: str = "", namespace: str = "") -> Entity:
    """Creates a new entity, adds it to the graph, and returns it."""
    # The id of a PROV class object is a combination of the
    # namespace and the id_string. The method _handle_id() builds
    # the actual id of the node; it checks whether the provided
    # namespace-id combination is already used for a node in the graph.
    # If no namespace is provided, the default namespace is used;
    # if no id is provided, an id is generated automatically.
    node_id = self._handle_id(namespace, id_string)
    # The PROV class (in this case an Entity) is only created if everything with the ID is fine.
    entity = Entity(
        # mandatory
        node_id=node_id,
        # optional
        label=label,
        # optional
        description=description)
    self._entities.append(entity)
    return entity
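The ID handling described in the comments above could look roughly like the following sketch. It is not the library's _handle_id() method: build_node_id is a hypothetical standalone function, and both the use of uuid4 for auto-generated IDs and the ValueError for duplicates are assumptions.

```python
import uuid


def build_node_id(used_ids: set, default_namespace: str,
                  namespace: str = "", id_string: str = "") -> str:
    """Hypothetical stand-in for the ID handling described above."""
    # Fall back to the graph's default namespace if none is given.
    namespace = namespace or default_namespace
    # Assumption: a random UUID serves as the auto-generated id.
    id_string = id_string or str(uuid.uuid4())
    node_id = namespace + id_string
    # Refuse to hand out the same namespace-id combination twice.
    if node_id in used_ids:
        raise ValueError(f"ID already in use: {node_id}")
    used_ids.add(node_id)
    return node_id


ids_in_graph = set()
named = build_node_id(ids_in_graph, "http://example.org#", id_string="example_entity")
anonymous = build_node_id(ids_in_graph, "http://example.org#")
print(named)      # http://example.org#example_entity
print(anonymous)  # http://example.org# followed by a random UUID
```

Keeping the uniqueness check in one place means every add method can rely on the same rule before a node object is created.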
The relations are defined by calling the respective methods of the PROV class instances.
Example use of the provenance graph's add-methods and the definition of a used-relation between an Activity and an Entity:
# ex2 - create entities, activities and agents,
# and define relation between them
entity = prov_ontology_graph.add_entity(
    id_string="example_entity",
    label="Example Entity")
activity = prov_ontology_graph.add_activity(
    label="Anonymous activity",
    description="An arbitrary activity."
)
activity.used(entity)
print(entity)
# id: http://example.org#example_entity
# label: Example Entity
# ---
print(activity)
# id: http://example.org#94021a6a-40cd-4c02-9571-33480488ff82
# label: Anonymous activity
# description: An arbitrary activity.
# used: ['Example Entity']
# ---
RDF interface
The graph can be serialized directly as an RDF document or converted to an rdflib Graph for further manipulation.
# ex3 - serialize provenance graph as RDF document
prov_ontology_graph.serialize_as_rdf("manual_examples.ttl")
@prefix : <http://example.org#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:94021a6a-40cd-4c02-9571-33480488ff82 a prov:Activity ;
    rdfs:label "Anonymous activity"@en ;
    rdfs:comment "An arbitrary activity."@en ;
    prov:used :example_entity .
:example_entity a prov:Entity ;
    rdfs:label "Example Entity"@en .
# ex4 - interface with rdflib
from rdflib import SKOS, Literal, URIRef
rdflib_graph = prov_ontology_graph.get_rdflib_graph()
rdflib_graph.bind("skos", SKOS)
rdflib_graph.add((
    URIRef(entity.get_id()),
    SKOS.prefLabel,
    Literal(entity.get_label(), lang="en")
))
rdflib_graph.serialize("examples/rdflib_interface.ttl")
@prefix : <http://example.org#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
:94021a6a-40cd-4c02-9571-33480488ff82 a prov:Activity ;
    rdfs:label "Anonymous activity"@en ;
    rdfs:comment "An arbitrary activity."@en ;
    prov:used :example_entity .
:example_entity a prov:Entity ;
    rdfs:label "Example Entity"@en ;
    skos:prefLabel "Example Entity"@en .
Comprehensive Examples
Code to create the PROV-O example 1
from datetime import datetime
from provo import ProvOntologyGraph
from rdflib import FOAF, RDF, Literal, URIRef
# create example from: https://www.w3.org/TR/prov-o/#narrative-example-simple-1
# create graph
prov_ontology_graph = ProvOntologyGraph(
    namespace='http://example.org#',
    namespace_abbreviation=""
)
# create entities
crime_data = prov_ontology_graph.add_entity(id_string='crimeData', label='Crime Data')
national_regions_list = prov_ontology_graph.add_entity(id_string='nationalRegionsList', label='National Regions List')
aggregated_by_regions = prov_ontology_graph.add_entity(id_string='aggregatedByRegions', label='Aggregated by Regions')
bar_chart = prov_ontology_graph.add_entity(id_string='bar_chart', label='Bar Chart')
# create activities
aggregation_activity = prov_ontology_graph.add_activity(id_string='aggregationActivity', label='Aggregation Activity')
illustration_activity = prov_ontology_graph.add_activity(id_string='illustrationActivity', label='Illustration Activity')
# create agents
government = prov_ontology_graph.add_agent(id_string='government', label='Government')
civil_action_group = prov_ontology_graph.add_agent(id_string='civil_action_group', label='Civil Action Group')
national_newspaper_inc = prov_ontology_graph.add_agent(id_string='national_newspaper_inc', label='National Newspaper Inc.')
derek = prov_ontology_graph.add_agent(id_string='derek', label='Derek')
# build relations
crime_data.was_attributed_to(government)
national_regions_list.was_attributed_to(civil_action_group)
aggregation_activity.used(crime_data)
aggregation_activity.used(national_regions_list)
aggregation_activity.started_at_time(datetime(2011, 7, 14, 1, 1, 1))
aggregation_activity.ended_at_time(datetime(2011, 7, 14, 2, 2, 2))
aggregation_activity.was_associated_with(derek)
aggregated_by_regions.was_generated_by(aggregation_activity)
aggregated_by_regions.was_attributed_to(derek)
illustration_activity.was_informed_by(aggregation_activity)
illustration_activity.used(aggregated_by_regions)
illustration_activity.was_associated_with(derek)
bar_chart.was_generated_by(illustration_activity)
bar_chart.was_derived_from(aggregated_by_regions)
bar_chart.was_attributed_to(derek)
derek.acted_on_behalf_of(national_newspaper_inc)
# print graph to terminal
print(prov_ontology_graph)
# use rdflib interface to add FOAF triples
rdflib_graph = prov_ontology_graph.get_rdflib_graph()
rdflib_graph.bind("foaf", FOAF)
rdflib_graph.add((
    URIRef(government.get_id()),
    RDF.type,
    FOAF.Organization
))
rdflib_graph.add((
    URIRef(civil_action_group.get_id()),
    RDF.type,
    FOAF.Organization
))
rdflib_graph.add((
    URIRef(national_newspaper_inc.get_id()),
    RDF.type,
    FOAF.Organization
))
rdflib_graph.add((
    URIRef(national_newspaper_inc.get_id()),
    FOAF.name,
    Literal(national_newspaper_inc.get_label(), lang="en")
))
rdflib_graph.add((
    URIRef(derek.get_id()),
    RDF.type,
    FOAF.Person
))
rdflib_graph.add((
    URIRef(derek.get_id()),
    FOAF.givenName,
    Literal(derek.get_label(), lang="en")
))
rdflib_graph.add((
    URIRef(derek.get_id()),
    FOAF.mbox,
    URIRef("mailto:derek@example.org")
))
# serialize graph as rdf document
rdflib_graph.serialize('examples/provenance_graph_example.ttl')
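To make the structure of this example easier to follow, the relations built above can also be written down as plain triples and walked with a few lines of standard-library Python. This sketch is independent of provo and rdflib; the RELATIONS list simply restates the calls from the example, and the upstream() helper and the choice of "lineage" properties are assumptions for illustration.

```python
# Relations from the example above, written as (subject, property, object).
RELATIONS = [
    ("crimeData", "wasAttributedTo", "government"),
    ("nationalRegionsList", "wasAttributedTo", "civil_action_group"),
    ("aggregationActivity", "used", "crimeData"),
    ("aggregationActivity", "used", "nationalRegionsList"),
    ("aggregationActivity", "wasAssociatedWith", "derek"),
    ("aggregatedByRegions", "wasGeneratedBy", "aggregationActivity"),
    ("aggregatedByRegions", "wasAttributedTo", "derek"),
    ("illustrationActivity", "wasInformedBy", "aggregationActivity"),
    ("illustrationActivity", "used", "aggregatedByRegions"),
    ("illustrationActivity", "wasAssociatedWith", "derek"),
    ("bar_chart", "wasGeneratedBy", "illustrationActivity"),
    ("bar_chart", "wasDerivedFrom", "aggregatedByRegions"),
    ("bar_chart", "wasAttributedTo", "derek"),
    ("derek", "actedOnBehalfOf", "national_newspaper_inc"),
]

# Assumption: these properties point from a node to what it depends on.
LINEAGE = {"wasGeneratedBy", "wasDerivedFrom", "used", "wasInformedBy"}


def upstream(node: str) -> set:
    """Collect every entity or activity the given node depends on."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for subject, prop, obj in RELATIONS:
            if subject == current and prop in LINEAGE and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


print(sorted(upstream("bar_chart")))
```

The walk shows that the bar chart ultimately depends on both source datasets, crimeData and nationalRegionsList, via the two activities and the intermediate aggregated entity.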
Used Packages
- rdflib: https://rdflib.readthedocs.io/en/stable/, BSD License
- validators: https://github.com/python-validators/validators, MIT License
License
GNU General Public License v3.0
Contact
Arne Rümmler (arne.ruemmler@gmail.com)
File details
Details for the file provo-0.2.1.tar.gz.
File metadata
- Download URL: provo-0.2.1.tar.gz
- Upload date:
- Size: 26.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 124eef8389368dc6f30b755b728d156d810d49dbf7840806bf52d841f0a024bb
MD5 | 2172b5d17d648644a8f59da07c8e3f47
BLAKE2b-256 | a45231747c80555842ede3903639af81adb0e8cfaa2d92f21ddf6ce7c85ab0b3
File details
Details for the file provo-0.2.1-py3-none-any.whl.
File metadata
- Download URL: provo-0.2.1-py3-none-any.whl
- Upload date:
- Size: 25.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.11.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | d62a43fb77ad8af8f2821aa8c8b87c30b9fef7a6a56ac6d3d07254d3cd406cf6
MD5 | 8843e08cea5c6e71019ba307df09bcbb
BLAKE2b-256 | 7a9b5f6740ddfe5712ca6bec7010eb1f5f2c0ab5148a95050dafcf338af6141e