pySHACL

A Python validator for SHACL.

This is a pure Python module which allows for the validation of RDF graphs against Shapes Constraint Language (SHACL) graphs. This module uses the rdflib Python library for working with RDF and is dependent on the OWL-RL Python module for OWL2 RL Profile-based expansion of data graphs.

This module is developed to adhere to the SHACL Recommendation:

Holger Knublauch; Dimitris Kontokostas. Shapes Constraint Language (SHACL). 20 July 2017. W3C Recommendation. URL: https://www.w3.org/TR/shacl/ ED: https://w3c.github.io/data-shapes/shacl/

Installation

Install with pip (using the Python 3 pip installer, pip3):

$ pip3 install pyshacl

Or install into a Python virtualenv (these example command line instructions are for a Linux/Unix based OS):

$ python3 -m virtualenv --python=python3 shaclvenv
$ source ./shaclvenv/bin/activate
$ pip3 install pyshacl

To exit the virtual environment:

$ deactivate

Command Line Use

For command line use:
(these example commandline instructions are for a Linux/Unix based OS)

pyshacl -s /path/to/shapesGraph.ttl -m -i rdfs -a -f human /path/to/dataGraph.ttl

Where

  • -s is an (optional) path to the shapes graph to use
  • -e is an (optional) path to an extra ontology graph to import
  • -i is the pre-inferencing option
  • -f is the ValidationReport output format (human = human-readable validation report)
  • -m enables the Meta-SHACL feature
  • -a enables SHACL Advanced Features

System exit codes are:
0 = DataGraph is Conformant
1 = DataGraph is Non-Conformant
2 = The validator encountered a RuntimeError (check stderr output for details)
3 = Not-Implemented; The validator encountered a SHACL feature that is not yet implemented.
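
The exit codes above can also be interpreted from a script. The following is a minimal sketch, assuming the pyshacl executable is on the PATH; the helper names build_command, describe_exit_code and run_pyshacl are hypothetical and not part of pySHACL.

```python
import subprocess

# Meanings of the pyshacl exit codes, as listed above.
EXIT_MEANINGS = {
    0: "DataGraph is Conformant",
    1: "DataGraph is Non-Conformant",
    2: "RuntimeError (check stderr output for details)",
    3: "Not-Implemented SHACL feature",
}


def build_command(data_path, shapes_path=None, inference="none", output_format="human"):
    """Assemble the argument list for the pyshacl command line tool."""
    cmd = ["pyshacl"]
    if shapes_path is not None:
        cmd += ["-s", shapes_path]
    cmd += ["-i", inference, "-f", output_format, data_path]
    return cmd


def describe_exit_code(code):
    """Translate an exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, "Unknown exit code {}".format(code))


def run_pyshacl(data_path, **options):
    """Run pyshacl (assumed to be on PATH); return (exit code, meaning, stdout)."""
    proc = subprocess.run(
        build_command(data_path, **options),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, describe_exit_code(proc.returncode), proc.stdout
```

For example, run_pyshacl("/path/to/dataGraph.ttl", shapes_path="/path/to/shapesGraph.ttl", inference="rdfs") runs the validator with the -s, -i and -f options shown earlier.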

Full CLI Usage options:

usage: pyshacl [-h] [-s [SHACL]] [-e [ONT]] [-i {none,rdfs,owlrl,both}] [-m]
               [--imports] [--abort] [-a] [-d] [-f {human,turtle,xml,json-ld,nt,n3}]
               [-df {auto,turtle,xml,json-ld,nt,n3}]
               [-sf {auto,turtle,xml,json-ld,nt,n3}]
               [-ef {auto,turtle,xml,json-ld,nt,n3}] [-o [OUTPUT]]
               DataGraph

Run the pySHACL validator from the command line.

positional arguments:
  DataGraph             The file containing the Target Data Graph.

optional arguments:
  -h, --help            show this help message and exit
  -s [SHACL], --shacl [SHACL]
                        A file containing the SHACL Shapes Graph.
  -e [ONT], --ont-graph [ONT]
                        A file path or URL to a document containing extra
                        ontological information to mix into the data graph.
  -i {none,rdfs,owlrl,both}, --inference {none,rdfs,owlrl,both}
                        Choose a type of inferencing to run against the Data
                        Graph before validating.
  -m, --metashacl       Validate the SHACL Shapes graph against the shacl-
                        shacl Shapes Graph before validating the Data
                        Graph.
  --imports             Allow import of sub-graphs defined in statements with
                        owl:imports.
  -a, --advanced        Enable support for SHACL Advanced Features.
  --abort               Abort on first error.
  -d, --debug           Output additional runtime messages, including violations that didn't 
                        lead to non-conformance.
  -f {human,turtle,xml,json-ld,nt,n3}, --format {human,turtle,xml,json-ld,nt,n3}
                        Choose an output format. Default is "human".
  -df {auto,turtle,xml,json-ld,nt,n3}, --data-file-format {auto,turtle,xml,json-ld,nt,n3}
                        Explicitly state the RDF File format of the input
                        DataGraph file. Default="auto".
  -sf {auto,turtle,xml,json-ld,nt,n3}, --shacl-file-format {auto,turtle,xml,json-ld,nt,n3}
                        Explicitly state the RDF File format of the input
                        SHACL file. Default="auto".
  -ef {auto,turtle,xml,json-ld,nt,n3}, --ont-file-format {auto,turtle,xml,json-ld,nt,n3}
                        Explicitly state the RDF File format of the extra
                        ontology file. Default="auto".
  -o [OUTPUT], --output [OUTPUT]
                        Send output to a file (defaults to stdout).

Python Module Use

For basic use of this module, you can just call the validate function of the pyshacl module like this:

from pyshacl import validate
r = validate(data_graph, shacl_graph=sg, ont_graph=og, inference='rdfs', abort_on_error=False, meta_shacl=False, debug=False)
conforms, results_graph, results_text = r

Where:

  • data_graph is an rdflib Graph object or file path of the graph to be validated
  • shacl_graph is an rdflib Graph object or file path or Web URL of the graph containing the SHACL shapes to validate with, or None if the SHACL shapes are included in the data_graph.
  • ont_graph is an rdflib Graph object or file path or Web URL of a graph containing extra ontological information, or None if not required.
  • inference is a Python string value to indicate whether or not to perform OWL inferencing expansion of the data_graph before validation. Options are 'rdfs', 'owlrl', 'both', or 'none'. The default is 'none'.
  • abort_on_error (optional) a Python bool value to indicate whether or not the program should abort after encountering a validation error or to continue. Default is to continue.
  • meta_shacl (optional) a Python bool value to indicate whether or not the program should enable the Meta-SHACL feature. Default is False.
  • debug (optional) a Python bool value to indicate whether or not the program should emit debugging output text, including violations that didn't lead to non-conformance overall. When debug is True, do not judge conformance by the absence of violation messages. Default is False.

Some other optional keyword arguments are available on the validate function:

  • advanced: Enable SHACL Advanced Features
  • data_graph_format: Override the format detection for the given data graph source file.
  • shacl_graph_format: Override the format detection for the given shacl graph source file.
  • ont_graph_format: Override the format detection for the given extra ontology graph source file.
  • do_owl_imports: Enable the import of subgraphs using owl:imports for the shapes graph and the ontology graph. Note that you explicitly cannot use this on the target data graph.
  • serialize_report_graph: Convert the report results_graph into a serialised representation (for example, 'turtle')
  • check_dash_result: Check the validation result against the given expected DASH test suite result.
  • check_sht_result: Check the validation result against the given expected SHT test suite result.

Return value:

  • a three-component tuple containing:
    • conforms a bool, indicating whether or not the data_graph conforms to the shacl_graph
    • results_graph an rdflib Graph object built according to the SHACL specification's Validation Report structure
    • results_text python string representing a verbose textual representation of the Validation Report
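
As an illustration of the call and its return value, here is a minimal sketch that validates two small Turtle documents by passing file paths to validate. The ex: shapes and data, and the helper name validate_turtle, are hypothetical; only validate and the keyword arguments listed above come from pySHACL.

```python
import os
import tempfile

# Hypothetical shapes graph: every ex:Person needs at least one ex:name string.
SHAPES_TTL = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/ns#> .

ex:PersonShape
    a sh:NodeShape ;
    sh:targetClass ex:Person ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] .
"""

# Hypothetical data graph: ex:bob has no ex:name.
DATA_TTL = """
@prefix ex: <http://example.org/ns#> .

ex:alice a ex:Person ; ex:name "Alice" .
ex:bob a ex:Person .
"""


def validate_turtle(data_ttl, shapes_ttl, inference="none"):
    """Write both Turtle documents to temporary files and validate them."""
    # Imported here so this module still loads where pySHACL is not installed.
    from pyshacl import validate

    with tempfile.TemporaryDirectory() as tmp:
        data_path = os.path.join(tmp, "data.ttl")
        shapes_path = os.path.join(tmp, "shapes.ttl")
        with open(data_path, "w") as f:
            f.write(data_ttl)
        with open(shapes_path, "w") as f:
            f.write(shapes_ttl)
        # Unpack the three-component tuple described above.
        conforms, results_graph, results_text = validate(
            data_path,
            shacl_graph=shapes_path,
            data_graph_format="turtle",
            shacl_graph_format="turtle",
            inference=inference,
        )
    return conforms, results_text
```

Calling validate_turtle(DATA_TTL, SHAPES_TTL) should report that the data graph does not conform, since ex:bob lacks the required ex:name; the returned text is the textual Validation Report.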

Errors

Under certain circumstances pySHACL can produce a Validation Failure. This is a formal error defined by the SHACL specification and is required to be produced as a result of specific conditions within the SHACL graph. If the validator produces a Validation Failure, the results_graph variable returned by the validate() function will be an instance of ValidationFailure. See the message attribute on that instance to get more information about the validation failure.

Other errors the validator can generate:

  • ShapeLoadError: This error is thrown when a SHACL Shape in the SHACL graph is in an invalid state and cannot be loaded into the validation engine.
  • ConstraintLoadError: This error is thrown when a SHACL Constraint Component is in an invalid state and cannot be loaded into the validation engine.
  • ReportableRuntimeError: An error occurred for a different reason, and the reason should be communicated back to the user of the validator.
  • RuntimeError: The validator encountered a situation that caused it to throw an error, but the reason does not concern the user.

Unlike ValidationFailure, these errors are not passed back as a result by the validate() function, but thrown as exceptions by the validation engine and must be caught in a try ... except block. In the case of ShapeLoadError and ConstraintLoadError, see the str() string representation of the exception instance for the error message along with a link to the relevant section in the SHACL spec document.
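
The distinction between a returned ValidationFailure and the raised exceptions could be handled as in the following sketch. It assumes the exception classes named above can be imported from pyshacl.errors, which this document does not state; the helper name safe_validate and its status strings are hypothetical.

```python
def safe_validate(data_graph, **kwargs):
    """Run validate() and return a (status, detail) pair instead of raising."""
    from pyshacl import validate
    # Assumption: these classes are importable from pyshacl.errors.
    from pyshacl.errors import (
        ConstraintLoadError,
        ReportableRuntimeError,
        ShapeLoadError,
        ValidationFailure,
    )

    try:
        conforms, results_graph, results_text = validate(data_graph, **kwargs)
    except (ShapeLoadError, ConstraintLoadError) as err:
        # str() carries the message and a link to the relevant spec section.
        return "load-error", str(err)
    except ReportableRuntimeError as err:
        return "reportable-error", str(err)
    except RuntimeError as err:
        return "runtime-error", str(err)

    # A Validation Failure is returned in place of the results graph, not raised.
    if isinstance(results_graph, ValidationFailure):
        return "validation-failure", results_graph.message
    return ("conforms" if conforms else "non-conformant"), results_text
```

The more specific exception types are caught before the plain RuntimeError so that a load error is never reported as a generic runtime error.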

Compatibility

PySHACL is a Python3 library. For best compatibility use Python v3.5 or greater. This library does not work on Python v2.7.x or below.

Features

A features matrix is kept in the FEATURES file.

Changelog

A comprehensive changelog is kept in the CHANGELOG file.

Benchmarks

This project includes a script to measure the difference in performance of validating the same source graph that has been inferenced using each of the four different inferencing options. Run it on your computer to see how fast the validator operates for you.
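
The idea behind such a comparison can be sketched as follows. This is not the project's benchmark script, only a minimal illustration that times validate once per inferencing option; the helper name time_inference_options and the repeats parameter are hypothetical. A file path is passed so that each run loads the graph afresh.

```python
import time

# The four inferencing options accepted by validate().
INFERENCE_OPTIONS = ("none", "rdfs", "owlrl", "both")


def time_inference_options(data_path, shacl_path=None, repeats=3):
    """Return the best wall-clock time in seconds for each inferencing option."""
    from pyshacl import validate

    timings = {}
    for option in INFERENCE_OPTIONS:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            validate(data_path, shacl_graph=shacl_path, inference=option)
            best = min(best, time.perf_counter() - start)
        timings[option] = best
    return timings
```

Taking the minimum over a few repeats reduces noise from other processes; the absolute numbers depend on the machine and on the size of the graphs.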

License

This repository is licensed under Apache License, Version 2.0. See the LICENSE deed for details.

Contributors

See the CONTRIBUTORS file.

Contacts

Project Lead:
Nicholas Car
Senior Experimental Scientist
CSIRO Land & Water, Environmental Informatics Group
Brisbane, Qld, Australia
nicholas.car@csiro.au
http://orcid.org/0000-0002-8742-7730

Lead Developer:
Ashley Sommer
Informatics Software Engineer
CSIRO Land & Water, Environmental Informatics Group
Brisbane, Qld, Australia
Ashley.Sommer@csiro.au
https://orcid.org/0000-0003-0590-0131


