A SHACL validator capable of planning the traversal and execution of the validation of a shape schema to detect violations early.

License: GPL v3

Trav-SHACL

We present Trav-SHACL, a SHACL engine capable of planning the traversal and execution of a shape schema in a way that invalid entities are detected early and needless validations are minimized. Trav-SHACL reorders the shapes in a shape schema for efficient validation and rewrites target and constraint queries for fast detection of invalid entities. The shape schema is validated against an RDF graph accessible via a SPARQL endpoint.
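To make the idea of traversing a shape schema more concrete, the following is a minimal sketch of how a visiting order over a network of shapes could be derived with BFS or DFS. It is an illustration only, not Trav-SHACL's actual planner: the function `traversal_order`, the dictionary-based graph, and the shape names are assumptions made for this example.

```python
from collections import deque


def traversal_order(graph, seed, strategy="DFS"):
    """Return the order in which shapes reachable from `seed` are visited.

    `graph` maps a shape name to the list of shapes it references
    (hypothetical representation, chosen for this example only).
    """
    if strategy not in ("BFS", "DFS"):
        raise ValueError("strategy must be 'BFS' or 'DFS'")

    order, visited = [], set()
    frontier = deque([seed])
    while frontier:
        # BFS takes the oldest entry (queue), DFS the newest (stack).
        shape = frontier.popleft() if strategy == "BFS" else frontier.pop()
        if shape in visited:
            continue
        visited.add(shape)
        order.append(shape)

        neighbours = list(graph.get(shape, []))
        if strategy == "DFS":
            # Reverse so the first-listed neighbour is popped first.
            neighbours.reverse()
        frontier.extend(n for n in neighbours if n not in visited)
    return order


# Hypothetical schema: A references B and C, which both reference D.
example_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
bfs = traversal_order(example_graph, "A", "BFS")
dfs = traversal_order(example_graph, "A", "DFS")
```

With this example graph, BFS visits both shapes referenced by A before moving on to D, while DFS follows the reference from B to D before returning to C.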

How to run Trav-SHACL?

If you are looking for examples or want to reproduce the results reported in our WWW '21 paper, check out the eval-www2021 branch.

Note: The current version of Trav-SHACL does not produce a validation report that complies with the SHACL specification. We will add this feature in the future.

Prerequisites

The following guides assume:

  • Your shape schema is placed in ./shapes and is specified in JSON (see the eval-www2021 branch for an example)
  • There is a SPARQL endpoint running that you can connect to; in this example, it is http://localhost:14000/sparql
    • The endpoint is running in Docker
    • It is connected to the Docker network semantic-web
    • Its name is endpoint1
    • The port 8890 of the Docker container is mapped to port 14000 of the host
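Before starting a validation, it can help to confirm that the endpoint from the prerequisites actually answers queries. The sketch below sends a trivial ASK query over HTTP GET using only the Python standard library. The helper names `build_ask_url` and `endpoint_is_reachable` are made up for this example and are not part of Trav-SHACL; the check also assumes the endpoint returns SPARQL JSON results when asked for them.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# A query every non-empty graph answers with true; an empty graph answers false.
ASK_ANYTHING = "ASK { ?s ?p ?o }"


def build_ask_url(endpoint):
    """Return the GET URL that sends the ASK query to `endpoint`."""
    return endpoint + "?" + urllib.parse.urlencode({"query": ASK_ANYTHING})


def endpoint_is_reachable(endpoint, timeout=5.0):
    """Return True if `endpoint` responds with a SPARQL ASK result."""
    request = urllib.request.Request(
        build_ask_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # ASK results in SPARQL JSON carry a top-level "boolean" member.
            return "boolean" in json.load(response)
    except (urllib.error.URLError, ValueError, OSError):
        # Connection problems, HTTP errors, or a non-JSON body.
        return False


url = build_ask_url("http://localhost:14000/sparql")
```

From the host, `endpoint_is_reachable("http://localhost:14000/sparql")` would be the call to try; from inside the Docker network, the URL would use the container name and port instead.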

Parameters

  • -d schemaDir (required) - path to the directory containing the shape files
  • endpoint (required) - URL of the SPARQL endpoint the shape schema will be validated against
  • graphTraversal (required) - graph traversal algorithm to use, one of [BFS, DFS]
  • outputDir (required) - directory for storing the result files, must end with /
  • --heuristics (required) - heuristics used to determine the seed shape
    • TARGET if shapes with a target definition should be prioritized, otherwise omit
    • prioritize shapes by their in- or outdegree, one of [IN, OUT], or omit
    • prioritize shapes by their number of constraints, one of [BIG, SMALL], or omit
  • --selective (optional) - use more selective queries for constraint queries
  • --outputs (optional) - creates one file each for violated and validated targets, otherwise only statistics and traces will be stored
  • -m (optional) - maximum number of entities in FILTER or VALUES clause of a SPARQL query, default: 256
  • -j / --json (optional) - indicates that the SHACL shape schema is expressed in JSON
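The --heuristics values can be read as a ranking over candidate seed shapes. The sketch below shows one possible reading, assuming the criteria are applied in the order TARGET, then degree, then number of constraints, and that IN and OUT prefer the higher degree. The `ShapeInfo` class, the `pick_seed` function, the tie-break by name, and the sample shapes are hypothetical and do not reflect Trav-SHACL's internal code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeInfo:
    """Hypothetical summary of a shape, used only for this example."""
    name: str
    has_target: bool
    in_degree: int
    out_degree: int
    n_constraints: int


def pick_seed(shapes, target=False, degree=None, size=None):
    """Pick a seed shape according to the given heuristics.

    target: prioritize shapes with a target definition
    degree: "IN", "OUT", or None
    size:   "BIG", "SMALL", or None
    """
    if degree not in (None, "IN", "OUT"):
        raise ValueError("degree must be 'IN', 'OUT', or None")
    if size not in (None, "BIG", "SMALL"):
        raise ValueError("size must be 'BIG', 'SMALL', or None")

    def rank(shape):
        # Smaller tuples win; negated values prefer larger numbers.
        parts = []
        if target:
            parts.append(0 if shape.has_target else 1)
        if degree == "IN":
            parts.append(-shape.in_degree)
        elif degree == "OUT":
            parts.append(-shape.out_degree)
        if size == "BIG":
            parts.append(-shape.n_constraints)
        elif size == "SMALL":
            parts.append(shape.n_constraints)
        parts.append(shape.name)  # deterministic tie-break (assumption)
        return tuple(parts)

    return min(shapes, key=rank)


shapes = [
    ShapeInfo("A", has_target=False, in_degree=3, out_degree=0, n_constraints=5),
    ShapeInfo("B", has_target=True, in_degree=1, out_degree=2, n_constraints=2),
    ShapeInfo("C", has_target=True, in_degree=1, out_degree=2, n_constraints=4),
]
seed = pick_seed(shapes, target=True, degree="IN", size="BIG")
```

Here `seed` is shape C: A is ruled out because it has no target definition, B and C tie on indegree, and C has more constraints.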

Run with Docker

In order to connect to the SPARQL endpoint, it must be accessible from within the Docker container. There shouldn't be anything to configure if you use a public endpoint like DBpedia or Wikidata. However, if you run your own dockerized SPARQL endpoints, make sure that the endpoint and the Trav-SHACL container are connected to the same Docker network; in this example, it is called semantic-web.

# Preparation
docker build -t travshacl .
docker run --name trav-shacl -v $(pwd)/shapes:/shapes -v $(pwd)/results:/results --network=semantic-web -d travshacl

# Run the Validation
docker exec -it trav-shacl bash -c "python3 main.py -d /shapes http://endpoint1:8890/sparql /results/ DFS --heuristics TARGET IN BIG --orderby --selective --outputs --json"

Run with Python3

pip3 install -r requirements.txt
python3 main.py -d ./shapes http://localhost:14000/sparql ./results/ DFS --heuristics TARGET IN BIG --orderby --selective --outputs --json
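When the validation is launched from a script, the command above can be assembled from the documented parameters. The following is a small sketch under that assumption; `build_command` is a hypothetical helper, the argument order mirrors the example command, and `--orderby` is included only because it appears in that command.

```python
def build_command(schema_dir, endpoint, output_dir, traversal, heuristics,
                  orderby=False, selective=False, outputs=False,
                  json_schema=False, max_entities=None):
    """Return the argument list for invoking main.py (hypothetical helper)."""
    if traversal not in ("BFS", "DFS"):
        raise ValueError("traversal must be 'BFS' or 'DFS'")
    if not output_dir.endswith("/"):
        raise ValueError("output_dir must end with '/'")

    # Positional arguments follow the order used in the example command.
    cmd = ["python3", "main.py", "-d", schema_dir,
           endpoint, output_dir, traversal,
           "--heuristics", *heuristics]

    if orderby:
        cmd.append("--orderby")  # taken from the example command as-is
    if selective:
        cmd.append("--selective")
    if outputs:
        cmd.append("--outputs")
    if json_schema:
        cmd.append("--json")
    if max_entities is not None:
        cmd += ["-m", str(max_entities)]
    return cmd


cmd = build_command(
    "./shapes", "http://localhost:14000/sparql", "./results/", "DFS",
    ["TARGET", "IN", "BIG"],
    orderby=True, selective=True, outputs=True, json_schema=True,
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains main.py.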

How to run the Test Suite?

In order to run the test suite, you need to install the production and development dependencies.

pip3 install -r requirements.txt -r requirements-dev.txt

Afterwards, start the Docker container with the test data.

docker-compose -f tests/docker-compose.yml up -d

Finally, you can run the tests by executing the following command.

pytest

Publications

  1. Mónica Figuera, Philipp D. Rohde, Maria-Esther Vidal. Trav-SHACL: Efficiently Validating Networks of SHACL Constraints. In Proceedings of the Web Conference 2021 (WWW '21), April 19-23, 2021, Ljubljana, Slovenia. https://doi.org/10.1145/3442381.3449877
