Python ShEx interpreter

This repository was originally developed by Harold Solbrig and was kindly contributed to the LinkML organization because of his retirement. All credit for the original development of this repository goes to him.

Special note

Development was resumed after a long dormant period, so some tests are currently not passing. The reasons are not always immediately clear: in some cases the code was not updated when dependencies changed their versions, and in others there are genuine bugs that have not been fixed. The current release was made in order to support Python 3.14 in LinkML. If you encounter any bugs, please get in touch by opening an issue.

Python implementation of ShEx 2.0

https://mybinder.org/v2/gh/hsolbrig/pyshex/master

This package is a reasonably literal implementation of the Shape Expressions Language 2.0. It can parse and "execute" ShExC and ShExJ source.

Revisions

  • 0.2.dev3 -- added SchemaEvaluator and other tweaks. There are still some unit tests that fail -- beware
  • 0.3.0 -- Fix several issues. Still does not pass all unit tests -- see test_manifest.py for details
  • 0.4.0 -- Added sparql_slurper capabilities.
  • 0.4.1 -- Resolves several issues with reactome and disease test cases
  • 0.4.2 -- Fix issues #13 (missing start) and #14 (Inconsistent shape causes loop)
  • 0.4.3 -- Fix issues #16 and #15 and some refactoring
  • 0.5.0 -- First cut at returning fail reasons... some work still needed
  • 0.5.1 -- Update shexc parser to include multi-line comments and bug fixes
  • 0.5.2 -- Issue with installer - missed the parse_tree package
  • 0.5.3 -- make sparql_slurper a dependency
  • 0.5.4 -- Fixed long recursion issue with blood pressure example
  • 0.5.5 -- Fixed zero cardinality issue (#20)
  • 0.5.6 -- Added CLI entry point and cleaned up error reporting
  • 0.5.7 -- Throw an error on an invalid focus node (#23)
  • 0.5.9 -- Candidate for ShEx 2.1
  • 0.5.10 -- Fixed evaluator to load files, strings, etc. as ShEx
  • 0.5.11 -- Added Collections Flattening graph option to evaluator.
  • 0.5.12 -- Added -A option, catch missing start node early
  • 0.6.0 -- Added the -ut and -sp options to allow start nodes to be specified by rdf:type or an arbitrary predicate
  • 0.6.1 -- Added the ability to supply a SPARQL Query (-sq option)
  • 0.7.0 -- Fixes for issues 28, 29 and 30
  • 0.7.1 -- Fix issue 26
  • 0.7.2 -- Upgrade error reporting
  • 0.7.3 -- Report using namespaces, enhance PrefixLib to inject into a module
  • 0.7.4 -- Added '-ps', '-pr', '-gn', '-pb' options to CLI
  • 0.7.5 -- Fix CLOSED issue in evaluate call (issue 41)
  • 0.7.6 -- bump version due to build error

Installation

pip install PyShEx

Note: If you need to escape single quotes in RDF literals, you will need to install the bleeding edge of rdflib:

pip uninstall rdflib
pip install git+https://github.com/rdflib/rdflib

Unfortunately, however, rdflib-jsonld is NOT compatible with the bleeding edge rdflib, so you can't use a json-ld parser in this situation.

shexeval CLI

> shexeval -h
usage: shexeval [-h] [-f FORMAT] [-s START] [-ut] [-sp STARTPREDICATE]
                [-fn FOCUS] [-A] [-d] [-ss] [-cf] [-sq SPARQL] [-se]
                [--stopafter STOPAFTER] [-ps] [-pr] [-gn GRAPHNAME] [-pb]
                rdf shex

positional arguments:
  rdf                   Input RDF file or SPARQL endpoint if slurper or sparql
                        options
  shex                  ShEx specification

optional arguments:
  -h, --help            show this help message and exit
  -f FORMAT, --format FORMAT
                        Input RDF Format
  -s START, --start START
                        Start shape. If absent use ShEx start node.
  -ut, --usetype        Start shape is rdf:type of focus
  -sp STARTPREDICATE, --startpredicate STARTPREDICATE
                        Start shape is object of this predicate
  -fn FOCUS, --focus FOCUS
                        RDF focus node
  -A, --allsubjects     Evaluate all non-bnode subjects in the graph
  -d, --debug           Add debug output
  -ss, --slurper        Use SPARQL slurper graph
  -cf, --flattener      Use RDF Collections flattener graph
  -sq SPARQL, --sparql SPARQL
                        SPARQL query to generate focus nodes
  -se, --stoponerror    Stop on an error
  --stopafter STOPAFTER
                        Stop after N nodes
  -ps, --printsparql    Print SPARQL queries as they are executed
  -pr, --printsparqlresults
                        Print SPARQL query and results
  -gn GRAPHNAME, --graphname GRAPHNAME
                        Specific SPARQL graph to query - use '' for any named
                        graph
  -pb, --persistbnodes  Treat BNodes as persistent in SPARQL endpoint
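
When shexeval is driven from a script, it can help to assemble the argument list programmatically rather than by string concatenation. The following is a minimal sketch that uses only flags shown in the help text above. The helper name `build_shexeval_command`, the file names `data.ttl` and `schema.shex`, and the example.org IRI are assumptions made for illustration; they are not part of PyShEx.

```python
import shlex
from typing import List, Optional


def build_shexeval_command(rdf: str, shex: str, *,
                           rdf_format: Optional[str] = None,
                           start: Optional[str] = None,
                           focus: Optional[str] = None,
                           sparql: Optional[str] = None,
                           graph_name: Optional[str] = None,
                           use_type: bool = False,
                           all_subjects: bool = False,
                           slurper: bool = False,
                           stop_on_error: bool = False) -> List[str]:
    """Assemble an argv list for shexeval (hypothetical helper, not part of PyShEx)."""
    cmd = ["shexeval"]
    if rdf_format is not None:
        cmd += ["-f", rdf_format]      # input RDF format
    if start is not None:
        cmd += ["-s", start]           # start shape
    if use_type:
        cmd.append("-ut")              # start shape is rdf:type of focus
    if focus is not None:
        cmd += ["-fn", focus]          # RDF focus node
    if all_subjects:
        cmd.append("-A")               # evaluate all non-bnode subjects
    if slurper:
        cmd.append("-ss")              # use SPARQL slurper graph
    if sparql is not None:
        cmd += ["-sq", sparql]         # SPARQL query that generates focus nodes
    if stop_on_error:
        cmd.append("-se")              # stop on an error
    if graph_name is not None:
        cmd += ["-gn", graph_name]     # '' selects any named graph
    cmd += [rdf, shex]                 # positional arguments come last
    return cmd


# Illustrative inputs only: these files and this IRI are placeholders.
cmd = build_shexeval_command(
    "data.ttl", "schema.shex",
    rdf_format="turtle",
    focus="http://example.org/alice",
    stop_on_error=True,
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list can be passed to `subprocess.run(cmd)` once PyShEx is installed; passing a list instead of a single string avoids shell-quoting problems with SPARQL queries and IRIs.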

Documentation

See the example Jupyter notebooks for sample uses.

General Layout

The root pyshex package is subdivided into:

The ShEx schema definitions for this package come from ShExJSG

We are trying to keep the Python as close as possible to the (semi-)formal specification. As an example, the statement:

Se is a ShapeAnd and for every shape expression se2 in shapeExprs, satisfies(n, se2, G, m)

is implemented in Python as:

    # ...
    if isinstance(se, ShExJ.ShapeAnd):
        return satisfiesShapeAnd(cntxt, n, se)
    # ...

def satisfiesShapeAnd(cntxt: Context, n: nodeSelector, se: ShExJ.ShapeAnd) -> bool:
    # A ShapeAnd is satisfied when every nested shape expression is satisfied.
    return all(satisfies(cntxt, n, se2) for se2 in se.shapeExprs)
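
The excerpt above depends on PyShEx internals (`Context`, `nodeSelector`, `ShExJ`). To show the same specification-to-code mapping in a form that runs on its own, here is a minimal sketch using stand-in types. `NodeTest`, the simplified `ShapeAnd` and `ShapeOr` classes, and the tuple-based graph are all assumptions made for illustration; they are not the PyShEx classes. The ShapeOr rule ("there is some shape expression se2 in shapeExprs such that satisfies(n, se2, G, m)") comes from the same part of the ShEx specification.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple, Union

Triple = Tuple[str, str, str]
Graph = Set[Triple]


@dataclass
class NodeTest:
    """Stand-in leaf expression: an arbitrary predicate over (node, graph)."""
    test: Callable[[str, Graph], bool]


@dataclass
class ShapeAnd:
    shapeExprs: List["ShapeExpr"] = field(default_factory=list)


@dataclass
class ShapeOr:
    shapeExprs: List["ShapeExpr"] = field(default_factory=list)


ShapeExpr = Union[NodeTest, ShapeAnd, ShapeOr]


def satisfies(graph: Graph, n: str, se: ShapeExpr) -> bool:
    """Dispatch on the expression type, one branch per rule in the spec."""
    if isinstance(se, ShapeAnd):
        return satisfies_shape_and(graph, n, se)
    if isinstance(se, ShapeOr):
        return satisfies_shape_or(graph, n, se)
    if isinstance(se, NodeTest):
        return se.test(n, graph)
    raise TypeError(f"unsupported shape expression: {se!r}")


def satisfies_shape_and(graph: Graph, n: str, se: ShapeAnd) -> bool:
    # "for every shape expression se2 in shapeExprs, satisfies(n, se2, G, m)"
    return all(satisfies(graph, n, se2) for se2 in se.shapeExprs)


def satisfies_shape_or(graph: Graph, n: str, se: ShapeOr) -> bool:
    # "there is some shape expression se2 in shapeExprs such that satisfies(...)"
    return any(satisfies(graph, n, se2) for se2 in se.shapeExprs)


def has_predicate(predicate: str) -> NodeTest:
    """Build a leaf test: the node has at least one triple with this predicate."""
    return NodeTest(lambda n, g: any(s == n and p == predicate for s, p, _ in g))


graph: Graph = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:age", "42"),
}
name_and_age = ShapeAnd([has_predicate("foaf:name"), has_predicate("foaf:age")])
name_and_mbox = ShapeAnd([has_predicate("foaf:name"), has_predicate("foaf:mbox")])
name_or_mbox = ShapeOr([has_predicate("foaf:name"), has_predicate("foaf:mbox")])

print(satisfies(graph, "ex:alice", name_and_age))
print(satisfies(graph, "ex:alice", name_and_mbox))
print(satisfies(graph, "ex:alice", name_or_mbox))
```

Each rule in the specification becomes one small function, so the code can be checked against the specification text almost line by line.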

Dependencies

This package is built using:

Conformance

This implementation passes all of the tests in the master branch of validation/manifest.ttl with the following exceptions:

At the moment, there are 1088 tests, of which:

  • 1007 pass
  • 81 are skipped, for the following reasons:
      1. (52) sht:LexicalBNode, sht:ToldBNode and sht:BNodeShapeLabel test non-blank blank nodes (rdflib does not preserve bnode "identity")
      2. (18) sht:Import uses the ShEx 2.1 IMPORT feature -- not yet implemented (three aren't tagged)
      3. (3) Uses the manifest shapemap feature -- not yet implemented
      4. (2) sht:relativeIRI -- this isn't a real problem, but we haven't taken the time to deal with it in the test harness
      5. (6) rdflib has a parsing error when escaping single quotes (issue submitted, awaiting release)

As mentioned above, at the moment this is as literal an implementation of the specification as was sensible. This means, in particular, that we are less than clever when it comes to partition management.
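
To illustrate why partition management matters: the ShEx specification defines matching of an EachOf triple expression in terms of the existence of a partition of the neighbourhood triples into one block per sub-expression. A literal reading enumerates every assignment of triples to blocks, which grows as k^n for n triples and k sub-expressions. The sketch below shows that naive approach. It is not PyShEx's actual code; `partitions`, `matches_each_of` and the lambda matchers are hypothetical names introduced for illustration.

```python
from itertools import product
from typing import Callable, FrozenSet, Iterator, List, Sequence, Tuple

Triple = Tuple[str, str, str]
Matcher = Callable[[FrozenSet[Triple]], bool]


def partitions(triples: Sequence[Triple], k: int) -> Iterator[Tuple[FrozenSet[Triple], ...]]:
    """Yield every split of `triples` into k ordered, possibly empty, blocks."""
    for assignment in product(range(k), repeat=len(triples)):
        blocks: List[set] = [set() for _ in range(k)]
        for triple, block_index in zip(triples, assignment):
            blocks[block_index].add(triple)
        yield tuple(frozenset(block) for block in blocks)


def matches_each_of(triples: Sequence[Triple], matchers: Sequence[Matcher]) -> bool:
    """True if some partition gives every matcher a block it accepts."""
    return any(
        all(matcher(block) for matcher, block in zip(matchers, blocks))
        for blocks in partitions(triples, len(matchers))
    )


def exactly_one_name(block: FrozenSet[Triple]) -> bool:
    return len(block) == 1 and all(p == "foaf:name" for _, p, _ in block)


def one_or_more_mbox(block: FrozenSet[Triple]) -> bool:
    return len(block) >= 1 and all(p == "foaf:mbox" for _, p, _ in block)


triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:a@example.org"),
    ("ex:alice", "foaf:mbox", "mailto:b@example.org"),
]

candidate_count = sum(1 for _ in partitions(triples, 2))  # 2 ** 3 candidates
print(candidate_count)
print(matches_each_of(triples, [exactly_one_name, one_or_more_mbox]))
```

A smarter implementation would typically group triples by predicate first, so that most assignments never need to be generated.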

Docker

Build

docker build -t pyshex docker

Run

docker run --rm -it pyshex -gn '' -ss -ut -pr -sq 'select distinct ?item where{?item a <http://w3id.org/biolink/vocab/Gene>} LIMIT 1' http://graphdb.dumontierlab.com/repositories/ncats-red-kg https://github.com/biolink/biolink-model/raw/master/shex/biolink-modelnc.shex
