Ordered Turtle Serializer for rdflib

License: MIT

An extension to the rdflib Turtle serializer that adds order (at the price of speed). Useful when you need to generate diffs between Turtle files, or just to make it easier for human beings to inspect the files.

$ pip install otsrdflib

Usage:

from rdflib import Graph
from otsrdflib import OrderedTurtleSerializer

my_graph = Graph()
serializer = OrderedTurtleSerializer(my_graph)

# Write the ordered Turtle to a file opened in binary mode
with open('out.ttl', 'wb') as out:
    serializer.serialize(out)

Class order

By default, classes are ordered alphabetically by their URIs.

A custom order can be imposed by adding classes to the class_order attribute. For a SKOS vocabulary, for instance, you might want to sort the concept scheme first, followed by the other elements of the vocabulary:

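from rdflib import Namespace
from rdflib.namespace import SKOS

ISOTHES = Namespace('http://purl.org/iso25964/skos-thes#')

serializer.class_order = [
    SKOS.ConceptScheme,
    SKOS.Concept,
    ISOTHES.ThesaurusArray,
]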

Any class not included in the class_order list will be sorted alphabetically at the end, after the classes included in the list.
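The resulting order can be illustrated without rdflib. The following is a minimal sketch of the rule described above, not the library's implementation: order_classes is a hypothetical helper, and the class URIs are plain strings chosen for illustration.

```python
def order_classes(classes, class_order):
    """Return the classes named in class_order first (in list order),
    followed by the remaining classes sorted alphabetically."""
    present = set(classes)
    listed = [c for c in class_order if c in present]
    rest = sorted(c for c in present if c not in class_order)
    return listed + rest


SKOS = 'http://www.w3.org/2004/02/skos/core#'

# Classes found in a hypothetical graph, in no particular order
found = [
    SKOS + 'Collection',
    SKOS + 'Concept',
    SKOS + 'ConceptScheme',
    'http://purl.org/dc/dcmitype/Dataset',
]
preferred = [SKOS + 'ConceptScheme', SKOS + 'Concept']

ordered = order_classes(found, preferred)
for uri in ordered:
    print(uri)
```

The two listed classes come out first, in the order given, and the two unlisted classes follow in alphabetical order of their URIs.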

Instance order

By default, instances of a class are ordered alphabetically by their URIs.

A custom order can be imposed by defining functions that generate sort keys from the URIs. For instance, you could define a function that extracts the trailing number of a URI, so that instances are sorted numerically:

serializer.sorters = [
    ('.*?/[^0-9]*([0-9.]+)$', lambda x: float(x[0])),
]

The first element of the tuple ('.*?/[^0-9]*([0-9.]+)$') is the regular expression matched against the URIs, and the second element (lambda x: float(x[0])) is the function that generates the sort key. Here it returns the first captured group as a float.

The patterns in sorters are tried against instances of any class. You can also define patterns that are only tried against instances of a specific class. Say you only want to sort instances of SKOS.Concept this way:
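The effect of such a sorter can be tried out with Python's re module alone. This is a sketch under an assumption: that the pattern is applied to the URI string and the captured groups are handed to the key function as a tuple, which is what x[0] in the lambda suggests. The sort_key helper and the example.org URIs are hypothetical.

```python
import re

pattern = re.compile(r'.*?/[^0-9]*([0-9.]+)$')
key_func = lambda x: float(x[0])


def sort_key(uri):
    """Apply the pattern to the URI and turn the captured groups into a key."""
    match = pattern.match(uri)
    return key_func(match.groups())


uris = [
    'http://example.org/c10',
    'http://example.org/c2',
    'http://example.org/c1.5',
]

alphabetical = sorted(uris)           # plain string comparison
numeric = sorted(uris, key=sort_key)  # comparison by trailing number

print(alphabetical)
print(numeric)
```

Alphabetically, c10 sorts before c2 because the strings are compared character by character; with the numeric key, c2 comes before c10.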

from rdflib.namespace import SKOS

serializer.sorters_by_class = {
    SKOS.Concept: [
        ('.*?/[^0-9]*([0-9.]+)$', lambda x: float(x[0])),
    ]
}

For a slightly more complicated example, let’s look at Dewey. Classes in the main schedules are described by URIs like http://dewey.info/class/001.433/e23/, and we will use the class number (001.433) for sorting. But there are also table classes like http://dewey.info/class/1--0901/e23/, which we want to sort at the end, after the main schedules. To achieve this, we define two sorters, one that matches the table classes and one that matches the main schedule classes:

serializer.sorters = [
    (r'/([0-9A-Z\-]+)\-\-([0-9.\-;:]+)/e', lambda x: 'T{}--{}'.format(x[0], x[1])),  # table numbers
    (r'/([0-9.\-;:]+)/e', lambda x: 'A' + x[0]),  # main schedule numbers
]

By prefixing the table numbers with ‘T’ and the main schedule numbers with ‘A’, we ensure that the table numbers are sorted after the main schedule numbers, since ‘A’ precedes ‘T’ in string comparison.
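The keys these two sorters produce can be checked with plain Python. The sketch below rests on assumptions about how the sorters are applied: that each pattern is searched for anywhere in the URI, that the first matching sorter wins, and that a URI matching no sorter is sorted by the URI itself. The sort_key helper is hypothetical and is not part of otsrdflib.

```python
import re

sorters = [
    (r'/([0-9A-Z\-]+)\-\-([0-9.\-;:]+)/e', lambda x: 'T{}--{}'.format(x[0], x[1])),  # table numbers
    (r'/([0-9.\-;:]+)/e', lambda x: 'A' + x[0]),  # main schedule numbers
]


def sort_key(uri):
    """Return the key from the first sorter whose pattern is found in the URI."""
    for pattern, key_func in sorters:
        match = re.search(pattern, uri)
        if match:
            return key_func(match.groups())
    return uri  # assumption: fall back to the URI itself


uris = [
    'http://dewey.info/class/1--0901/e23/',
    'http://dewey.info/class/001.433/e23/',
    'http://dewey.info/class/004/e23/',
]

keys = [sort_key(u) for u in uris]
ordered = sorted(uris, key=sort_key)

print(keys)
print(ordered)
```

In this sketch the main schedule pattern would also match the table URI, because its character class includes the hyphen, so the table sorter has to be listed first.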

Changes in version 0.5

  • The topClasses attribute was renamed to class_order to better reflect its content and comply with PEP8. It is now empty by default, since the previous default list was rather arbitrary.

  • A sorters_by_class attribute was added to allow sorters to be defined per class.
