Implementation of a subset of R2RML
Tiny RML
The package tinyrml is an implementation of a subset of RML/R2RML with some helpful extended features. It is intended to be used as a Python package/library, and accepts Python iterables (of dicts) as input. It has the following limitations:
- Mappings cannot specify their sources (tables or SQL queries). Data sources are assigned externally when data is mapped.
- None of the join-related features are supported. Only a single data source can be mapped at a time.
The package supports the following extensions to R2RML (note that a special namespace rre: is reserved for extensions):
- A dict key whose value is a Python list is expanded as multiple values/rows.
- Object maps accept the property rre:expandAsList; if true, the value (which is assumed to be a string) is split (using re.split) with commas and semicolons acting as separators, and expanded as multiple values/rows. This makes it possible to (say) have a comma-separated list in your CSV file, read the file using csv.DictReader, and expand the list as separate values. Splitting and expansion happen only if rr:template has a value in the object map.
- Term maps accept the property rre:expression, the value of which is a string containing a Python expression. During the mapping process, this expression is evaluated with dict keys ("column names") as variables in the expression.
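As a rough illustration of the first two extensions, the sketch below shows how a list value, or a comma/semicolon-separated string, could be expanded into several rows. It is not Tiny RML's own code: the names SEPARATORS, expand_value and expand_row are hypothetical, and trimming whitespace around the separators is an assumption.

```python
import re

# Assumed separator pattern: a comma or semicolon, with any surrounding
# whitespace swallowed. Tiny RML's actual pattern may differ.
SEPARATORS = re.compile(r"\s*[,;]\s*")


def expand_value(value, split_strings=False):
    """Return the list of values that a single field expands to.

    Lists are always expanded. Strings are split only when split_strings
    is true, mirroring the role of rre:expandAsList.
    """
    if isinstance(value, list):
        return list(value)
    if split_strings and isinstance(value, str):
        return [part for part in SEPARATORS.split(value.strip()) if part]
    return [value]


def expand_row(row, key, split_strings=False):
    """Yield one copy of the row for each expanded value of the given key."""
    for item in expand_value(row[key], split_strings=split_strings):
        yield {**row, key: item}


row = {"id": "1", "tags": "red, green;blue"}
expanded = list(expand_row(row, "tags", split_strings=True))
```

Here expanded holds three rows that share the same id and carry one tag each.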
Tiny RML was originally part of rdfhelpers, but is now split off as its own project. It has no dependency on rdfhelpers.
Installation
Tiny RML can be installed from PyPI:
pip install tinyrml
Usage
Tiny RML exposes the class Mapper which is the basic implementation of the mapping functionality. Instances of Mapper represent individual mappings (i.e., specific mapping definitions). The class constructor takes the following parameters:
- mapping: a graph (an rdflib.Graph) containing the mapping, or a path to a file which, when parsed, yields the mapping graph. This is a required (positional) parameter; the rest are optional.
- triples_map_uri=, when provided (as a URIRef), identifies the actual triples map to be used. This is useful when the mapping graph contains several mappings.
- ignore_field_keys= is a set of names of keys/fields that are ignored when determining the likely candidate for a key in a template. It defaults to an empty set.
- empty_string_is_none=, when True (the default), makes the mapper treat empty strings as missing values.
- allow_expressions=, when True (the default), lets the mapper use Python expressions embedded in the mapping graph.
- global_bindings=, when provided, is passed to the eval() function (as the parameter globals=; see Python documentation) when embedded Python expressions are evaluated. If not provided, "global globals" (default global bindings) are used.
- allow_object_map_classes=, when True (the default), lets mappings specify rr:class properties for object maps also (the R2RML specification only allows those for subject maps).
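To make the role of global_bindings= more concrete, here is a minimal sketch of evaluating an expression string with a row's keys visible as variables. It uses only the built-in eval(); the helper name evaluate_expression is hypothetical, and passing the row as the locals mapping is an assumption about how the dict keys become variables, not a description of Tiny RML's internals.

```python
def evaluate_expression(expression, row, global_bindings=None):
    """Evaluate a Python expression with the row's keys usable as variables.

    global_bindings, if given, is used as the globals of eval(); otherwise
    the globals of this module are used.
    """
    globals_ = globals() if global_bindings is None else global_bindings
    # The row is passed as the locals mapping (an assumption of this sketch),
    # so a key such as "last" can be referred to by name in the expression.
    return eval(expression, globals_, row)


row = {"first": "Ada", "last": "Lovelace"}

# Keys of the row act as variables.
full_name = evaluate_expression("first + ' ' + last", row)

# Extra names are made available through the global bindings.
shouted = evaluate_expression("shout(last)", row, {"shout": str.upper})
```

Because eval() runs arbitrary code, expressions should only be evaluated for mapping graphs from a trusted source.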
The method Mapper.process(self, rows, result_graph=) invokes a mapper. The parameter rows is an iterable of dicts used as the "rows" to be mapped; dictionary keys take the role of column names. If provided, result_graph= is a graph where results are added; otherwise a new graph is created. Regardless, the result graph is returned.
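A typical flow might look like the sketch below: rows are read with csv.DictReader and handed to a mapper. The import path from tinyrml import Mapper is an assumption based on the package and class names, and the helper names read_rows and map_rows are hypothetical. The import is done inside the function so that the row-reading part runs even where Tiny RML is not installed.

```python
import csv
import io

# Sample data; in practice this would come from a CSV file.
CSV_TEXT = "id,name\n1,Ada\n2,Grace\n"


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def map_rows(mapping, rows, result_graph=None):
    """Map rows to RDF using a mapping graph or a path to a mapping file."""
    # Assumed import path; adjust if the package layout differs.
    from tinyrml import Mapper

    mapper = Mapper(mapping)
    if result_graph is None:
        # A new graph is created and returned.
        return mapper.process(rows)
    # Results are added to the supplied graph, which is also returned.
    return mapper.process(rows, result_graph=result_graph)


rows = read_rows(CSV_TEXT)
```

With Tiny RML installed and a mapping file available (the file name mapping.ttl is only a placeholder), map_rows("mapping.ttl", rows) would return the resulting graph.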
The package exposes RR and RRE as the namespaces for R2RML and the Tiny RML extensions, respectively. By convention, we use the prefixes rr: and rre: for these.