ArangoRDF

Convert RDF Graphs to ArangoDB, and vice-versa.

About RDF

RDF is a standard model for data interchange on the Web. RDF has features that facilitate data merging even if the underlying schemas differ, and it specifically supports the evolution of schemas over time without requiring all the data consumers to be changed.

RDF extends the linking structure of the Web to use URIs to name the relationship between things as well as the two ends of the link (this is usually referred to as a "triple"). Using this simple model, it allows structured and semi-structured data to be mixed, exposed, and shared across different applications.

This linking structure forms a directed, labeled graph, where the edges represent the named link between two resources, represented by the graph nodes. This graph view is the easiest possible mental model for RDF and is often used in easy-to-understand visual explanations.
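As a rough illustration of the triple model described above, the sketch below uses only the Python standard library (not rdflib or ArangoRDF) to treat each triple as a labeled, directed edge. The `ex:` identifiers and the `outgoing` helper are made up for the example.

```python
from collections import defaultdict

# Each triple is (subject, predicate, object); the predicate names the link.
# All identifiers below are hypothetical example data.
triples = [
    ("ex:TheBeatles", "ex:member", "ex:JohnLennon"),
    ("ex:TheBeatles", "ex:member", "ex:PaulMcCartney"),
    ("ex:LoveMeDo", "ex:artist", "ex:TheBeatles"),
]


def outgoing(triples):
    """Group triples into an adjacency map: subject -> [(predicate, object)]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)


graph = outgoing(triples)
for subject, edges in graph.items():
    for predicate, obj in edges:
        print(f"{subject} --{predicate}--> {obj}")
```

Each printed line corresponds to one edge of the directed, labeled graph: the subject and object are the nodes, and the predicate is the edge label.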

Resources to get started:

Installation

Latest Release

pip install arango-rdf

Current State

pip install git+https://github.com/ArangoDB-Community/ArangoRDF

Quickstart

from rdflib import Graph
from arango import ArangoClient
from arango_rdf import ArangoRDF

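# With no arguments, python-arango connects to http://127.0.0.1:8529 and opens
# the "_system" database as user "root" with an empty password.
db = ArangoClient().db()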

adbrdf = ArangoRDF(db)

def beatles():
    g = Graph()
    g.parse("https://raw.githubusercontent.com/ArangoDB-Community/ArangoRDF/main/tests/data/rdf/beatles.ttl", format="ttl")
    return g

RDF to ArangoDB

Note: The RDF-to-ArangoDB functionality is implemented using concepts described in the paper Transforming RDF-star to Property Graphs: A Preliminary Analysis of Transformation Approaches. Accordingly, two transformation approaches are offered:

  1. RDF-Topology Preserving Transformation (RPT)
  2. Property Graph Transformation (PGT)

# 1. RDF-Topology Preserving Transformation (RPT)
adbrdf.rdf_to_arangodb_by_rpt(name="BeatlesRPT", rdf_graph=beatles(), overwrite_graph=True)

# 2. Property Graph Transformation (PGT) 
adbrdf.rdf_to_arangodb_by_pgt(name="BeatlesPGT", rdf_graph=beatles(), overwrite_graph=True)

RPT preserves the RDF Graph structure by transforming each RDF Statement into an ArangoDB Edge.

PGT, on the other hand, ensures that Datatype Property Statements are mapped to ArangoDB Document Properties.

@prefix ex: <http://example.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:book ex:publish_date "1963-03-22"^^xsd:date .
ex:book ex:pages "100"^^xsd:integer .
ex:book ex:cover 20 .
ex:book ex:index 55 .

[Images: the ArangoDB graph produced from the statements above under RPT (left) and PGT (right)]
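
To make the difference concrete, here is a minimal, library-free sketch of the two ideas applied to the `ex:book` statements above. It is a conceptual illustration only, not ArangoRDF's implementation: the function names and the output shapes (`_from`/`_to` edge dictionaries, a `documents` map) are assumptions, and literals are simplified to plain Python values.

```python
# Statements from the Turtle snippet, as (subject, predicate, object, is_literal).
statements = [
    ("ex:book", "ex:publish_date", "1963-03-22", True),
    ("ex:book", "ex:pages", 100, True),
    ("ex:book", "ex:cover", 20, True),
    ("ex:book", "ex:index", 55, True),
]


def rpt_sketch(statements):
    """RPT idea: every statement becomes an edge, literal objects included."""
    return [{"_from": s, "_to": o, "label": p} for s, p, o, _ in statements]


def pgt_sketch(statements):
    """PGT idea: literal statements become properties on the subject document."""
    documents, edges = {}, []
    for s, p, o, is_literal in statements:
        doc = documents.setdefault(s, {})
        if is_literal:
            doc[p] = o  # stored on the document, no edge created
        else:
            edges.append({"_from": s, "_to": o, "label": p})
    return documents, edges


rpt_edges = rpt_sketch(statements)
pgt_documents, pgt_edges = pgt_sketch(statements)
print(len(rpt_edges), "edges under the RPT sketch")
print(len(pgt_edges), "edges under the PGT sketch")
print(pgt_documents)
```

In this sketch all four statements have literal objects, so the RPT version yields four edges while the PGT version yields a single document carrying four properties and no edges.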

We also offer a third transformation approach:

  3. Labeled Property Graph Transformation (LPG)

LPG is useful when you want to combine the benefits of RPT and PGT. It:

  1. Uses one ArangoDB Collection for all RDF Resources
  2. Uses one ArangoDB Collection for all RDF Statements
  3. Stores literal statements as ArangoDB Document Properties

adbrdf.rdf_to_arangodb_by_lpg(name="BeatlesLPG", rdf_graph=beatles(), overwrite_graph=True)

# Apply RDF type statements as ArangoDB Document Attributes
adbrdf.migrate_edges_to_attributes(
    "BeatlesLPG", "Edge", "_type", filter_clause="e._label == 'type'"
)
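
The call above folds matching edges into an attribute on their source documents. As a rough, library-free picture of that idea (not ArangoRDF's implementation), the sketch below scans edges whose `_label` is `'type'` and records the target's label in a `_type` list on the source node. The document shapes, the `fold_edges_into_attribute` helper, and the choice to store the target's `_label` are all assumptions made for illustration.

```python
# Hypothetical LPG-style data: one node collection and one edge collection.
nodes = {
    "Node/1": {"_label": "TheBeatles"},
    "Node/2": {"_label": "Band"},
    "Node/3": {"_label": "JohnLennon"},
}
edges = [
    {"_from": "Node/1", "_to": "Node/2", "_label": "type"},
    {"_from": "Node/1", "_to": "Node/3", "_label": "member"},
]


def fold_edges_into_attribute(nodes, edges, attribute, keep):
    """Append each kept edge's target label to `attribute` on its source node."""
    for edge in edges:
        if keep(edge):
            target_label = nodes[edge["_to"]]["_label"]
            nodes[edge["_from"]].setdefault(attribute, []).append(target_label)
    return nodes


# The `keep` predicate plays the role of the filter clause in this sketch.
fold_edges_into_attribute(nodes, edges, "_type", keep=lambda e: e["_label"] == "type")
print(nodes["Node/1"])
```

Only the `type` edge is folded in; the `member` edge is left untouched, so nodes without a matching edge gain no `_type` attribute.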

ArangoDB to RDF

# Assumption: the "BeatlesPGT" graph created in the previous section is already loaded in ArangoDB

# 1. Graph to RDF
rdf_graph = adbrdf.arangodb_graph_to_rdf("BeatlesPGT", rdf_graph=Graph())

# 2. Collections to RDF
rdf_graph_2 = adbrdf.arangodb_collections_to_rdf(
    "BeatlesPGT",
    rdf_graph=Graph(),
    v_cols={"Album", "Band"},
    e_cols={"artist"},
)

# 3. Metagraph to RDF
rdf_graph_3 = adbrdf.arangodb_to_rdf(
    name="BeatlesPGT",
    rdf_graph=Graph(),
    metagraph={
        "vertexCollections": {
            "Album": {"name", "date"},
            "Band": {"name"}
        },
        "edgeCollections": {
            "artist": {}
        },
    },
)
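
The metagraph above names the collections to export and, for each vertex collection, the attributes to carry over. The following library-free sketch shows that selection idea on in-memory documents. The `select_by_metagraph` helper and the sample documents are hypothetical, and the real conversion in ArangoRDF does considerably more (for example, building RDF terms).

```python
metagraph = {
    "vertexCollections": {"Album": {"name", "date"}, "Band": {"name"}},
    "edgeCollections": {"artist": set()},
}

# Hypothetical documents, keyed by collection name.
collections = {
    "Album": [{"_key": "a1", "name": "Example Album", "date": "2000-01-01", "tracks": 12}],
    "Band": [{"_key": "b1", "name": "Example Band", "origin": "Example City"}],
    "Genre": [{"_key": "g1", "name": "Example Genre"}],
}


def select_by_metagraph(collections, metagraph):
    """Keep only the listed vertex collections and, within them, the listed attributes."""
    selected = {}
    for col, attributes in metagraph["vertexCollections"].items():
        selected[col] = [
            {k: v for k, v in doc.items() if k == "_key" or k in attributes}
            for doc in collections.get(col, [])
        ]
    return selected


selected = select_by_metagraph(collections, metagraph)
print(selected)
```

In this sketch the `Genre` collection is dropped entirely because the metagraph does not mention it, and unlisted attributes such as `tracks` and `origin` are left out of the selected documents.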

Development & Testing

  1. git clone https://github.com/ArangoDB-Community/ArangoRDF
  2. cd ArangoRDF
  3. (create virtual environment of choice)
  4. pip install -e ".[dev]"
  5. (create an ArangoDB instance with method of choice)
  6. pytest --url <> --dbName <> --username <> --password <>

Note: Any pytest parameter can be omitted if your ArangoDB instance uses the corresponding default value:

def pytest_addoption(parser):
    parser.addoption("--url", action="store", default="http://localhost:8529")
    parser.addoption("--dbName", action="store", default="_system")
    parser.addoption("--username", action="store", default="root")
    parser.addoption("--password", action="store", default="")
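
pytest's `parser.addoption` accepts argparse-style arguments, so the defaulting behaviour can be tried out with the standard library alone. The sketch below uses `argparse` as a stand-in (it is not how the test suite itself reads its options) to show that omitted flags fall back to the defaults listed above. The password value is a placeholder.

```python
import argparse

# Same option names and defaults as in pytest_addoption above.
parser = argparse.ArgumentParser()
parser.add_argument("--url", action="store", default="http://localhost:8529")
parser.add_argument("--dbName", action="store", default="_system")
parser.add_argument("--username", action="store", default="root")
parser.add_argument("--password", action="store", default="")

# Only --password is supplied; the other three options use their defaults.
options = parser.parse_args(["--password", "example-password"])
print(options.url, options.dbName, options.username)
```

The equivalent pytest invocation would pass only `--password` on the command line and leave the remaining three flags out.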
