Linked data class python package


oold-python

Linked data class Python package for object-oriented linked data (OO-LD), based on pydantic. This package aims to implement this functionality independently of the osw-python package. It is a work in progress.

Installation

pip install oold

Objectives

  • lossless transpilation between OO-LD schemas and extended pydantic data classes
  • interpret string IRIs with an oold-range annotation as typed class properties
  • dynamically resolve such IRIs from one or multiple backends (simple in-memory dict, RDF graph, SPARQL endpoint, document store, etc.)
  • serialize class instances to JSON-LD while replacing Python object references with IRIs
  • apply filters / queries to backend requests (SPARQL, GraphQL, ...)

Related Work

Library | Repository | Description
--- | --- | ---
RDFLib | https://github.com/RDFLib/rdflib | Widely used for managing RDF data; lacks built-in schema validation or type safety and requires external reasoning tools. Provides local/remote SPARQL support (used as a backend for oold-python).
SuRF | https://github.com/cosminbasca/surfrdf | ORM-like approach for RDF; dynamically generated class definitions, no static type checking.
Owlready2 | https://github.com/pwin/owlready2 | Provides Python classes aligned with OWL and includes native reasoning (HermiT/Pellet). Limited runtime type validation; no direct remote SPARQL endpoint support.
twa | https://github.com/TheWorldAvatar/baselib/tree/main/python_wrapper | Pydantic-based OGM with built-in schema validation and type safety; strong coupling of RDF properties and type annotations.
COLD | https://github.com/DigiBatt/cold/ | Generates static Python classes from OWL classes to offer RDF generation. No object-to-graph mapping.

See also Bai et al., https://doi.org/10.1039/D5DD00069F

Features

Code Generation

Generate Python data models from OO-LD Schemas (based on datamodel-code-generator):

from oold.generator import Generator
import importlib
import datamodel_code_generator
import oold.model.model as model

schemas = [
    {   # minimal example
        "id": "Foo",
        "title": "Foo",
        "type": "object",
        "properties": {
            "id": {"type": "string"},
        },
    },
]
g = Generator()
g.generate(schemas, main_schema="Foo.json", output_model_type=datamodel_code_generator.DataModelType.PydanticBaseModel)
importlib.reload(model)

# Now you can work with your generated model
f = model.Foo(id="ex:f")
print(f)

This example uses the built-in Generator to create a basic Pydantic model (v1 or v2) from JSON schemas.
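
To give a rough idea of what schema-to-class generation involves, here is a minimal standard-library sketch that turns the Foo schema above into a plain dataclass. It is not how oold or datamodel-code-generator work internally, and it ignores everything beyond flat primitive properties; `schema_to_dataclass` and `TYPE_MAP` are hypothetical names used only for illustration.

```python
from dataclasses import field, make_dataclass
from typing import Any, Optional

# Hypothetical mapping from JSON Schema primitive types to Python types.
TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}


def schema_to_dataclass(schema: dict) -> type:
    """Build a dataclass from a flat JSON schema (illustration only)."""
    required = set(schema.get("required", []))
    req_fields, opt_fields = [], []
    for name, spec in schema.get("properties", {}).items():
        py_type = TYPE_MAP.get(spec.get("type"), Any)
        if name in required:
            req_fields.append((name, py_type))
        else:
            opt_fields.append((name, Optional[py_type], field(default=None)))
    # Fields without defaults must come before fields with defaults.
    return make_dataclass(schema["title"], req_fields + opt_fields)


foo_schema = {
    "id": "Foo",
    "title": "Foo",
    "type": "object",
    "properties": {"id": {"type": "string"}},
}

Foo = schema_to_dataclass(foo_schema)
f = Foo(id="ex:f")
print(f)  # Foo(id='ex:f')
```

The real generator additionally produces pydantic models with validation and the linked-data annotations described in the following sections.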

For more details, see the example code.

Object Graph Mapping

Concept

Illustrative example of how the object-oriented linked data (OO-LD) package provides an abstract knowledge graph (KG) interface. First (line 3), primary schemas (Foo) and their dependencies (Bar, Baz) are loaded from the KG and transformed into Python data classes. foo is instantiated by loading the respective JSON(-LD) document from the KG and using its type relation to find the corresponding schema and data class (line 5). Because bar is not a dependent subobject of foo, it is loaded on demand on first access of the corresponding class attribute (foo.bar in line 7), while id, as a dependent literal, is loaded immediately in the same operation. In line 9, baz is constructed by an existing controller class subclassing Foo, and it is finally stored as a new entity in the KG in line 11.

Represent your domain objects easily and reference them via IRIs or direct object instances. For instance, if you have a Foo model referencing a Bar model:

import oold.model.model as model

# Create a Foo object linked to Bar
f = model.Foo(
    id="ex:f",
    literal="test1",
    b=model.Bar(id="ex:b", prop1="test2"),
    b2=[model.Bar(id="ex:b1", prop1="test3"), model.Bar(id="ex:b2", prop1="test4")],
)

print(f.b.id)          # ex:b
print(f.b2[0].prop1)   # test3

You can also refer to objects by IRI:

# Assign IRI strings directly
f = model.Foo(
    id="ex:f",
    literal="test1",
    b="ex:b",  # automatically resolved to a Bar object
    b2=["ex:b1", "ex:b2"],
)

Thanks to the resolver mechanism, these IRIs turn into fully-fledged objects as soon as you need them.
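
The on-demand behavior can be pictured with a small standard-library sketch. It only illustrates the idea of deferring a backend lookup until first attribute access and then caching the result; it does not use the oold resolver API. `BACKEND`, `LOOKUPS`, and `resolve` are hypothetical names, and the `Foo` and `Bar` classes here are hand-written stand-ins for generated models.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Bar:
    id: str
    prop1: str


# Hypothetical in-memory backend: maps IRIs to already-built objects.
BACKEND: Dict[str, Bar] = {"ex:b": Bar(id="ex:b", prop1="test2")}
LOOKUPS: List[str] = []  # records each backend access to make laziness visible


def resolve(iri: str) -> Bar:
    LOOKUPS.append(iri)
    return BACKEND[iri]


class Foo:
    def __init__(self, id: str, b: str):
        self.id = id
        self._b_iri = b  # only the IRI is stored at construction time
        self._b_obj: Optional[Bar] = None

    @property
    def b(self) -> Bar:
        # Resolve on first access, then reuse the cached object.
        if self._b_obj is None:
            self._b_obj = resolve(self._b_iri)
        return self._b_obj


f = Foo(id="ex:f", b="ex:b")
print(LOOKUPS)    # [] (nothing resolved yet)
print(f.b.prop1)  # test2
print(f.b.id)     # ex:b
print(LOOKUPS)    # ['ex:b'] (resolved once, second access used the cache)
```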

For more details, see the example code.

RDF-Export

Easily convert your objects to RDF (JSON-LD) and integrate with SPARQL queries:

from rdflib import Graph
from typing import List, Optional

# Example: convert Person objects to RDF
# (assumes a Person model with name and knows fields is available in model)
p1 = model.Person(name="Alice")
p2 = model.Person(name="Bob", knows=[p1])

# Export to JSON-LD
print(p2.to_jsonld())

# Load into RDFlib
g = Graph()
g.parse(data=p1.to_jsonld(), format="json-ld")
g.parse(data=p2.to_jsonld(), format="json-ld")

# Perform SPARQL queries
qres = g.query("""
    SELECT ?name
    WHERE {
        ?s <https://schema.org/knows> ?o .
        ?o <https://schema.org/name> ?name .
    }
""")
for row in qres:
    print("Bob knows", row.name)

The extended dataclass notation includes semantic annotations as JSON-LD context, giving you powerful tooling for knowledge graphs, semantic queries, and data interoperability.
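
As a rough illustration of what "semantic annotations as JSON-LD context" means, the following standard-library sketch attaches a context to a hand-written Person class and replaces object references with IRIs on export. The `CONTEXT` dict, the explicit `id` field, and this `to_jsonld` implementation are assumptions made for the example, not the output format or implementation of oold.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Hypothetical context; the property IRIs match the SPARQL query above.
CONTEXT = {
    "name": "https://schema.org/name",
    "knows": {"@id": "https://schema.org/knows", "@type": "@id"},
}


@dataclass
class Person:
    id: str
    name: str
    knows: List["Person"] = field(default_factory=list)

    def to_jsonld(self) -> str:
        doc = {"@context": CONTEXT, "@id": self.id, "name": self.name}
        if self.knows:
            # Object references are replaced by the IRIs of their targets.
            doc["knows"] = [p.id for p in self.knows]
        return json.dumps(doc)


alice = Person(id="ex:alice", name="Alice")
bob = Person(id="ex:bob", name="Bob", knows=[alice])
doc = json.loads(bob.to_jsonld())
print(doc["knows"])  # ['ex:alice']
```

Because the context maps each field name to a property IRI, a JSON-LD processor can expand such a document into triples that the query above would match.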

For more details, see the example code.

BaseController

Base mixin for controllers that extend LinkedBaseModel data classes. Controllers add runtime behavior (connections, archiving, state) without polluting the data model.

from oold.model import BaseController, LinkedBaseModel

class Robot(LinkedBaseModel):
    name: str
    joint_count: int = 6
    connection_url: str = ""

class RobotController(BaseController, Robot):
    _connected: bool = False

    def connect(self):
        self._connected = True
        print(f"Connected to {self.connection_url}")

    def move(self, joint: int, angle: float):
        if not self._connected:
            raise RuntimeError("Not connected")
        print(f"Moving joint {joint} to {angle} deg")

ctrl = RobotController(name="arm-1", connection_url="tcp://192.168.1.10:5000")
ctrl.connect()
ctrl.move(1, 45.0)

# Serialization includes model fields, strips controller state
ctrl.to_json()  # {"name": "arm-1", "joint_count": 6, "connection_url": "tcp://..."}

Key features:

  • Auto-detects the pure data model from MRO - no manual configuration needed
  • Serialization strips controller fields - to_json() / to_jsonld() only include data model fields
  • Type registry exclusion - controllers don't replace their data model in the _types lookup, so backend resolution always returns the pure model class
  • Multi-model support - Controller(ModelA, ModelB) merges type arrays from both models
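
The first two points can be sketched with plain dataclasses. This is only an illustration of walking the MRO to find the data model and filtering serialized fields against it; `ControllerMixin`, `data_model`, and `to_dict` are hypothetical names and do not reflect how BaseController is implemented.

```python
from dataclasses import dataclass, fields


class ControllerMixin:
    """Hypothetical stand-in for a controller base class."""

    @classmethod
    def data_model(cls) -> type:
        # The first class in the MRO that is not itself a controller.
        for klass in cls.__mro__:
            if klass is not object and not issubclass(klass, ControllerMixin):
                return klass
        raise TypeError("no data model found in MRO")

    def to_dict(self) -> dict:
        # Keep only attributes that are declared on the data model.
        allowed = {f.name for f in fields(self.data_model())}
        return {k: v for k, v in vars(self).items() if k in allowed}


@dataclass
class Robot:
    name: str
    joint_count: int = 6


class RobotController(ControllerMixin, Robot):
    def connect(self):
        self.connected = True  # runtime state, not part of the data model


ctrl = RobotController(name="arm-1")
ctrl.connect()
print(RobotController.data_model().__name__)  # Robot
print(ctrl.to_dict())  # {'name': 'arm-1', 'joint_count': 6}
```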

cast()

Convert between model classes, preserving __iris__ references:

target = source.cast(TargetClass, remove_extra=True, none_to_default=True)

# Or construct directly from another model instance:
target = TargetClass(source, extra_field="value")

Parameters:

  • none_to_default - drop None/empty list attributes so the target uses its defaults
  • remove_extra - drop fields not defined on the target class
  • silent - suppress warnings about dropped fields (default: True)
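
The effect of remove_extra and none_to_default can be pictured with a small dataclass-based sketch. `cast_sketch`, `Source`, and `Target` are hypothetical names; the sketch mirrors only the parameter descriptions above and does not model the handling of __iris__ references or warnings.

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Source:
    id: str
    label: Optional[str] = None
    legacy: str = "old"


@dataclass
class Target:
    id: str
    label: str = "unnamed"


def cast_sketch(source, target_cls, remove_extra=False, none_to_default=False):
    data = asdict(source)
    if none_to_default:
        # Drop None / empty-list values so the target's defaults apply.
        data = {k: v for k, v in data.items() if v is not None and v != []}
    if remove_extra:
        # Drop fields the target class does not define.
        allowed = {f.name for f in fields(target_cls)}
        data = {k: v for k, v in data.items() if k in allowed}
    return target_cls(**data)


t = cast_sketch(Source(id="ex:s"), Target, remove_extra=True, none_to_default=True)
print(t)  # Target(id='ex:s', label='unnamed')
```

In this sketch, omitting remove_extra would raise a TypeError because Target has no legacy field, which is the situation the flag is meant to avoid.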

Backends

Built-in backends for entity persistence and resolution:

Backend | Storage | Query support
--- | --- | ---
SimpleDictDocumentStore | In-memory dict, optional JSON file (file_path) | Filter by field
SqliteDocumentStore | SQLite database (default format=JSON) | -
LocalSparqlBackend | In-memory RDF graph (rdflib) | SPARQL

from oold.backend.document_store import SimpleDictDocumentStore
from oold.backend.interface import StoreParam, SetResolverParam, set_resolver

store = SimpleDictDocumentStore(file_path="./entities.json")
set_resolver(SetResolverParam(iri="ex", resolver=store))

# Store and resolve entities
store.store(StoreParam(nodes={"ex:foo": foo}))
loaded = MyModel["ex:foo"]  # resolves via registered backend

Custom backends implement the Backend interface (resolve_iris, store_json_dicts).
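
A custom backend might be shaped roughly like the following standalone class. Only the two method names come from the text above; the parameter names, argument types, and return values are assumptions, and the class deliberately does not subclass the real Backend interface, so check the package source for the actual signatures before adapting it.

```python
from typing import Dict, List, Optional


class DictBackendSketch:
    """Stand-in for a custom backend; method signatures are assumptions."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}

    def store_json_dicts(self, json_dicts: Dict[str, dict]) -> None:
        # Persist each JSON document under its IRI.
        self._docs.update(json_dicts)

    def resolve_iris(self, iris: List[str]) -> Dict[str, Optional[dict]]:
        # Return the stored document per IRI, or None if it is unknown.
        return {iri: self._docs.get(iri) for iri in iris}


backend = DictBackendSketch()
backend.store_json_dicts({"ex:foo": {"id": "ex:foo", "type": ["Foo"]}})
result = backend.resolve_iris(["ex:foo", "ex:missing"])
print(result["ex:foo"]["id"])  # ex:foo
print(result["ex:missing"])    # None
```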

Dev

git clone https://github.com/OpenSemanticWorld/oold-python
cd oold-python
pip install -e .[dev]

Run tests

tox -e test

Benchmarking

tox -e benchmark

Compare to previous benchmark run without storing new results:

tox -e benchmark-compare

Contribute

We welcome contributions! Please fork the repository and submit a pull request with your changes. Please enable pre-commit hooks in your fork to ensure code quality.

pre-commit install

Please enable GitHub Actions for your fork to run the tests automatically.

Note

This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.
