Linked data class python package

oold-python

Linked data class Python package for object-oriented linked data (OO-LD), based on pydantic. This package aims to implement this functionality independently of the osw-python package. It is a work in progress.

Installation

pip install oold

Objectives

  • lossless transpilation between OO-LD schemas and extended pydantic data classes
  • interpret string IRIs with an oold-range annotation as typed class properties
  • dynamically resolve such IRIs from one or more backends (simple in-memory dict, RDF graph, SPARQL endpoint, document store, etc.)
  • serialize class instances to JSON-LD while replacing Python object references with IRIs
  • apply filters / queries to backend requests (SPARQL, GraphQL, ...)
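The serialization objective can be illustrated with a minimal sketch that uses only the standard library. It does not use the oold API: the Foo and Bar dataclasses and the to_plain helper are hypothetical stand-ins that show the idea of emitting nested objects as IRI strings instead of embedding them.

```python
import json
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class Bar:
    id: str
    prop1: str = ""


@dataclass
class Foo:
    id: str
    b: Optional[Bar] = None
    b2: List[Bar] = field(default_factory=list)


def to_plain(obj, top_level=True):
    """Serialize a dataclass; nested dataclass instances become their IRI (id)."""
    if isinstance(obj, list):
        return [to_plain(item, top_level=False) for item in obj]
    if hasattr(obj, "__dataclass_fields__"):
        if not top_level:
            return obj.id  # reference by IRI instead of embedding the object
        return {
            f.name: to_plain(getattr(obj, f.name), top_level=False)
            for f in fields(obj)
        }
    return obj  # literals (str, int, None, ...) pass through unchanged


foo = Foo(
    id="ex:f",
    b=Bar(id="ex:b", prop1="test2"),
    b2=[Bar(id="ex:b1"), Bar(id="ex:b2")],
)
doc = to_plain(foo)
print(json.dumps(doc))
```

The referenced Bar objects would be serialized as separate documents under their own IRIs, which keeps each document small and avoids duplicating shared objects.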

Related Work

Lib Name | Repo | Description
RDFLib | https://github.com/RDFLib/rdflib | Widely used for managing RDF data; lacks built-in schema validation or type safety, requires external reasoning tools. Provides local/remote SPARQL support (used as backend for oold-python).
SuRF | https://github.com/cosminbasca/surfrdf | ORM-like approach for RDF; dynamically generated class definitions, no static type checking.
Owlready2 | https://github.com/pwin/owlready2 | Provides Python classes aligned with OWL and includes native reasoning (HermiT/Pellet). Limited runtime type validation; no direct remote SPARQL endpoint support.
twa | https://github.com/TheWorldAvatar/baselib/tree/main/python_wrapper | Pydantic-based OGM with built-in schema validation/type safety; strong coupling of RDF properties and type annotations.
COLD | https://github.com/DigiBatt/cold/ | Generates static Python classes from OWL classes to offer RDF generation. No object-to-graph mapping.

See also Bai et al.: https://doi.org/10.1039/D5DD00069F

Features

Code Generation

Generate Python data models from OO-LD Schemas (based on datamodel-code-generator):

from oold.generator import Generator
import importlib
import datamodel_code_generator
import oold.model.model as model

schemas = [
    {   # minimal example
        "id": "Foo",
        "title": "Foo",
        "type": "object",
        "properties": {
            "id": {"type": "string"},
        },
    },
]
g = Generator()
g.generate(schemas, main_schema="Foo.json", output_model_type=datamodel_code_generator.DataModelType.PydanticBaseModel)
importlib.reload(model)

# Now you can work with your generated model
f = model.Foo(id="ex:f")
print(f)

This example uses the built-in Generator to create a basic Pydantic model (v1 or v2) from JSON schemas.

For more details, see the example code.

Object Graph Mapping

Concept

Illustrative example of how the object-oriented linked data (OO-LD) package provides an abstract knowledge graph (KG) interface. First (line 3), primary schemas (Foo) and their dependencies (Bar, Baz) are loaded from the KG and transformed into Python dataclasses. foo is instantiated by loading the respective JSON(-LD) document from the KG and using its type relation to the corresponding schema and dataclass (line 5). Because bar is not a dependent subobject of foo, it is loaded on demand on first access of the corresponding class attribute (foo.bar in line 7), whereas id, as a dependent literal, is loaded immediately in the same operation. In line 9, baz is constructed by an existing controller class subclassing Foo, and in line 11 it is stored as a new entity in the KG.

Represent your domain objects easily and reference them via IRIs or direct object instances. For instance, if you have a Foo model referencing a Bar model:

import oold.model.model as model

# Create a Foo object linked to Bar
f = model.Foo(
    id="ex:f",
    literal="test1",
    b=model.Bar(id="ex:b", prop1="test2"),
    b2=[model.Bar(id="ex:b1", prop1="test3"), model.Bar(id="ex:b2", prop1="test4")],
)

print(f.b.id)          # ex:b
print(f.b2[0].prop1)   # test3

You can also refer to objects by IRI:

# Assign IRI strings directly
f = model.Foo(
    id="ex:f",
    literal="test1",
    b="ex:b",  # automatically resolved to a Bar object
    b2=["ex:b1", "ex:b2"],
)

Thanks to the resolver mechanism, these IRIs turn into fully-fledged objects as soon as you need them.
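How such on-demand resolution might work can be sketched without the library. The class below is a hypothetical stand-in, not oold's implementation: it keeps the IRI string until the attribute is first read, then asks a resolver callable (here backed by a plain dict) for the object and caches the result.

```python
from typing import Callable, Dict, List, Union

# Hypothetical in-memory backend: IRI -> stored document
store: Dict[str, dict] = {"ex:b": {"id": "ex:b", "prop1": "test2"}}
lookups: List[str] = []  # records every backend access for demonstration


def resolve(iri: str) -> dict:
    lookups.append(iri)
    return store[iri]


class Foo:
    def __init__(self, id: str, b: Union[str, dict], resolver: Callable[[str], dict]):
        self.id = id
        self._b = b  # either an IRI string or an already resolved object
        self._resolver = resolver

    @property
    def b(self) -> dict:
        # Resolve on first access only; later reads reuse the cached object
        if isinstance(self._b, str):
            self._b = self._resolver(self._b)
        return self._b


f = Foo(id="ex:f", b="ex:b", resolver=resolve)
print(lookups)       # nothing has been resolved yet
print(f.b["prop1"])  # first access triggers the backend lookup
print(f.b["id"])     # second access is served from the cache
print(lookups)
```

Deferring the lookup means that constructing an object with many references costs nothing until those references are actually used.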

For more details, see the example code.

RDF-Export

Easily convert your objects to RDF (JSON-LD) and integrate with SPARQL queries:

from rdflib import Graph
import oold.model.model as model  # assumes a Person model with a schema.org context is available here

# Example: Convert Person objects to RDF
p1 = model.Person(name="Alice")
p2 = model.Person(name="Bob", knows=[p1])

# Export to JSON-LD
print(p2.to_jsonld())

# Load into RDFlib
g = Graph()
g.parse(data=p1.to_jsonld(), format="json-ld")
g.parse(data=p2.to_jsonld(), format="json-ld")

# Perform SPARQL queries
qres = g.query("""
    SELECT ?name
    WHERE {
        ?s <https://schema.org/knows> ?o .
        ?o <https://schema.org/name> ?name .
    }
""")
for row in qres:
    print("Bob knows", row.name)

The extended dataclass notation includes semantic annotations as JSON-LD context, giving you powerful tooling for knowledge graphs, semantic queries, and data interoperability.
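The role of the JSON-LD context can be shown with a small standard-library sketch. The document and context below are assumed examples, and expand_keys is a deliberately simplified helper, not the JSON-LD expansion algorithm: it only maps top-level property names to the IRIs given in the context, which is why the SPARQL query above matches on https://schema.org/name rather than on name.

```python
import json

# Assumed example document; the context maps short property names to IRIs
doc = {
    "@context": {
        "name": "https://schema.org/name",
        "knows": "https://schema.org/knows",
    },
    "@id": "ex:bob",
    "name": "Bob",
    "knows": [{"@id": "ex:alice"}],
}


def expand_keys(document: dict) -> dict:
    """Replace top-level keys by the IRIs from @context (simplified)."""
    context = document.get("@context", {})
    return {
        context.get(key, key): value
        for key, value in document.items()
        if key != "@context"
    }


expanded = expand_keys(doc)
print(json.dumps(expanded, indent=2))
```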

For more details, see the example code.

BaseController

Base mixin for controllers that extend LinkedBaseModel data classes. Controllers add runtime behavior (connections, archiving, state) without polluting the data model.

from oold.model import BaseController, LinkedBaseModel

class Robot(LinkedBaseModel):
    name: str
    joint_count: int = 6
    connection_url: str = ""

class RobotController(BaseController, Robot):
    _connected: bool = False

    def connect(self):
        self._connected = True
        print(f"Connected to {self.connection_url}")

    def move(self, joint: int, angle: float):
        if not self._connected:
            raise RuntimeError("Not connected")
        print(f"Moving joint {joint} to {angle} deg")

ctrl = RobotController(name="arm-1", connection_url="tcp://192.168.1.10:5000")
ctrl.connect()
ctrl.move(1, 45.0)

# Serialization includes model fields, strips controller state
ctrl.to_json()  # {"name": "arm-1", "joint_count": 6, "connection_url": "tcp://..."}

Key features:

  • Auto-detects the pure data model from MRO - no manual configuration needed
  • Serialization strips controller fields - to_json() / to_jsonld() only include data model fields
  • Type registry exclusion - controllers don't replace their data model in the _types lookup, so backend resolution always returns the pure model class
  • Multi-model support - Controller(ModelA, ModelB) merges type arrays from both models
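The first two features can be approximated in plain Python. This is a sketch under stated assumptions, not the BaseController implementation: DataModel and ControllerMixin are hypothetical stand-ins, and the data model is taken to be the first class in the MRO that derives from DataModel but not from ControllerMixin.

```python
class DataModel:
    """Hypothetical stand-in for a data model base class."""

    def __init__(self, **data):
        for key, value in data.items():
            setattr(self, key, value)


class ControllerMixin:
    """Hypothetical stand-in for a controller mixin."""

    @classmethod
    def data_model(cls):
        # Walk the MRO and pick the first pure data model class
        for base in cls.__mro__:
            if (
                issubclass(base, DataModel)
                and not issubclass(base, ControllerMixin)
                and base is not DataModel
            ):
                return base
        raise TypeError(f"{cls.__name__} does not extend a data model")

    def to_dict(self):
        # Only fields annotated on the data model are serialized
        names = self.data_model().__annotations__
        return {name: getattr(self, name) for name in names}


class Robot(DataModel):
    name: str
    joint_count: int = 6


class RobotController(ControllerMixin, Robot):
    connected = False  # runtime state, not part of the data model

    def connect(self):
        self.connected = True


ctrl = RobotController(name="arm-1")
ctrl.connect()
print(RobotController.data_model().__name__)
print(ctrl.to_dict())
```

Because the data model is derived from the class hierarchy, a controller needs no extra declaration to state which model it extends.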

cast()

Convert between model classes, preserving __iris__ references:

target = source.cast(TargetClass, remove_extra=True, none_to_default=True)

# Or construct directly from another model instance:
target = TargetClass(source, extra_field="value")

Parameters:

  • none_to_default - drop None/empty list attributes so the target uses its defaults
  • remove_extra - drop fields not defined on the target class
  • silent - suppress warnings about dropped fields (default: True)
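The effect of remove_extra and none_to_default can be sketched with standard-library dataclasses. The cast function below is a hypothetical re-implementation for illustration only: it does not handle __iris__ references or warnings, and the Source and Target classes are assumed examples.

```python
from dataclasses import asdict, dataclass, field, fields
from typing import List, Optional


@dataclass
class Source:
    id: str
    label: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    internal_note: str = "draft"  # not defined on Target


@dataclass
class Target:
    id: str
    label: str = "untitled"
    tags: List[str] = field(default_factory=lambda: ["default"])


def cast(source, target_cls, remove_extra=False, none_to_default=False):
    data = asdict(source)
    if none_to_default:
        # Drop None and empty lists so the target falls back to its defaults
        data = {k: v for k, v in data.items() if v is not None and v != []}
    if remove_extra:
        # Drop fields the target class does not define
        allowed = {f.name for f in fields(target_cls)}
        data = {k: v for k, v in data.items() if k in allowed}
    return target_cls(**data)


target = cast(Source(id="ex:s"), Target, remove_extra=True, none_to_default=True)
print(target)
```

Without remove_extra, this sketch would fail with a TypeError on internal_note, since the target class does not accept that field.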

Backends

Built-in backends for entity persistence and resolution:

Backend | Storage | Query support
SimpleDictDocumentStore | In-memory dict, optional JSON file (file_path) | Filter by field
SqliteDocumentStore | SQLite database (default format=JSON) | -
LocalSparqlBackend | In-memory RDF graph (rdflib) | SPARQL

from oold.backend.document_store import SimpleDictDocumentStore
from oold.backend.interface import StoreParam, SetResolverParam, set_resolver

store = SimpleDictDocumentStore(file_path="./entities.json")
set_resolver(SetResolverParam(iri="ex", resolver=store))

# Store and resolve entities
# (foo is an instance of MyModel, a model class assumed to be defined elsewhere)
store.store(StoreParam(nodes={"ex:foo": foo}))
loaded = MyModel["ex:foo"]  # resolves via registered backend

Custom backends implement the Backend interface (resolve_iris, store_json_dicts).
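The shape of such a backend can be sketched as a standalone class. Only the two method names come from the text above. The argument and return types are assumptions, and the class intentionally does not subclass oold's Backend, so check the actual interface before adapting it.

```python
import json
from typing import Dict, List, Optional


class InMemoryJsonBackend:
    """Hypothetical backend that keeps each entity as a JSON string."""

    def __init__(self) -> None:
        self._documents: Dict[str, str] = {}

    def store_json_dicts(self, json_dicts: Dict[str, dict]) -> None:
        # Assumed signature: mapping of IRI -> JSON-serializable dict
        for iri, document in json_dicts.items():
            self._documents[iri] = json.dumps(document)

    def resolve_iris(self, iris: List[str]) -> Dict[str, Optional[dict]]:
        # Assumed signature: unknown IRIs map to None instead of raising
        result: Dict[str, Optional[dict]] = {}
        for iri in iris:
            raw = self._documents.get(iri)
            result[iri] = json.loads(raw) if raw is not None else None
        return result


backend = InMemoryJsonBackend()
backend.store_json_dicts({"ex:foo": {"id": "ex:foo", "literal": "test1"}})
resolved = backend.resolve_iris(["ex:foo", "ex:missing"])
print(resolved)
```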

Dev

git clone https://github.com/OpenSemanticWorld/oold-python
cd oold-python
pip install -e ".[dev]"

Run tests

tox -e test

Benchmarking

tox -e benchmark

Compare to previous benchmark run without storing new results:

tox -e benchmark-compare

Contribute

We welcome contributions! Please fork the repository and submit a pull request with your changes. Please enable pre-commit hooks in your fork to ensure code quality.

pre-commit install

Please enable GitHub Actions for your fork to run the tests automatically.

Note

This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.
