semdsl

Semantic-DSL

For rapid development of semantically-backed Domain-Specific Languages (DSLs).

Installation

pip install semdsl

Usage

To illustrate the usage of semdsl, we will create a simple LinkML schema for part of the Clue board game, in particular for representing a hypothesis about who committed the misdeed, where, and with what.

We will annotate the schema with grammar hints that can be used to generate the grammar for the DSL.

```python
>>> schema = """
... id: https://example.org/clue
... name: clue
... imports:
...   - https://w3id.org/linkml/types
... classes:
...   ClueHypothesis:
...     attributes:
...       person:   # e.g. Colonel Mustard
...         annotations:
...           grammar.main: "WORD WORD"
...       location: # e.g. Kitchen
...         annotations:
...           grammar.main: "WORD"
...       weapon:   # e.g. Candlestick
...         annotations:
...           grammar.main: "WORD"
...     annotations:
...       grammar.main: >-
...         "<" person "in the" location "with the" weapon ">"
... """
```

The idea is to be able to represent hypotheses using strings like <Colonel Mustard in the Kitchen with the Candlestick>.

We can then use the DSLEngine class to load the schema and generate a grammar:

>>> from semdsl import DSLEngine
>>> engine = DSLEngine()
>>> engine.load_schema(schema)
>>> print(engine.lark_serialization)
from lark import Lark
...
class_clue_hypothesis : "<" person "in the" location "with the" weapon ">"
person : WORD WORD
location : WORD
weapon : WORD
...

The default is Lark syntax.

You can then use the generated grammar to parse serialized strings into schema-conformant pydantic objects:

>>> obj = engine.parse_as_object('<Colonel Mustard in the Kitchen with the Candlestick>')
>>> print(obj.location)
Kitchen
>>> print(obj.json())
{"person": "Colonel Mustard", "location": "Kitchen", "weapon": "Candlestick"}
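
To make the mapping from grammar rule to object concrete, here is a minimal standard-library sketch of what the rules above express. It is not how semdsl works internally (semdsl generates a Lark grammar and returns pydantic objects). The `HYPOTHESIS_PATTERN` regex, the `ClueHypothesis` dataclass, and the `parse_hypothesis` helper are hypothetical names introduced only for illustration, and WORD is approximated as a run of ASCII letters.

```python
import json
import re
from dataclasses import asdict, dataclass

# Hand-written stand-in for the generated rule
#   "<" person "in the" location "with the" weapon ">"
# where person is WORD WORD and location/weapon are a single WORD.
# Assumption: WORD is approximated here as a run of ASCII letters.
HYPOTHESIS_PATTERN = re.compile(
    r"<\s*(?P<person>[A-Za-z]+\s+[A-Za-z]+)\s+in the\s+"
    r"(?P<location>[A-Za-z]+)\s+with the\s+(?P<weapon>[A-Za-z]+)\s*>"
)


@dataclass
class ClueHypothesis:
    """Illustrative stand-in for the schema-conformant object."""

    person: str
    location: str
    weapon: str


def parse_hypothesis(text: str) -> ClueHypothesis:
    """Parse one hypothesis string; raise ValueError if it does not match."""
    match = HYPOTHESIS_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid hypothesis: {text!r}")
    return ClueHypothesis(**match.groupdict())


hypothesis = parse_hypothesis("<Colonel Mustard in the Kitchen with the Candlestick>")
print(hypothesis.location)
print(json.dumps(asdict(hypothesis)))
```

A regex is enough for a single flat rule like this one. A generated grammar becomes more useful once rules nest or share sub-rules, which is the case the annotations are meant to cover.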

Auto-assigning production rules

In the previous example we saw how we could annotate an existing schema with grammar rules.

However, we can also generate grammar rules from the schema itself.

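This is done by using the grammar.main annotation on a class, and then using the grammar.auto annotation on the attributes of that class. In the example below the schema carries no grammar annotations at all, so every rule is derived from the class and attribute names: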

>>> schema = """
... id: https://example.org/clue
... name: clue
... imports:
...   - https://w3id.org/linkml/types
... classes:
...   ClueHypothesis:
...     attributes:
...       person:   # e.g. Colonel Mustard
...       location: # e.g. Kitchen
...       weapon:   # e.g. Candlestick
... """

Now we create a new engine, load the schema, and generate a de novo "functional-style" grammar:

>>> engine = DSLEngine() ## create new DSLEngine
>>> engine.load_schema(schema)
>>> print(engine.lark_serialization)
from lark import Lark
...
class_clue_hypothesis : "ClueHypothesis(" slot_clue_hypothesis__person? slot_clue_hypothesis__location? slot_clue_hypothesis__weapon? ")"
slot_clue_hypothesis__person : "person=" TYPE_STRING
slot_clue_hypothesis__location : "location=" TYPE_STRING
slot_clue_hypothesis__weapon : "weapon=" TYPE_STRING
...

You can then use the generated grammar to parse strings into objects:

>>> obj = engine.parse_as_object('ClueHypothesis(person="Colonel Mustard" location="Kitchen" weapon="Candlestick")')
>>> print(obj.location)
Kitchen
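
The naming pattern in the generated rules can be reproduced with a few lines of plain Python. The sketch below is only an illustration of that pattern, not semdsl's actual generator: `to_snake_case` and `functional_rules` are hypothetical helpers, and every attribute is assumed to have a string range (hence the hard-coded `TYPE_STRING`).

```python
import re
from typing import List


def to_snake_case(name: str) -> str:
    """Convert a CamelCase class name such as ClueHypothesis to clue_hypothesis."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def functional_rules(class_name: str, attributes: List[str]) -> List[str]:
    """Build functional-style rule lines for one class (hypothetical helper).

    Assumption: every attribute is a string, so each slot rule ends in TYPE_STRING.
    """
    snake = to_snake_case(class_name)
    slot_rules = [f"slot_{snake}__{attr}" for attr in attributes]
    # Each slot is optional in the class rule, hence the trailing "?".
    optional_slots = " ".join(f"{rule}?" for rule in slot_rules)
    lines = [f'class_{snake} : "{class_name}(" {optional_slots} ")"']
    for rule, attr in zip(slot_rules, attributes):
        lines.append(f'{rule} : "{attr}=" TYPE_STRING')
    return lines


for line in functional_rules("ClueHypothesis", ["person", "location", "weapon"]):
    print(line)
```

The printed lines match the four rules shown in the generated grammar above.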

Adding additional semantics

You can use the class_uri and slot_uri metamodel elements to assign URIs to classes and slots in your schema, which can be used in RDF serialization.

Here we extend our Clue schema, adding classes for the ranges of the slots in the main class:

>>> schema = """
... id: https://example.org/clue
... name: clue
... prefixes:
...   linkml: https://w3id.org/linkml/
...   clue: https://example.org/clue/
...   schema: http://schema.org/
...   prov: http://www.w3.org/ns/prov#
...   dbpedia: http://dbpedia.org/ontology/
... imports:
...   - linkml:types
... classes:
...   NamedThing:
...     class_uri: schema:Thing
...     attributes:
...       id:
...         identifier: true
...         range: uriorcurie
...   Person:
...     class_uri: schema:Person
...     is_a: NamedThing
...   Location:
...     class_uri: schema:Location
...     is_a: NamedThing
...   Weapon:
...     class_uri: dbpedia:Weapon
...     is_a: NamedThing
...   ClueHypothesis:
...     class_uri: prov:Action
...     tree_root: true
...     attributes:
...       person:   # e.g. Colonel Mustard
...         slot_uri: prov:wasAssociatedWith
...         range: Person
...         annotations:
...           grammar.main: TYPE_URIORCURIE
...       location: # e.g. Kitchen
...         slot_uri: prov:atLocation
...         range: Location
...         annotations:
...           grammar.main: TYPE_URIORCURIE
...       weapon:   # e.g. Candlestick
...         slot_uri: prov:used
...         range: Weapon
...         annotations:
...           grammar.main: TYPE_URIORCURIE
...     annotations:
...       grammar.main: >-
...         "<" person "in the" location "with the" weapon ">"
... """

Now parse and export to a file. This time the input string uses CURIEs to represent the different things in the Clue hypothesis.

>>> engine = DSLEngine()
>>> engine.load_schema(schema)
>>> obj = engine.parse_as_object("< clue:ColonelMustard in the clue:Kitchen with the clue:Candlestick >")
>>> import yaml
>>> with open("tests/output/clue-output.yaml", "w", encoding="utf-8") as f:
...     yaml.dump(obj.dict(), f)

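From here we can use LinkML to convert to an RDF serialization. The commands below assume the schema above has also been saved as clue_model.yaml in the output directory:

cd tests/output
linkml-convert clue-output.yaml -s clue_model.yaml -t ttl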

Results:

@prefix clue: <https://example.org/clue/> .
@prefix prov: <http://www.w3.org/ns/prov#> .

[] a prov:Action ;
    prov:atLocation clue:Kitchen ;
    prov:used clue:Candlestick ;
    prov:wasAssociatedWith clue:ColonelMustard .
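
To see how class_uri, slot_uri, and the prefix map combine into the triples above, here is a small standard-library sketch. It is not the LinkML converter. The helper names (`expand_curie`, `to_triples`), the blank-node label `_:hypothesis`, and the shape of the `parsed` dictionary are assumptions made for illustration; the prefixes and URIs are copied from the schema.

```python
from typing import Dict, List, Tuple

# Prefix map copied from the schema's "prefixes" section (only those used here).
PREFIXES = {
    "clue": "https://example.org/clue/",
    "prov": "http://www.w3.org/ns/prov#",
}

# Slot name -> slot_uri and the class_uri, copied from the schema.
SLOT_URIS = {
    "person": "prov:wasAssociatedWith",
    "location": "prov:atLocation",
    "weapon": "prov:used",
}
CLASS_URI = "prov:Action"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand_curie(curie: str) -> str:
    """Expand prefix:local into a full URI using PREFIXES."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


def to_triples(obj: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Turn one parsed hypothesis into (subject, predicate, object) triples."""
    subject = "_:hypothesis"  # hypothetical blank-node label, like [] in Turtle
    triples = [(subject, RDF_TYPE, expand_curie(CLASS_URI))]
    for slot, value in obj.items():
        triples.append((subject, expand_curie(SLOT_URIS[slot]), expand_curie(value)))
    return triples


# Assumed shape of the parsed object: each slot holds a CURIE string.
parsed = {
    "person": "clue:ColonelMustard",
    "location": "clue:Kitchen",
    "weapon": "clue:Candlestick",
}
for triple in to_triples(parsed):
    print(triple)
```

Each slot contributes one triple whose predicate is the slot's slot_uri, and the class contributes the rdf:type triple, which is what the `a prov:Action` line in the Turtle output abbreviates.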

Command Line Interface

semdsl --help

Limitations

Restricted to Lark grammars

Currently, semdsl only supports Lark grammars. The framework is designed to allow extensibility, e.g. to ANTLR, but this is currently unsupported.
