
Hybrid SPARQL query engine for timeseries data


chrontext: High-performance hybrid query engine for knowledge graphs and analytical data (e.g. time-series)

Chrontext allows you to use your knowledge graph to access large amounts of time-series or other analytical data. It uses a commodity SPARQL triplestore and your existing data storage infrastructure. It currently supports time-series stored in PostgreSQL-compatible databases such as DuckDB, in Google Cloud BigQuery (SQL) and over OPC UA HA, but can easily be extended to other APIs and databases.

(Figure: Chrontext architecture)

Chrontext forms a semantic layer that enables self-service data access, abstracting away the technical infrastructure. Users can create query-based inputs for data products that stay up to date as the knowledge graph is maintained, and that can be deployed across heterogeneous on-premise and cloud infrastructures with the same API.

Chrontext is a high-performance Python library built in Rust using Polars, and relies heavily on packages from the Oxigraph project. Chrontext works with Apache Arrow, prefers time-series transport using Apache Arrow Flight and delivers results as Polars DataFrames.

Please reach out to Data Treehouse if you would like help trying Chrontext, or require support for a different database backend.

Installing

Chrontext is available on PyPI, just use:

pip install chrontext

The API is documented HERE.

Example query in Python

The code below assumes that we have a SPARQL endpoint and BigQuery set up with time-series data.

...
q = """
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX ct:<https://github.com/DataTreehouse/chrontext#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX rds: <https://github.com/DataTreehouse/solar_demo/rds_power#> 
SELECT ?path ?t ?ts_pow_value ?ts_irr_value
WHERE {
    ?site a rds:Site;
    rdfs:label "Jonathanland";
    rds:functionalAspect ?block.
    # At the Block level there is an irradiation measurement:
    ?block a rds:A;
    ct:hasTimeseries ?ts_irr.
    ?ts_irr rdfs:label "RefCell1_Wm2".
    
    # At the Inverter level, there is a Power measurement
    ?block rds:functionalAspect+ ?inv.
    ?inv a rds:TBB;
    rds:path ?path;
    ct:hasTimeseries ?ts_pow.
    ?ts_pow rdfs:label "InvPDC_kW".
    
    ?ts_pow ct:hasDataPoint ?ts_pow_datapoint.
    ?ts_pow_datapoint ct:hasValue ?ts_pow_value;
        ct:hasTimestamp ?t.
    ?ts_irr ct:hasDataPoint ?ts_irr_datapoint.
    ?ts_irr_datapoint ct:hasValue ?ts_irr_value;
        ct:hasTimestamp ?t.
    FILTER(
        ?t >= "2018-08-24T12:00:00+00:00"^^xsd:dateTime && 
        ?t <= "2018-08-24T13:00:00+00:00"^^xsd:dateTime)
} ORDER BY ?path ?t 
"""
df = engine.query(q)

This produces the following DataFrame:

path           t                        ts_pow_value  ts_irr_value
str            datetime[ns, UTC]        f64           f64
=.A1.RG1.TBB1  2018-08-24 12:00:00 UTC  39.74         184.0
=.A1.RG1.TBB1  2018-08-24 12:00:01 UTC  39.57         184.0
=.A1.RG1.TBB1  2018-08-24 12:00:02 UTC  40.1          184.0
=.A1.RG1.TBB1  2018-08-24 12:00:03 UTC  40.05         184.0
=.A1.RG1.TBB1  2018-08-24 12:00:04 UTC  40.02         184.0
...
=.A5.RG9.TBB1  2018-08-24 12:59:56 UTC  105.5         427.5
=.A5.RG9.TBB1  2018-08-24 12:59:57 UTC  104.9         427.6
=.A5.RG9.TBB1  2018-08-24 12:59:58 UTC  105.6         428.0
=.A5.RG9.TBB1  2018-08-24 12:59:59 UTC  105.9         428.0
=.A5.RG9.TBB1  2018-08-24 13:00:00 UTC  105.7         428.5

API

The API is documented HERE.

Tutorial using DuckDB

In the following tutorial, we assume that you have a couple of CSV files on disk that you want to query. We assume that you have DuckDB and chrontext installed; if not, run pip install chrontext duckdb. Installing chrontext will also install sqlalchemy, which we rely on to define the virtualized DuckDB tables.

CSV files

Our CSV files look like this.

ts1.csv :

timestamp,value
2022-06-01T08:46:52,1
2022-06-01T08:46:53,10
..
2022-06-01T08:46:59,105

ts2.csv:

timestamp,value
2022-06-01T08:46:52,2
2022-06-01T08:46:53,20
...
2022-06-01T08:46:59,206
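To follow along, you can create the two CSV files with a small standard-library script. Note that the rows between the first and last timestamps are elided above, so the sketch below writes only the rows that are shown; aggregate results later in the tutorial will therefore differ from yours unless you fill in the remaining rows.

```python
import csv

# Only the rows shown above; the intermediate rows are elided in this text.
ts1_rows = [
    ("2022-06-01T08:46:52", 1),
    ("2022-06-01T08:46:53", 10),
    ("2022-06-01T08:46:59", 105),
]
ts2_rows = [
    ("2022-06-01T08:46:52", 2),
    ("2022-06-01T08:46:53", 20),
    ("2022-06-01T08:46:59", 206),
]

for name, rows in [("ts1.csv", ts1_rows), ("ts2.csv", ts2_rows)]:
    with open(name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "value"])  # header row
        writer.writerows(rows)
```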

DuckDB setup:

We need to create a class with a method query that takes a SQL string as its argument and returns a Polars DataFrame. In this class, we simply hard-code the DuckDB setup in the constructor.

import duckdb
import polars as pl

class MyDuckDB():
    def __init__(self):
        con = duckdb.connect()
        con.execute("SET TIME ZONE 'UTC';")
        con.execute("""CREATE TABLE ts1 ("timestamp" TIMESTAMPTZ, "value" INTEGER)""")
        ts_1 = pl.read_csv("ts1.csv", try_parse_dates=True).with_columns(pl.col("timestamp").dt.replace_time_zone("UTC"))
        con.append("ts1", df=ts_1.to_pandas())
        con.execute("""CREATE TABLE ts2 ("timestamp" TIMESTAMPTZ, "value" INTEGER)""")
        ts_2 = pl.read_csv("ts2.csv", try_parse_dates=True).with_columns(pl.col("timestamp").dt.replace_time_zone("UTC"))
        con.append("ts2", df=ts_2.to_pandas())
        self.con = con


    def query(self, sql:str) -> pl.DataFrame:
        # We execute the query and return it as a Polars DataFrame.
        # Chrontext expects this method to exist in the provided class.
        df = self.con.execute(sql).pl()
        return df

my_db = MyDuckDB()

Defining a virtualized SQL

We first define a sqlalchemy select query involving the two tables. Chrontext will modify this query when executing hybrid queries.

from sqlalchemy import MetaData, Table, Column, bindparam
metadata = MetaData()
ts1_table = Table(
    "ts1",
    metadata,
    Column("timestamp"),
    Column("value")
)
ts2_table = Table(
    "ts2",
    metadata,
    Column("timestamp"),
    Column("value")
)
ts1 = ts1_table.select().add_columns(
    bindparam("id1", "ts1").label("id"),
)
ts2 = ts2_table.select().add_columns(
    bindparam("id2", "ts2").label("id"),
)
sql = ts1.union(ts2)
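If you want to inspect the starting point that chrontext will rewrite, you can render the unioned select to a SQL string. This is a standalone sketch (the table definitions mirror the ones above); the rendered string uses generic placeholder syntax, and chrontext compiles for the dialect given to the virtualized database.

```python
from sqlalchemy import MetaData, Table, Column, bindparam

metadata = MetaData()
ts1_table = Table("ts1", metadata, Column("timestamp"), Column("value"))
ts2_table = Table("ts2", metadata, Column("timestamp"), Column("value"))

# Tag each table's rows with a constant id column, then union them.
ts1 = ts1_table.select().add_columns(bindparam("id1", "ts1").label("id"))
ts2 = ts2_table.select().add_columns(bindparam("id2", "ts2").label("id"))
sql = ts1.union(ts2)

# Render to a generic SQL string for inspection.
print(str(sql))
```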

Now, we are ready to define the virtualized backend. We will annotate nodes of the graph with a resource data property. These data properties will be linked to virtualized RDF triples in the DuckDB backend. The resource_sql_map decides which SQL is used for each resource property.

from chrontext import VirtualizedPythonDatabase

vdb = VirtualizedPythonDatabase(
    database=my_db,
    resource_sql_map={"my_resource": sql},
    sql_dialect="postgres"
)

The triple below will link ex:myWidget1 to the triples defined by the SQL above.

ex:myWidget1 ct:hasResource "my_resource" . 

However, it will only be linked to those triples corresponding to rows where the identifier column equals the identifier associated with ex:myWidget1. Below, we define that ex:myWidget1 is only linked to those rows where the id column is ts1.

ex:myWidget1 ct:hasIdentifier "ts1" . 

In any such resource SQL, the id column is mandatory.

Relating the Database to RDF Triples

Next, we want to relate the rows in this SQL, each containing an id, a timestamp and a value, to RDF triples using a template.

from chrontext import Prefix, Variable, Template, Parameter, RDFType, Triple, XSD
ct = Prefix("ct", "https://github.com/DataTreehouse/chrontext#")
xsd = XSD()
id = Variable("id")
timestamp = Variable("timestamp")
value = Variable("value")
dp = Variable("dp")
resources = {
    "my_resource": Template(
        iri=ct.suf("my_resource"),
        parameters=[
            Parameter(id, rdf_type=RDFType.Literal(xsd.string)),
            Parameter(timestamp, rdf_type=RDFType.Literal(xsd.dateTime)),
            Parameter(value, rdf_type=RDFType.Literal(xsd.double)),
        ],
        instances=[
            Triple(id, ct.suf("hasDataPoint"), dp),
            Triple(dp, ct.suf("hasValue"), value),
            Triple(dp, ct.suf("hasTimestamp"), timestamp)
        ]
)}

This means that our instance ex:myWidget1 will be associated with a value and a timestamp (and a blank data point) for each row in ts1.csv. For instance, the first row means we have:

ex:myWidget1 ct:hasDataPoint _:b1 .
_:b1 ct:hasTimestamp "2022-06-01T08:46:52Z"^^xsd:dateTime .
_:b1 ct:hasValue 1 .

Chrontext is built for the cases where materializing this many triples is infeasible, so instead of materializing them, we query them.

Creating the engine and querying:

The context for our analytical data (e.g. a model of an industrial asset) has to be stored in a SPARQL endpoint. In this case, we use an embedded Oxigraph engine that comes with chrontext. Now we assemble the pieces and create the engine.

from chrontext import Engine, SparqlEmbeddedOxigraph
oxigraph_store = SparqlEmbeddedOxigraph(rdf_file="my_graph.ttl", path="oxigraph_db_tutorial")
engine = Engine(
    resources,
    virtualized_python_database=vdb,
    sparql_embedded_oxigraph=oxigraph_store)
engine.init()
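The tutorial assumes a my_graph.ttl containing the contextual triples. A minimal sketch of what it might look like is shown below; the node names (case:mySensor1 etc.) are hypothetical, and we assume the ct:hasResource / ct:hasIdentifier annotations sit on the time-series nodes so that they line up with the query pattern used in this tutorial.

```turtle
@prefix ct: <https://github.com/DataTreehouse/chrontext#> .
@prefix types: <http://example.org/types#> .
@prefix case: <http://example.org/case#> .

case:myWidget1 types:hasSensor case:mySensor1 .
case:mySensor1 a types:ThingCounter ;
    ct:hasTimeseries case:myTimeseries1 .
case:myTimeseries1 ct:hasResource "my_resource" ;
    ct:hasIdentifier "ts1" .

case:myWidget2 types:hasSensor case:mySensor2 .
case:mySensor2 a types:ThingCounter ;
    ct:hasTimeseries case:myTimeseries2 .
case:myTimeseries2 ct:hasResource "my_resource" ;
    ct:hasIdentifier "ts2" .
```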

Now we can use our context to query the dataset. The aggregation below is pushed down into DuckDB. The example is deliberately simple, but arbitrarily complex conditions can be used to identify ?w and ?s.

q = """
    PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
    PREFIX chrontext:<https://github.com/DataTreehouse/chrontext#>
    PREFIX types:<http://example.org/types#>
    SELECT ?w (SUM(?v) as ?sum_v) WHERE {
        ?w types:hasSensor ?s .
        ?s a types:ThingCounter .
        ?s chrontext:hasTimeseries ?ts .
        ?ts chrontext:hasDataPoint ?dp .
        ?dp chrontext:hasTimestamp ?t .
        ?dp chrontext:hasValue ?v .
        FILTER(?t > "2022-06-01T08:46:53Z"^^xsd:dateTime) .
    } GROUP BY ?w
    """
df = engine.query(q)
print(df)

This produces the following result:

w                                  sum_v
str                                decimal[38,0]
http://example.org/case#myWidget1  1215
http://example.org/case#myWidget2  1216

Roadmap in brief

Let us know if you have suggestions!

Stabilization

Chrontext will be put into use in the energy industry in the coming period, and will be stabilized as part of this process. We are very interested in your bug reports!

Support for Azure Data Explorer / KustoQL

We are likely adding support for ADX/KustoQL. Let us know if this is something that would be useful for you.

Support for Databricks SQL

We are likely adding support for Databricks SQL as the virtualization backend.

Generalization to analytical data (not just time series!)

While chrontext is currently focused on time series data, we are incrementally adding support for contextualization of arbitrary analytical data.

Support for multiple databases

Currently, we only support one database backend at a given time. We plan to support hybrid queries across multiple virtualized databases.

References

Chrontext is joint work by Magnus Bakken and Professor Ahmet Soylu at OsloMet. To read more about Chrontext, read the article Chrontext: Portable Sparql Queries Over Contextualised Time Series Data in Industrial Settings.

License

All code produced since August 1st, 2023 is copyrighted to Data Treehouse AS with an Apache 2.0 license unless otherwise noted.

All code produced before August 1st, 2023 is copyrighted to Prediktor AS with an Apache 2.0 license unless otherwise noted, and was financed by The Research Council of Norway (grant no. 316656) and Prediktor AS as part of a PhD degree. The code at that state is archived in the repository at https://github.com/DataTreehouse/chrontext.
