
Hybrid SPARQL query engine for timeseries data

Project description

chrontext: High-performance hybrid query engine for knowledge graphs and analytical data (e.g. time-series)

Chrontext allows you to use your knowledge graph to access large amounts of time-series or other analytical data. It uses a commodity SPARQL triplestore and your existing data storage infrastructure. It currently supports time-series stored in a PostgreSQL-compatible database such as DuckDB, in Google Cloud BigQuery (SQL), and in OPC UA HA servers, but can easily be extended to other APIs and databases.

(Figure: Chrontext architecture.)

Chrontext forms a semantic layer that enables self-service data access, abstracting away the technical infrastructure. Users can create query-based inputs for data products that stay up to date as the knowledge graph is maintained, and that can be deployed across heterogeneous on-premise and cloud infrastructures with the same API.

Chrontext is a high-performance Python library built in Rust using Polars, and relies heavily on packages from the Oxigraph project. Chrontext works with Apache Arrow, prefers time-series transport using Apache Arrow Flight and delivers results as Polars DataFrames.

Please reach out to Data Treehouse if you would like help trying Chrontext, or require support for a different database backend.

Installing

Chrontext is available on PyPI; just use:

pip install chrontext

The API is documented HERE.

Example query in Python

The code assumes that we have a SPARQL-endpoint and BigQuery set up with time-series.

...
q = """
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX ct:<https://github.com/DataTreehouse/chrontext#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX rds: <https://github.com/DataTreehouse/solar_demo/rds_power#> 
SELECT ?path ?t ?ts_pow_value ?ts_irr_value
WHERE {
    ?site a rds:Site;
    rdfs:label "Jonathanland";
    rds:functionalAspect ?block.
    # At the Block level there is an irradiation measurement:
    ?block a rds:A;
    ct:hasTimeseries ?ts_irr.
    ?ts_irr rdfs:label "RefCell1_Wm2".
    
    # At the Inverter level, there is a Power measurement
    ?block rds:functionalAspect+ ?inv.
    ?inv a rds:TBB;
    rds:path ?path;
    ct:hasTimeseries ?ts_pow.
    ?ts_pow rdfs:label "InvPDC_kW".
    
    ?ts_pow ct:hasDataPoint ?ts_pow_datapoint.
    ?ts_pow_datapoint ct:hasValue ?ts_pow_value;
        ct:hasTimestamp ?t.
    ?ts_irr ct:hasDataPoint ?ts_irr_datapoint.
    ?ts_irr_datapoint ct:hasValue ?ts_irr_value;
        ct:hasTimestamp ?t.
    FILTER(
        ?t >= "2018-08-24T12:00:00+00:00"^^xsd:dateTime && 
        ?t <= "2018-08-24T13:00:00+00:00"^^xsd:dateTime)
} ORDER BY ?path ?t 
"""
df = engine.query(q)

This produces the following DataFrame:

path t ts_pow_value ts_irr_value
str datetime[ns, UTC] f64 f64
=.A1.RG1.TBB1 2018-08-24 12:00:00 UTC 39.74 184.0
=.A1.RG1.TBB1 2018-08-24 12:00:01 UTC 39.57 184.0
=.A1.RG1.TBB1 2018-08-24 12:00:02 UTC 40.1 184.0
=.A1.RG1.TBB1 2018-08-24 12:00:03 UTC 40.05 184.0
=.A1.RG1.TBB1 2018-08-24 12:00:04 UTC 40.02 184.0
=.A5.RG9.TBB1 2018-08-24 12:59:56 UTC 105.5 427.5
=.A5.RG9.TBB1 2018-08-24 12:59:57 UTC 104.9 427.6
=.A5.RG9.TBB1 2018-08-24 12:59:58 UTC 105.6 428.0
=.A5.RG9.TBB1 2018-08-24 12:59:59 UTC 105.9 428.0
=.A5.RG9.TBB1 2018-08-24 13:00:00 UTC 105.7 428.5

API

The API is documented HERE.

Tutorial using DuckDB

In the following tutorial, we assume that you have a couple of CSV files on disk that you want to query. We also assume that you have DuckDB and chrontext installed; if not, run pip install chrontext duckdb. Installing chrontext also installs sqlalchemy, which we rely on to define the virtualized DuckDB tables.

CSV files

Our CSV files look like this.

ts1.csv :

timestamp,value
2022-06-01T08:46:52,1
2022-06-01T08:46:53,10
...
2022-06-01T08:46:59,105

ts2.csv:

timestamp,value
2022-06-01T08:46:52,2
2022-06-01T08:46:53,20
...
2022-06-01T08:46:59,206
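If you want to follow along, files of this shape can be generated with a short script. Note that the rows between the first and last lines are elided above, so the intermediate values below are merely illustrative:

```python
import csv
from datetime import datetime, timedelta

start = datetime(2022, 6, 1, 8, 46, 52)
for name, scale in [("ts1.csv", 1), ("ts2.csv", 2)]:
    with open(name, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["timestamp", "value"])
        for i in range(8):  # seconds 08:46:52 .. 08:46:59
            ts = (start + timedelta(seconds=i)).isoformat()
            w.writerow([ts, scale * (i + 1)])  # illustrative values
```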

DuckDB setup:

We need to create a class with a method query that takes a SQL string as its argument and returns a Polars DataFrame. In this class, we simply hard-code the DuckDB setup in the constructor.

import duckdb
import polars as pl

class MyDuckDB():
    def __init__(self):
        con = duckdb.connect()
        con.execute("SET TIME ZONE 'UTC';")
        con.execute("""CREATE TABLE ts1 ("timestamp" TIMESTAMPTZ, "value" INTEGER)""")
        ts_1 = pl.read_csv("ts1.csv", try_parse_dates=True).with_columns(pl.col("timestamp").dt.replace_time_zone("UTC"))
        con.append("ts1", df=ts_1.to_pandas())
        con.execute("""CREATE TABLE ts2 ("timestamp" TIMESTAMPTZ, "value" INTEGER)""")
        ts_2 = pl.read_csv("ts2.csv", try_parse_dates=True).with_columns(pl.col("timestamp").dt.replace_time_zone("UTC"))
        con.append("ts2", df=ts_2.to_pandas())
        self.con = con


    def query(self, sql:str) -> pl.DataFrame:
        # We execute the query and return it as a Polars DataFrame.
        # Chrontext expects this method to exist in the provided class.
        df = self.con.execute(sql).pl()
        return df

my_db = MyDuckDB()

Defining a virtualized SQL

We first define a sqlalchemy select query involving the two tables. Chrontext will modify this query when executing hybrid queries.

from sqlalchemy import MetaData, Table, Column, bindparam
metadata = MetaData()
ts1_table = Table(
    "ts1",
    metadata,
    Column("timestamp"),
    Column("value")
)
ts2_table = Table(
    "ts2",
    metadata,
    Column("timestamp"),
    Column("value")
)
ts1 = ts1_table.select().add_columns(
    bindparam("id1", "ts1").label("id"),
)
ts2 = ts2_table.select().add_columns(
    bindparam("id2", "ts2").label("id"),
)
sql = ts1.union(ts2)
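To inspect what this statement looks like as SQL text (before chrontext applies its own rewrites), you can compile it yourself. The postgres dialect is chosen to match the sql_dialect used below; inlining the two id parameters with literal_binds is an assumption made here purely for readability:

```python
from sqlalchemy import MetaData, Table, Column, bindparam
from sqlalchemy.dialects import postgresql

metadata = MetaData()
ts1_table = Table("ts1", metadata, Column("timestamp"), Column("value"))
ts2_table = Table("ts2", metadata, Column("timestamp"), Column("value"))
ts1 = ts1_table.select().add_columns(bindparam("id1", "ts1").label("id"))
ts2 = ts2_table.select().add_columns(bindparam("id2", "ts2").label("id"))
sql = ts1.union(ts2)

# Render the statement as a SQL string in the postgres dialect.
compiled = str(sql.compile(dialect=postgresql.dialect(),
                           compile_kwargs={"literal_binds": True}))
print(compiled)
```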

Now, we are ready to define the virtualized backend. We will annotate nodes of the graph with a resource data property. These data properties will be linked to virtualized RDF triples in the DuckDB backend. The resource_sql_map decides which SQL is used for each resource property.

from chrontext import VirtualizedPythonDatabase

vdb = VirtualizedPythonDatabase(
    database=my_db,
    resource_sql_map={"my_resource": sql},
    sql_dialect="postgres"
)

The triple below will link ex:myWidget1 to the triples defined by the above SQL.

ex:myWidget1 ct:hasResource "my_resource" . 

However, it will only be linked to those triples corresponding to rows where the identifier column equals the identifier associated with ex:myWidget1. Below, we define that ex:myWidget1 is only linked to those rows where the id column is ts1.

ex:myWidget1 ct:hasIdentifier "ts1" . 

In any such resource SQL, the id column is mandatory.
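Conceptually (the exact rewriting is internal to chrontext, so this is only a sketch with a made-up table name), answering a query about ex:myWidget1 amounts to wrapping the resource SQL with a filter on the mandatory id column:

```python
# Hypothetical illustration: chrontext's actual query rewriting is internal.
resource_sql = "SELECT timestamp, value, id FROM ts1_union_ts2"  # stand-in name
restricted = f"SELECT * FROM ({resource_sql}) AS r WHERE r.id = 'ts1'"
print(restricted)
```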

Relating the Database to RDF Triples

Next, we want to relate the rows of this SQL, each containing id, timestamp and value, to RDF triples using a template.

from chrontext import Prefix, Variable, Template, Parameter, RDFType, Triple, XSD
ct = Prefix("ct", "https://github.com/DataTreehouse/chrontext#")
xsd = XSD()
id = Variable("id")
timestamp = Variable("timestamp")
value = Variable("value")
dp = Variable("dp")
resources = {
    "my_resource": Template(
        iri=ct.suf("my_resource"),
        parameters=[
            Parameter(id, rdf_type=RDFType.Literal(xsd.string)),
            Parameter(timestamp, rdf_type=RDFType.Literal(xsd.dateTime)),
            Parameter(value, rdf_type=RDFType.Literal(xsd.double)),
        ],
        instances=[
            Triple(id, ct.suf("hasDataPoint"), dp),
            Triple(dp, ct.suf("hasValue"), value),
            Triple(dp, ct.suf("hasTimestamp"), timestamp)
        ]
)}

This means that our instance ex:myWidget1 will be associated with a value and a timestamp (and a blank data point) for each row in ts1.csv. For instance, the first row means we have:

ex:myWidget1 ct:hasDataPoint _:b1 .
_:b1 ct:hasTimestamp "2022-06-01T08:46:52Z"^^xsd:dateTime .
_:b1 ct:hasValue 1 .

Chrontext is built for the cases where this would be infeasibly many triples, so instead of materializing them we query them on demand.

Creating the engine and querying:

The context for our analytical data (e.g. a model of an industrial asset) has to be stored in a SPARQL endpoint. In this case, we use an embedded Oxigraph engine that comes with chrontext. Now we assemble the pieces and create the engine.

from chrontext import Engine, SparqlEmbeddedOxigraph
oxigraph_store = SparqlEmbeddedOxigraph(rdf_file="my_graph.ttl", path="oxigraph_db_tutorial")
engine = Engine(
    resources,
    virtualized_python_database=vdb,
    sparql_embedded_oxigraph=oxigraph_store)
engine.init()

Now we can use our context to query the dataset. The aggregations below are pushed down into DuckDB. The example below is a bit simple, but complex conditions can be used to identify ?w and ?s.

q = """
    PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
    PREFIX chrontext:<https://github.com/DataTreehouse/chrontext#>
    PREFIX types:<http://example.org/types#>
    SELECT ?w (SUM(?v) as ?sum_v) WHERE {
        ?w types:hasSensor ?s .
        ?s a types:ThingCounter .
        ?s chrontext:hasTimeseries ?ts .
        ?ts chrontext:hasDataPoint ?dp .
        ?dp chrontext:hasTimestamp ?t .
        ?dp chrontext:hasValue ?v .
        FILTER(?t > "2022-06-01T08:46:53Z"^^xsd:dateTime) .
    } GROUP BY ?w
    """
df = engine.query(q)
print(df)

This produces the following result:

w sum_v
str decimal[38,0]
http://example.org/case#myWidget1 1215
http://example.org/case#myWidget2 1216

Roadmap in brief

Let us know if you have suggestions!

Stabilization

Chrontext will be put into use in the energy industry in the coming period, and will be stabilized as part of this process. We are very interested in your bug reports!

Support for Azure Data Explorer / KustoQL

We are likely adding support for ADX/KustoQL. Let us know if this is something that would be useful for you.

Support for Databricks SQL

We are likely adding support for Databricks SQL as the virtualization backend.

Generalization to analytical data (not just time series!)

While chrontext is currently focused on time series data, we are incrementally adding support for contextualization of arbitrary analytical data.

Support for multiple databases

Currently, we only support one database backend at a given time. We plan to support hybrid queries across multiple virtualized databases.

References

Chrontext is joint work by Magnus Bakken and Professor Ahmet Soylu at OsloMet. To read more about Chrontext, see the article Chrontext: Portable SPARQL Queries Over Contextualised Time Series Data in Industrial Settings.

License

All code produced since August 1st, 2023 is copyrighted to Data Treehouse AS with an Apache 2.0 license unless otherwise noted.

All code produced before August 1st, 2023 is copyrighted to Prediktor AS with an Apache 2.0 license unless otherwise noted, and was financed by The Research Council of Norway (grant no. 316656) and Prediktor AS as part of a PhD degree. The code at that state is archived in the repository at https://github.com/DataTreehouse/chrontext.
