
Python library to build Event Knowledge Graphs

Project description

Using Graph Databases to Create an Event Knowledge Graph

Description

This repository collects queries for modeling and importing incomplete event data as Event Knowledge Graphs using the Labeled Property Graph data model of graph databases. All scripts and queries are licensed under LGPL v3.0, see LICENSE. Copyright information is provided within each Project.

Requirements

This repository is intended to be used as a submodule in another project that builds an event knowledge graph. To add the submodule, run the following command in the root folder of your project:

    git submodule add https://github.com/Ava-S/ekg_creator

For more information about submodules, see Git submodules. For example projects that use this repository, see EKG BPI Challenges, EKG Inferring missing identifiers, and EKG for AutoTwin EU GA n. 101092021.

Furthermore, we assume the following packages and databases are installed.

Neo4j

Install the neo4j-python-driver with either of the following commands:

    pip install neo4j
    conda install -c conda-forge neo4j-python-driver

Install Neo4j:

Other packages

  • numpy
  • pandas
  • tabulate
  • tqdm
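
Before running the scripts, it can help to check that the packages listed above are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of this repository.

```python
from importlib.util import find_spec

# Import names of the packages this README lists as requirements.
REQUIRED = ["neo4j", "numpy", "pandas", "tabulate", "tqdm"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Any names reported as missing can then be installed with pip or conda.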

Get started

Create a new graph database

  • The scripts in this release assume the password "12345678".
  • The scripts assume the server is available at the default URL bolt://localhost:7687.
    • You can also change this in the script.
  • Make sure to allocate enough memory to your database; advised: dbms.memory.heap.max_size=5G
  • The scripts expect the Neo4j APOC library to be installed as a plugin, see https://neo4j.com/labs/apoc/
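
The settings above can be checked with a short connectivity test before running any script. This is a minimal sketch, assuming the neo4j Python driver is installed and that the database user is "neo4j"; the function name apoc_version is hypothetical and not part of this repository.

```python
DEFAULT_URI = "bolt://localhost:7687"  # default URL the scripts assume
DEFAULT_USER = "neo4j"                 # assumption: default Neo4j user name
DEFAULT_PASSWORD = "12345678"          # password the scripts assume


def apoc_version(uri=DEFAULT_URI, user=DEFAULT_USER, password=DEFAULT_PASSWORD):
    """Connect to Neo4j and return the installed APOC version as a string."""
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        driver.verify_connectivity()  # raises if the server is unreachable
        with driver.session() as session:
            record = session.run("RETURN apoc.version() AS version").single()
            return record["version"]
```

Calling apoc_version() with no arguments uses the defaults listed above. It raises an error if the server cannot be reached, if the credentials are rejected, or if the APOC plugin is not installed.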

Projects

The following projects are part of this repository

Missing Case Identifiers Inference

Method to infer missing case identifiers in event data by exploiting knowledge about the activities and their locations.

Semantic Header (json files)

First version of the semantic header for system/event knowledge graphs: https://multiprocessmining.org/2022/10/26/data-storage-vs-data-semantics-for-object-centric-event-data/

Event Knowledge Graphs

Data model and generic query templates for translating and integrating a set of related CSV event logs into a single event graph over multiple behavioral dimensions, stored as a labeled property graph in Neo4j. See csv_to_eventgraph_neo4j/README.txt
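
To illustrate what a behavioral dimension can look like, the sketch below derives "directly follows" pairs between consecutive events of the same entity. It runs in plain Python on in-memory records and does not use the repository's query templates; the field names id, case and timestamp are assumptions made for this example.

```python
from collections import defaultdict


def directly_follows(events, entity_key="case", time_key="timestamp"):
    """Return (earlier_id, later_id) pairs of consecutive events per entity."""
    by_entity = defaultdict(list)
    for event in events:
        by_entity[event[entity_key]].append(event)

    pairs = []
    for entity_events in by_entity.values():
        ordered = sorted(entity_events, key=lambda e: e[time_key])
        # Pair each event with its successor within the same entity.
        pairs.extend((a["id"], b["id"]) for a, b in zip(ordered, ordered[1:]))
    return pairs


# Hypothetical records, as they might be read from a CSV event log.
events = [
    {"id": "e1", "case": "c1", "timestamp": 1},
    {"id": "e3", "case": "c1", "timestamp": 5},
    {"id": "e2", "case": "c2", "timestamp": 2},
    {"id": "e4", "case": "c1", "timestamp": 3},
]
print(directly_follows(events))  # [('e1', 'e4'), ('e4', 'e3')]
```

Running the same function with a different entity_key (for example a resource or location column) would give the pairs for another dimension over the same events.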

Publications:

Scripts of submodule

Main script

One script (the orchestrator) is used by applications to create an Event Knowledge Graph. This script makes use of this submodule.

Data_managers

  • data_managers/datastructures.py --> transforms the JSON file describing the different datasets into a class with additional methods
  • data_managers/semantic_header.py --> transforms the JSON file describing the semantic header into a class with additional methods
  • data_managers/interpreters.py --> class that contains information about the query language in which the semantic header and data structures should be interpreted
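
The general pattern of turning a JSON description into a class with additional methods can be sketched as follows. The class name DatasetDescription and the JSON keys name and file_names are hypothetical; the actual schema and classes are defined in data_managers.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DatasetDescription:
    """Hypothetical stand-in for a class built from a dataset JSON file."""
    name: str
    file_names: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw):
        # Hypothetical keys; the real JSON schema is defined by the repository.
        return cls(name=raw["name"], file_names=list(raw.get("file_names", [])))

    def has_files(self):
        """Example of an additional method on top of the raw JSON."""
        return len(self.file_names) > 0


raw = json.loads('{"name": "example_log", "file_names": ["events.csv"]}')
dataset = DatasetDescription.from_dict(raw)
print(dataset.name, dataset.has_files())
```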

Database_managers

  • database_managers/authentication.py --> class containing the credentials to create a connection to the database. Local credentials are included. If you want to create a remote connection, add the following piece of code to a (gitignored) file.
remote = Credentials(
    uri="[your_uri]",
    user="neo4j",
    password="[your_password]"
)
  • database_managers/db_connection.py --> class responsible for making the connection to the database and to communicate with the database
  • database_managers/EventKnowledgeGraph.py --> class responsible for making (changes to) the EKG and to request data from the EKG. Makes use of several modules.
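
As an alternative to writing remote credentials into a file, they could be read from environment variables. The sketch below assumes a Credentials class with the three keyword arguments shown above and defines a stand-in for it; the real class in database_managers/authentication.py may differ, and the environment variable names are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Credentials:
    """Hypothetical stand-in mirroring the keyword arguments shown above."""
    uri: str
    user: str
    password: str


def credentials_from_env(env=os.environ):
    """Build Credentials from environment variables (names are assumptions)."""
    return Credentials(
        uri=env.get("EKG_NEO4J_URI", "bolt://localhost:7687"),
        user=env.get("EKG_NEO4J_USER", "neo4j"),
        password=env["EKG_NEO4J_PASSWORD"],  # no default: fails if unset
    )
```

With this approach the password never appears in the repository, so no gitignored file is needed.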

EKG_Modules

  • ekg_modules/db_management.py --> general module to manage the database
  • ekg_modules/data_importer.py --> imports the data stored in the records into the EKG
  • ekg_modules/ekg_builder_semantic_header.py --> creates the required nodes and relations as specified in the semantic header
  • ekg_modules/inference_engine.py --> module responsible for inferring missing information
  • ekg_modules/ekg_analysis.py --> module for analysis of the EKG (e.g. create process model)
  • ekg_modules/ekg_custom_module.py --> module to create custom queries, specific for this example

CypherQueries

Contains reusable pieces of Cypher queries for all necessary parts.

  • cypher_queries/query_translators --> translate semantic header and data structures into Cypher
  • cypher_queries/query_library --> contains all cypher queries for the EKG modules
  • cypher_queries/custom_query_library --> contains all custom cypher queries for this example for the EKG modules
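
A reusable query can be represented as Cypher text plus a parameter dictionary. The following is a minimal sketch of that idea; the Query class, the function count_by_property, and the label and property names in the usage line are hypothetical and do not reflect the actual contents of cypher_queries.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical container: Cypher text plus its parameters."""
    text: str
    parameters: dict = field(default_factory=dict)


def count_by_property(label, prop, value):
    """Build a query counting nodes with `label` whose `prop` equals `value`."""
    # The label and property key are inserted into the query text, so they
    # are validated first; the value is passed as a $parameter.
    for name in (label, prop):
        if not name.isidentifier():
            raise ValueError(f"Invalid identifier: {name!r}")
    text = f"MATCH (n:{label}) WHERE n.{prop} = $value RETURN count(n) AS total"
    return Query(text=text, parameters={"value": value})


query = count_by_property("Event", "activity", "A")
print(query.text)
print(query.parameters)
```

With the neo4j driver, such a query could be executed as session.run(query.text, query.parameters).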

