Library for executable ML pipelines represented by KGs.

Python library for conveniently constructing and executing Machine Learning (ML) pipelines represented by Knowledge Graphs (KGs). It features a coding interface and a CLI, and allows the user to:

  1. Construct an ML pipeline that gets a CSV as input and processes the data using any of the available tasks and methods.
  2. Save the constructed pipeline as a KG in Turtle format.
  3. Execute the generated KG.
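
To make step 2 more concrete, the sketch below writes a small linear pipeline as Turtle triples using only the Python standard library. It is an illustration of what a pipeline-as-KG file can look like, not ExeKGLib's own serialization: the `ex:` prefix, the `http://example.org/pipeline#` IRI, and the `Pipeline`, `hasStartTask`, and `hasNextTask` terms are hypothetical placeholders rather than terms from the ExeKGLib schemata.

```python
from typing import List, Tuple

# Hypothetical namespace; ExeKGLib's real schemata define their own IRIs.
PREFIX = "ex"
BASE_IRI = "http://example.org/pipeline#"

Triple = Tuple[str, str, str]


def pipeline_triples(pipeline: str, tasks: List[str]) -> List[Triple]:
    """Describe a linear pipeline: a start task, then task-to-task links."""
    if not tasks:
        raise ValueError("a pipeline needs at least one task")
    triples: List[Triple] = [
        (pipeline, "a", "Pipeline"),
        (pipeline, "hasStartTask", tasks[0]),
    ]
    for current, following in zip(tasks, tasks[1:]):
        triples.append((current, "hasNextTask", following))
    return triples


def to_turtle(triples: List[Triple]) -> str:
    """Serialize triples as one Turtle statement per line."""

    def term(name: str) -> str:
        # "a" is Turtle shorthand for rdf:type and takes no prefix.
        return name if name == "a" else f"{PREFIX}:{name}"

    lines = [f"@prefix {PREFIX}: <{BASE_IRI}> .", ""]
    for subject, predicate, obj in triples:
        lines.append(f"{term(subject)} {term(predicate)} {term(obj)} .")
    return "\n".join(lines) + "\n"


turtle_text = to_turtle(
    pipeline_triples("MLPipeline", ["DataSplitting1", "Train1", "Test1"])
)
print(turtle_text)
```

In practice the library generates and saves the KG for you; the point here is only that each task becomes a node and the execution order becomes links between nodes.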

The coding interface is demonstrated with three sample Python files. The pipelines represented by the generated sample KGs are briefly explained below:

  1. ML pipeline: Loads features and labels from an input CSV dataset, splits the data, trains and tests a k-NN model, and visualizes the prediction errors.
  2. Statistics pipeline: Loads a feature from an input CSV dataset, normalizes it, and plots its values (before and after normalization) using a scatter plot.
  3. Visualization pipeline: Loads a feature from an input CSV dataset and plots its values using a line plot.

Under the hood, ExeKGLib uses well-known Python libraries such as pandas, matplotlib, and scikit-learn for data processing, visualization, and prediction.
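
As a rough picture of what the sample ML pipeline computes, the sketch below performs the same stages (load data, split, train and test a k-NN model, compute prediction errors) by calling scikit-learn directly. It does not use the ExeKGLib API. The dataset sizes, the choice of a k-NN classifier with `n_neighbors=5`, the 75/25 split, and the definition of error as the absolute difference between real and predicted labels are all assumptions made for illustration, and the plotting stage is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic features and labels standing in for the input CSV (sizes are arbitrary).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Data splitting stage: hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train stage: fit a k-NN model and keep its predictions on the training data.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
pred_train = model.predict(X_train)

# Test stage: predict labels for the held-out rows.
pred_test = model.predict(X_test)

# Performance calculation stage (assumed definition): per-sample absolute error.
train_err = np.abs(y_train - pred_train)
test_err = np.abs(y_test - pred_test)

print("mean train error:", train_err.mean())
print("mean test error:", test_err.mean())
```

The two error vectors are what a visualization stage would then plot.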

ExeKGLib is part of the following paper submitted to ISWC 2023:
Klironomos A., Zhou B., Tan Z., Zheng Z., Gad-Elrab M., Paulheim H., Kharlamov E.: ExeKGLib: A Python Library for Machine Learning Analytics based on Knowledge Graphs

Detailed information about ExeKGLib (installation, documentation, etc.) can be found on its website; basic information is given below.


Installation

To install, run pip install exe-kg-lib.

For detailed installation instructions, refer to the installation page of ExeKGLib's website.

Ready-to-use ML-related tasks and methods

| KG schema (abbreviation) | Task | Method | Properties | Input (data structure) | Output (data structure) | Implemented by Python class |
|---|---|---|---|---|---|---|
| Machine Learning (ml) | Train | KNNTrain | - | DataInTrainX (Matrix or Vector), DataInTrainY (Matrix or Vector) | DataOutPredictedValueTrain (Matrix or Vector), DataOutTrainModel (SingleValue) | |
| Machine Learning (ml) | Train | MLPTrain | - | DataInTrainX (Matrix or Vector), DataInTrainY (Matrix or Vector) | DataOutPredictedValueTrain (Matrix or Vector), DataOutTrainModel (SingleValue) | |
| Machine Learning (ml) | Train | LRTrain | - | DataInTrainX (Matrix or Vector), DataInTrainY (Matrix or Vector) | DataOutPredictedValueTrain (Matrix or Vector), DataOutTrainModel (SingleValue) | |
| Machine Learning (ml) | Test | KNNTest | - | DataInTestModel (SingleValue), DataInTestX (Matrix or Vector) | DataOutPredictedValueTest (Matrix or Vector) | TestKNNTest |
| Machine Learning (ml) | Test | MLPTest | - | DataInTestModel (SingleValue), DataInTestX (Matrix or Vector) | DataOutPredictedValueTest (Matrix or Vector) | TestMLPTest |
| Machine Learning (ml) | Test | LRTest | - | DataInTestModel (SingleValue), DataInTestX (Matrix or Vector) | DataOutPredictedValueTest (Matrix or Vector) | TestLRTest |
| Machine Learning (ml) | PerformanceCalculation | PerformanceCalculationMethod | - | DataInTrainRealY (Matrix or Vector), DataInTrainPredictedY (Matrix or Vector), DataInTestPredictedY (Matrix or Vector), DataInTestRealY (Matrix or Vector) | DataOutMLTestErr (Vector), DataOutMLTrainErr (Vector) | |
| Machine Learning (ml) | Concatenation | ConcatenationMethod | - | DataInConcatenation (list of Vector) | DataOutConcatenatedData (Matrix) | ConcatenationConcatenationMethod |
| Machine Learning (ml) | DataSplitting | DataSplittingMethod | - | DataInDataSplittingX (Matrix or Vector), DataInDataSplittingY (Matrix or Vector) | DataOutSplittedTestDataX (Matrix or Vector), DataOutSplittedTrainDataY (Matrix or Vector), DataOutSplittedTrainDataX (Matrix or Vector), DataOutSplittedTestDataY (Matrix or Vector) | |
| Visualization (visu) | CanvasTask | CanvasMethod | hasCanvasName (string), hasLayout (string) | - | - | CanvasTaskCanvasMethod |
| Visualization (visu) | PlotTask | LineplotMethod | hasLineStyle (string), hasLineWidth (int), hasLegendName (string) | DataInVector (Vector) | - | PlotTaskLineplotMethod |
| Visualization (visu) | PlotTask | ScatterplotMethod | hasLineStyle (string), hasLineWidth (int), hasScatterSize (int), hasLegendName (string) | DataInVector (Vector) | - | PlotTaskScatterplotMethod |
| Statistics (stats) | TrendCalculationTask | TrendCalculationMethod | - | DataInTrendCalculation (Vector) | DataOutTrendCalculation (Vector) | TrendCalculationTaskTrendCalculationMethod |
| Statistics (stats) | NormalizationTask | NormalizationMethod | - | DataInNormalization (Vector) | DataOutNormalization (Vector) | NormalizationTaskNormalizationMethod |
| Statistics (stats) | ScatteringCalculationTask | ScatteringCalculationMethod | - | DataInScatteringCalculation (Vector) | DataOutScatteringCalculation (Vector) | ScatteringCalculationTaskScatteringCalculationMethod |


Creating an ML pipeline

  • Via code: See the provided examples. To fetch them to your working directory for easy access, run typer exe_kg_lib.cli.main run get-examples.
  • Step-by-step via CLI: Run typer exe_kg_lib.cli.main run create-pipeline.

Executing an ML pipeline

  • Via code: See example code.
  • Via CLI: Run typer exe_kg_lib.cli.main run run-pipeline <pipeline_path>.
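
If you want to trigger the CLI from a script, the documented command can be wrapped with `subprocess`. The sketch below only assembles and launches the command line quoted above; the helper names (`run_pipeline_command`, `run_pipeline`) and the example path are hypothetical, and running it requires exe-kg-lib to be installed in the active environment.

```python
import shlex
import subprocess
from typing import List


def run_pipeline_command(pipeline_path: str) -> List[str]:
    """Build the argument list for the documented run-pipeline CLI call."""
    return ["typer", "exe_kg_lib.cli.main", "run", "run-pipeline", pipeline_path]


def run_pipeline(pipeline_path: str) -> int:
    """Run the CLI and return its exit code (needs exe-kg-lib installed)."""
    completed = subprocess.run(run_pipeline_command(pipeline_path), check=False)
    return completed.returncode


# Hypothetical path, used only to show the resulting command line.
command = run_pipeline_command("pipelines/my pipeline.ttl")
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems when the pipeline path contains spaces.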

Adding a new ML-related task and method

Extending ExeKGLib with a new task and method requires three steps:

  1. Selection of a relevant bottom-level KG schema (Statistics, ML, or Visualization) according to the type of the new task and method.
  2. Addition of new semantic components (entities, properties, etc) to the selected KG schema.
  3. Addition of a Python class to the corresponding module of the exe_kg_lib.classes.tasks package.

For steps 2 and 3, refer to the relevant page of ExeKGLib's website.
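
Where the table of ready-to-use tasks lists a class, its name is the task name followed by the method name (for example, PlotTask and LineplotMethod give PlotTaskLineplotMethod). Assuming a new class follows the same pattern, step 3 might look roughly like the sketch below. Everything specific in it is hypothetical: the task and method names (`SmoothingTask`, `MovingAverageMethod`), the `run` method and its signature, and the lack of a base class. The actual base class and method contract are described on ExeKGLib's website.

```python
from typing import List


def class_name_for(task: str, method: str) -> str:
    """Follow the naming pattern visible in the table: <Task><Method>."""
    return f"{task}{method}"


class SmoothingTaskMovingAverageMethod:
    """Hypothetical Statistics task: moving average over a Vector."""

    def __init__(self, window: int = 3) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window

    def run(self, data_in: List[float]) -> List[float]:
        """Return the mean of each full window (assumed signature)."""
        w = self.window
        return [sum(data_in[i:i + w]) / w for i in range(len(data_in) - w + 1)]


smoothed = SmoothingTaskMovingAverageMethod(window=2).run([1.0, 3.0, 5.0, 7.0])
print(smoothed)
```

The matching semantic components from step 2 (a task entity, a method entity, and their input and output data entities) would carry the same task and method names in the selected KG schema.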


See the Code Reference and Development sections of ExeKGLib's website.

External resources

KG schemata

The above KG schemata are included in the ExeKGOntology repository.

Dataset used in code examples

The dataset was generated using the sklearn.datasets.make_classification() function of the scikit-learn Python library.


ExeKGLib is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.

Download files

Source Distribution: exe_kg_lib-2.1.1.tar.gz (39.6 kB)

Built Distribution: exe_kg_lib-2.1.1-py3-none-any.whl (49.3 kB)