MLflow to RDF Converter

Converts MLflow tracking data into RDF knowledge graphs, driven by YAML configuration and aligned with the MLSO ontology.


Overview

mlflow2rdf converts MLflow experiment/run/parameter/metric data into RDF triples using declarative YAML mappings. It supports two modes:

  • RML Standard (recommended): W3C RML-compliant, aligned with MLSea
  • Custom YAML: Simplified configuration for quick use
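
To make the conversion concrete, the following is a minimal standard-library sketch of how a single run record could be turned into RDF triples in N-Triples syntax. It is not mlflow2rdf's implementation: the run_to_ntriples function, the http://example.org/run/ base IRI, and the param/ and metric/ property IRIs are all assumptions made for illustration, and the MLSO namespace is the placeholder used in config/sources.yaml.

```python
from urllib.parse import quote

MLSO = "http://example.org/mlso/"  # placeholder namespace, as in config/sources.yaml
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def run_to_ntriples(run: dict, base: str = "http://example.org/run/") -> list[str]:
    """Return one N-Triples line per fact about the run (hypothetical layout)."""
    subject = f"<{base}{quote(run['run_id'], safe='')}>"
    lines = [f"{subject} <{RDF_TYPE}> <{MLSO}Run> ."]
    # Parameters are kept as plain string literals.
    for name, value in sorted(run.get("params", {}).items()):
        key = quote(name, safe="")
        lines.append(f'{subject} <{MLSO}param/{key}> "{escape_literal(str(value))}" .')
    # Metrics become xsd:double typed literals.
    for name, value in sorted(run.get("metrics", {}).items()):
        key = quote(name, safe="")
        lines.append(f'{subject} <{MLSO}metric/{key}> "{float(value)}"^^<{XSD_DOUBLE}> .')
    return lines


if __name__ == "__main__":
    example = {"run_id": "abc123", "params": {"lr": "0.01"}, "metrics": {"accuracy": 0.93}}
    print("\n".join(run_to_ntriples(example)))
```

In the actual tool these rules live in the YAML mappings rather than in Python; the sketch only shows the shape of the output.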

Project Structure

mlflow-to-rdf/
├── config/
│   ├── rml_mappings.yaml   # RML standard mappings (recommended) ⭐
│   ├── mappings.yaml        # Custom declarative mappings
│   ├── sources.yaml         # MLflow data source config
│   └── validation.yaml      # SHACL validation rules
├── src/
│   ├── converter_rml.py     # RML standard converter ⭐
│   ├── rml_engine.py       # RML engine wrapper
│   ├── data_collector.py   # MLflow data collector
│   ├── converter.py        # Main converter (CLI entry point)
│   ├── engine.py           # Declarative conversion engine
│   ├── validators.py       # SHACL validator
│   └── utils.py            # Utilities
├── docs/
│   ├── PROJECT_DOCUMENTATION.md
│   ├── COMPARISON_ANALYSIS.md
│   └── RML_IMPLEMENTATION_SUMMARY.md
├── tests/
├── examples/
└── data/

Quick Start

Installation

pip install mlflow2rdf

Basic Usage (CLI)

# Point to your MLflow tracking server
mlflow2rdf --mlflow-uri http://localhost:5000 --output output.ttl

Programmatic Usage

from mlflow2rdf import DeclarativeConverter, SHACLValidator

# Initialize converter with YAML configs
converter = DeclarativeConverter(
    sources_config_path='config/sources.yaml',
    mappings_config_path='config/mappings.yaml'
)

# Execute transformation
converter.convert()

# Validate against SHACL shapes
validator = SHACLValidator('config/validation.yaml')
result = validator.validate(converter.graph)

# Save RDF output
converter.save('output.ttl')

Configuration

config/sources.yaml — Data Source

mlflow:
  type: mlflow
  uri: http://localhost:5000   # MLflow tracking server URI
  api_version: 2.0
  extraction:
    experiments: { enabled: true }
    runs: { enabled: true, experiment_ids: ["0"] }
    params: { enabled: true }
    metrics: { enabled: true }
    tags: { enabled: true }

output:
  format: turtle
  path: ./data/output.ttl
  namespaces:
    mlso:   http://example.org/mlso/
    prov:   http://www.w3.org/ns/prov#
    dcterms: http://purl.org/dc/terms/
    rdfs:   http://www.w3.org/2000/01/rdf-schema#
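
As a rough illustration of how the extraction block might be consumed, the sketch below works on the dictionary a YAML parser would produce from the file above and lists the entity types whose enabled flag is set. The enabled_entities helper is hypothetical and is not part of mlflow2rdf. Note that a YAML parser reads the unquoted api_version: 2.0 as a float; quote it if a string is intended.

```python
def enabled_entities(config: dict) -> list[str]:
    """Return the names under mlflow.extraction whose 'enabled' flag is true."""
    extraction = config.get("mlflow", {}).get("extraction", {})
    return [name for name, opts in extraction.items() if opts.get("enabled", False)]


# The parsed form of the sources.yaml example, written out by hand so that
# the sketch needs no YAML library.
config = {
    "mlflow": {
        "type": "mlflow",
        "uri": "http://localhost:5000",
        "api_version": 2.0,
        "extraction": {
            "experiments": {"enabled": True},
            "runs": {"enabled": True, "experiment_ids": ["0"]},
            "params": {"enabled": True},
            "metrics": {"enabled": True},
            "tags": {"enabled": True},
        },
    }
}

if __name__ == "__main__":
    print(enabled_entities(config))
```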

config/mappings.yaml — Declarative Mapping Rules

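Maps each MLflow entity to an RDF term. Most targets are MLSO classes; Tag maps to dcterms:hasPart, which is a Dublin Core property rather than a class.

MLflow Entity    RDF Term
Experiment       mlso:Experiment
Run              mlso:Run
Parameter        mlso:HyperParameterSetting
Metric           mlso:Metric
Tag              dcterms:hasPart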
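
The mapping above uses prefixed names, which have to be expanded against the namespaces declared in config/sources.yaml before they can be written as full IRIs. The following sketch shows one way to do that lookup; the ENTITY_TERMS dictionary and the expand_curie and term_for helpers are illustrative names, not the package's API.

```python
NAMESPACES = {
    "mlso": "http://example.org/mlso/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Entity-to-term table copied from the mapping above.
ENTITY_TERMS = {
    "experiment": "mlso:Experiment",
    "run": "mlso:Run",
    "parameter": "mlso:HyperParameterSetting",
    "metric": "mlso:Metric",
    "tag": "dcterms:hasPart",
}


def expand_curie(curie: str, namespaces: dict) -> str:
    """Expand 'prefix:local' into a full IRI using the declared namespaces."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in namespaces:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return namespaces[prefix] + local


def term_for(entity_kind: str) -> str:
    """Look up the full IRI of the term mapped to an MLflow entity kind."""
    return expand_curie(ENTITY_TERMS[entity_kind.lower()], NAMESPACES)


if __name__ == "__main__":
    for kind in ENTITY_TERMS:
        print(kind, "->", term_for(kind))
```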

config/validation.yaml — SHACL Shapes

Defines constraints (minCount, maxCount, datatype, allowed values) for each MLSO class. Custom SPARQL rules can be added for cross-entity validation.
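
To show what the minCount, maxCount and datatype constraints mean in practice, here is a simplified plain-Python stand-in that checks the values of one property on one node. It is not SHACL and not the package's validator: a real SHACL engine (pySHACL, for example) evaluates shapes against the whole graph. The check_property function and its arguments are assumptions for illustration.

```python
def check_property(values: list, min_count: int = 0, max_count=None, datatype=None) -> list[str]:
    """Return a list of human-readable violations; an empty list means conformance."""
    problems = []
    if len(values) < min_count:
        problems.append(f"expected at least {min_count} value(s), found {len(values)}")
    if max_count is not None and len(values) > max_count:
        problems.append(f"expected at most {max_count} value(s), found {len(values)}")
    if datatype is not None:
        for value in values:
            if not isinstance(value, datatype):
                problems.append(f"value {value!r} is not of type {datatype.__name__}")
    return problems


if __name__ == "__main__":
    # A metric is expected to carry exactly one floating-point value.
    print(check_property([0.93], min_count=1, max_count=1, datatype=float))
    print(check_property([], min_count=1, max_count=1, datatype=float))
```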


Features

  • Declarative mappings: all transformation rules live in YAML, so no code changes are needed
  • RML standard support: W3C RML-compliant, with YARRRML input
  • MLSO ontology alignment: output is typed with the MLSO vocabulary
  • SHACL validation: checks that RDF output conforms to ontology constraints
  • Multiple RDF serializations: Turtle, N3, JSON-LD, RDF/XML
  • CLI and library: use as a command-line tool or import as a package

Documentation

  • README_RML.md — RML standard usage guide
  • docs/PROJECT_DOCUMENTATION.md — Full technical documentation
  • docs/COMPARISON_ANALYSIS.md — Analysis vs. MLSea
  • docs/RML_IMPLEMENTATION_SUMMARY.md — RML implementation summary

Author: Jason Jia
Version: 0.1.1

