YAML-config-driven MLflow tracking data to RDF knowledge graphs with MLSO ontology alignment
MLflow to RDF Converter
YAML-config-driven MLflow tracking data to RDF knowledge graphs — aligned with the MLSO ontology.
Overview
mlflow2rdf converts MLflow experiment/run/parameter/metric data into RDF triples using declarative YAML mappings. It supports two modes:
- RML Standard (recommended): W3C RML-compliant, aligned with MLSea
- Custom YAML: Simplified configuration for quick use
Project Structure
mlflow-to-rdf/
├── config/
│ ├── rml_mappings.yaml # RML standard mappings (recommended) ⭐
│ ├── mappings.yaml # Custom declarative mappings
│ ├── sources.yaml # MLflow data source config
│ └── validation.yaml # SHACL validation rules
├── src/
│ ├── converter_rml.py # RML standard converter ⭐
│ ├── rml_engine.py # RML engine wrapper
│ ├── data_collector.py # MLflow data collector
│ ├── converter.py # Main converter (CLI entry point)
│ ├── engine.py # Declarative conversion engine
│ ├── validators.py # SHACL validator
│ └── utils.py # Utilities
├── docs/
│ ├── PROJECT_DOCUMENTATION.md
│ ├── COMPARISON_ANALYSIS.md
│ └── RML_IMPLEMENTATION_SUMMARY.md
├── tests/
├── examples/
└── data/
Quick Start
Installation
pip install mlflow2rdf
Basic Usage (CLI)
# Point to your MLflow tracking server
mlflow2rdf --mlflow-uri http://localhost:5000 --output output.ttl
Programmatic Usage
from mlflow2rdf import DeclarativeConverter, SHACLValidator
# Initialize converter with YAML configs
converter = DeclarativeConverter(
    sources_config_path='config/sources.yaml',
    mappings_config_path='config/mappings.yaml'
)
# Execute transformation
converter.convert()
# Validate against SHACL shapes
validator = SHACLValidator('config/validation.yaml')
result = validator.validate(converter.graph)
# Save RDF output
converter.save('output.ttl')
Configuration
config/sources.yaml — Data Source
mlflow:
  type: mlflow
  uri: http://localhost:5000 # MLflow tracking server URI
  api_version: 2.0
extraction:
  experiments: { enabled: true }
  runs: { enabled: true, experiment_ids: ["0"] }
  params: { enabled: true }
  metrics: { enabled: true }
  tags: { enabled: true }
output:
  format: turtle
  path: ./data/output.ttl
namespaces:
  mlso: http://example.org/mlso/
  prov: http://www.w3.org/ns/prov#
  dcterms: http://purl.org/dc/terms/
  rdfs: http://www.w3.org/2000/01/rdf-schema#
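As a rough illustration of how a source config like this could drive data collection, the sketch below mirrors the YAML as a Python dictionary, lists the enabled entity types, and builds the request for MLflow's REST `runs/search` endpoint without sending it. The helper names `enabled_entities` and `runs_search_request` are hypothetical and are not part of mlflow2rdf; the top-level layout of the config is assumed, and `tags` is switched off here only to show the filtering.

```python
# Illustrative only: a plain-dict mirror of sources.yaml (layout assumed).
config = {
    "mlflow": {"uri": "http://localhost:5000", "api_version": "2.0"},
    "extraction": {
        "experiments": {"enabled": True},
        "runs": {"enabled": True, "experiment_ids": ["0"]},
        "params": {"enabled": True},
        "metrics": {"enabled": True},
        "tags": {"enabled": False},  # flipped to False to demonstrate filtering
    },
}

def enabled_entities(cfg):
    """Return the entity types whose 'enabled' flag is true, in config order."""
    return [name for name, opts in cfg["extraction"].items() if opts.get("enabled")]

def runs_search_request(cfg):
    """Build (url, json_body) for MLflow's REST runs/search call; nothing is sent."""
    base = cfg["mlflow"]["uri"].rstrip("/")
    url = f"{base}/api/{cfg['mlflow']['api_version']}/mlflow/runs/search"
    body = {"experiment_ids": cfg["extraction"]["runs"]["experiment_ids"]}
    return url, body

entities = enabled_entities(config)
url, body = runs_search_request(config)
print(entities)
print(url, body)
```

The request is only constructed, so the sketch runs without a tracking server. How the package's own collector talks to MLflow may differ.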
config/mappings.yaml — Declarative Mapping Rules
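Maps MLflow entities to ontology terms. Most rows are MLSO classes; the Tag row maps to the Dublin Core property dcterms:hasPart rather than to a class:
| MLflow Entity | Target Term |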
|---|---|
| Experiment | mlso:Experiment |
| Run | mlso:Run |
| Parameter | mlso:HyperParameterSetting |
| Metric | mlso:Metric |
| Tag | dcterms:hasPart |
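To make the mapping table concrete, here is a minimal sketch in plain Python of how a declarative class mapping could turn MLflow records into N-Triples. It is not the package's engine: the `CLASS_MAP` dictionary, the `to_ntriples` helper, the `http://example.org/mlflow/` subject prefix, and the sample records are hypothetical. Only the class names and the placeholder `mlso` namespace come from the table and the source config.

```python
# Illustrative only: a hand-rolled stand-in for a declarative mapping engine.
MLSO = "http://example.org/mlso/"      # placeholder namespace from sources.yaml
BASE = "http://example.org/mlflow/"    # hypothetical prefix for subject IRIs
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# MLflow entity kind -> MLSO class local name (taken from the table)
CLASS_MAP = {
    "experiment": "Experiment",
    "run": "Run",
    "param": "HyperParameterSetting",
    "metric": "Metric",
}

def to_ntriples(kind, identifier):
    """Return one N-Triples line typing the entity with its mapped class."""
    subject = f"<{BASE}{kind}/{identifier}>"
    return f"{subject} <{RDF_TYPE}> <{MLSO}{CLASS_MAP[kind]}> ."

# Made-up sample records: (entity kind, MLflow identifier)
records = [("experiment", "0"), ("run", "abc123")]
lines = [to_ntriples(kind, ident) for kind, ident in records]
print("\n".join(lines))
```

Because the rules live in a dictionary rather than in branching code, adding a new entity type only means adding a new entry, which is the same idea the YAML mappings rely on.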
config/validation.yaml — SHACL Shapes
Defines constraints (minCount, maxCount, datatype, allowed values) for each MLSO class. Custom SPARQL rules can be added for cross-entity validation.
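The cardinality constraints mentioned above can be illustrated with a small sketch in plain Python. It only mimics the idea behind `sh:minCount` and `sh:maxCount` and is not a SHACL engine, nor the package's `SHACLValidator`. The sample triples, the `shape` dictionary, and the `check_cardinality` helper are all hypothetical.

```python
# Illustrative only: mimics the idea of sh:minCount / sh:maxCount, not a SHACL engine.
from collections import Counter

# Triples as (subject, predicate, object) tuples; identifiers are made up.
triples = [
    ("run:1", "rdf:type", "mlso:Run"),
    ("run:1", "prov:startedAtTime", "2024-01-01T00:00:00Z"),
    ("run:2", "rdf:type", "mlso:Run"),
]

# Hypothetical shape: every mlso:Run needs exactly one prov:startedAtTime.
shape = {
    "target_class": "mlso:Run",
    "path": "prov:startedAtTime",
    "min_count": 1,
    "max_count": 1,
}

def check_cardinality(triples, shape):
    """Return subjects of the target class whose value count is out of range."""
    targets = [s for s, p, o in triples
               if p == "rdf:type" and o == shape["target_class"]]
    counts = Counter(s for s, p, o in triples if p == shape["path"])
    return [s for s in targets
            if not shape["min_count"] <= counts[s] <= shape["max_count"]]

violations = check_cardinality(triples, shape)
print(violations)  # ['run:2']
```

Here `run:2` is reported because it has no start time. A real SHACL validator would additionally check datatypes and allowed values and produce a structured validation report.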
Features
- ✅ Declarative mappings — all transformation rules in YAML, no code changes needed
- ✅ RML standard support — W3C RML-compliant with YARRRML input
- ✅ MLSO ontology alignment — output resources are typed with MLSO vocabulary terms
- ✅ SHACL validation — ensures that RDF output conforms to ontology constraints
- ✅ Multiple RDF serializations — Turtle, N3, JSON-LD, RDF/XML
- ✅ CLI and library — use as a tool or import as a package
Documentation
- README_RML.md — RML standard usage guide
- docs/PROJECT_DOCUMENTATION.md — Full technical documentation
- docs/COMPARISON_ANALYSIS.md — Analysis vs. MLSea
- docs/RML_IMPLEMENTATION_SUMMARY.md — RML implementation summary
Related Resources
- MLSea Paper: "MLSea: A Semantic Layer for Discoverable Machine Learning"
- MLSO GitHub: https://github.com/dtai-kg/MLSO
- RML Spec: https://rml.io/specs/rml/
- YARRRML Spec: https://rml.io/yarrrml/spec/
Author: Jason Jia
Version: 0.1.1