Powerful [R2]RML engine to create RDF knowledge graphs from heterogeneous data sources.
Morph-KGC is an engine that constructs RDF knowledge graphs from heterogeneous data sources with the R2RML and RML mapping languages. Morph-KGC is built on top of pandas and it leverages mapping partitions to significantly reduce execution times and memory consumption for large data sources.
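The idea behind mapping partitions can be illustrated with a toy sketch. This is not Morph-KGC's implementation: the rule format, the `partition_by_predicate` and `toy_materialize` helpers, and the choice to partition on the predicate alone are all assumptions made for illustration. The point it tries to show is that triples produced by different groups can never be equal, so duplicates only need to be tracked inside one group at a time.

```python
from collections import defaultdict


def partition_by_predicate(rules):
    """Group hypothetical mapping rules by their constant predicate."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule["predicate"]].append(rule)
    return dict(groups)


def toy_materialize(rules, rows):
    """Yield distinct triples, processing one partition at a time."""
    for predicate, group in partition_by_predicate(rules).items():
        # Holds only the triples of the current partition; it is rebuilt
        # for the next partition, so it never grows to the whole graph.
        seen = set()
        for rule in group:
            for row in rows:
                triple = (
                    rule["subject"].format(**row),
                    predicate,
                    str(row[rule["object_column"]]),
                )
                if triple not in seen:
                    seen.add(triple)
                    yield triple


# Illustrative rules and rows (made up for this sketch)
rules = [
    {"subject": "http://ex.org/person/{id}", "predicate": "ex:name", "object_column": "name"},
    {"subject": "http://ex.org/person/{id}", "predicate": "ex:age", "object_column": "age"},
]
rows = [
    {"id": 1, "name": "Ann", "age": 30},
    {"id": 1, "name": "Ann", "age": 30},  # duplicate row
    {"id": 2, "name": "Bob", "age": 41},
]
triples = list(toy_materialize(rules, rows))
```

Because the set of seen triples is scoped to a single partition, peak memory in this sketch depends on the largest partition rather than on the full graph.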
Features :sparkles:
- Supports the R2RML and RML mapping languages.
- Transformation functions with RML-FNML.
- RDF-star generation with RML-star.
- Human-friendly mappings with YARRRML.
- RML views over tabular data sources and JSON files.
- Input data formats:
  - Relational databases: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, SQLite.
  - Tabular files: CSV, TSV, Excel, Parquet, Feather, ORC, Stata, SAS, SPSS, ODS.
  - Hierarchical files: JSON, XML.
  - In-memory data structures: Python Dictionaries, DataFrames.
- Integration with RDFLib and Oxigraph.
- Remote data and mapping files.
- Optimized to materialize large knowledge graphs.
- Runs on Linux, Windows and macOS systems.
Documentation :bookmark_tabs:
Tutorial :woman_teacher:
Learn quickly with the tutorial in Google Colaboratory!
Getting Started :rocket:
PyPI is the fastest way to install Morph-KGC:
pip install morph-kgc
We recommend installing Morph-KGC in a virtual environment.
To run the engine from the command line, execute the following:
python3 -m morph_kgc config.ini
See the documentation for how to write the configuration INI file. An example INI file is also available.
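As a minimal sketch, the configuration file can also be written from Python with the standard-library `configparser` module. The `build_config` helper is hypothetical, the section name, paths and `db_url` are placeholder values, and the sketch assumes that Morph-KGC reads the `key = value` form that `configparser` writes as well as the `key: value` form; check the documentation for the full list of supported options.

```python
import configparser
import io


def build_config(mappings, db_url, section="DataSource1"):
    """Return INI text with one data source section (hypothetical helper)."""
    # interpolation=None keeps '%' characters in passwords or URLs literal
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {"mappings": mappings, "db_url": db_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Placeholder values; replace them with a real mapping file and database URL
config_text = build_config(
    "/path/to/mapping/mapping_file.rml.ttl",
    "mysql+pymysql://user:password@localhost:3306/db_name",
)

# To use it from the command line, save the text first, for example:
# with open("config.ini", "w", encoding="utf-8") as f:
#     f.write(config_text)
```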
It is also possible to run Morph-KGC as a library with RDFLib and Oxigraph:
import morph_kgc
# generate the triples and load them to an RDFLib graph
g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# work with the RDFLib graph
q_res = g_rdflib.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')
# generate the triples and load them to Oxigraph
g_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')
# work with Oxigraph
q_res = g_oxigraph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')
# the methods above also accept the config as a string
config = """
[DataSource1]
mappings: /path/to/mapping/mapping_file.rml.ttl
db_url: mysql+pymysql://user:password@localhost:3306/db_name
"""
g_rdflib = morph_kgc.materialize(config)
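The snippet above leaves the query result unused. A short sketch of what could follow is shown here, assuming the RDFLib integration: `list_classes` and `materialize_and_save` are hypothetical helper names, and the N-Triples output format is an arbitrary choice. `Graph.query` and `Graph.serialize` are RDFLib methods; `morph_kgc.materialize` is the call shown above.

```python
CLASS_QUERY = 'SELECT DISTINCT ?classes WHERE { ?s a ?classes }'


def list_classes(graph):
    """Return the class IRIs found in an RDFLib graph as sorted strings."""
    # Each result row has one column (?classes); str() yields the plain IRI
    return sorted({str(row[0]) for row in graph.query(CLASS_QUERY)})


def materialize_and_save(config, destination):
    """Materialize the graph, write it as N-Triples and return its classes."""
    # Imported lazily so list_classes can be used without Morph-KGC installed
    import morph_kgc

    graph = morph_kgc.materialize(config)
    graph.serialize(destination=destination, format='nt')
    return list_classes(graph)
```

A call such as `materialize_and_save('/path/to/config.ini', 'knowledge_graph.nt')` would then write the triples to disk and return the list of classes.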
License :unlock:
Morph-KGC is available under the Apache License 2.0.
Author & Contact :mailbox_with_mail:
Ontology Engineering Group, Universidad Politécnica de Madrid.
Citing :speech_balloon:
If you use Morph-KGC in your work, please cite the Semantic Web Journal (SWJ) paper:
@article{arenas2022morph,
  title   = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author  = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal = {Semantic Web},
  year    = {2022},
  doi     = {10.3233/SW-223135}
}
Contributors :woman_technologist:
See the full list of contributors here.
Sponsor :shield: