Powerful [R2]RML engine to create RDF knowledge graphs from heterogeneous data sources.
Morph-KGC is an engine that constructs RDF knowledge graphs from heterogeneous data sources with the R2RML and RML mapping languages. Morph-KGC is built on top of pandas and it leverages mapping partitions to significantly reduce execution times and memory consumption for large data sources.
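To give an intuition for why partitioning helps, here is a toy sketch. It is not Morph-KGC's implementation: the rule structure, the function names, and the choice of partitioning by a constant predicate are all assumptions made for illustration. The idea is that rules in different partitions cannot produce the same triple, so duplicates only need to be tracked within one partition at a time.

```python
from collections import defaultdict

# Hypothetical, simplified mapping rules: each rule emits triples
# with one fixed predicate from a list of (subject, object) rows.
rules = [
    {"predicate": "ex:name", "rows": [("ex:p1", "Ana"), ("ex:p1", "Ana"), ("ex:p2", "Luis")]},
    {"predicate": "ex:age", "rows": [("ex:p1", "30"), ("ex:p2", "41")]},
    {"predicate": "ex:name", "rows": [("ex:p2", "Luis"), ("ex:p3", "Eva")]},
]


def partition_rules(rules):
    """Group rules by predicate; different groups yield disjoint triples."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule["predicate"]].append(rule)
    return groups


def materialize_partitioned(rules):
    """Yield unique triples, processing one partition at a time."""
    for predicate, group in partition_rules(rules).items():
        seen = set()  # holds only the triples of the current partition
        for rule in group:
            for subject, obj in rule["rows"]:
                triple = (subject, predicate, obj)
                if triple not in seen:
                    seen.add(triple)
                    yield triple
        # `seen` is dropped here, so the duplicate-tracking set never
        # grows beyond the size of the largest partition


triples = list(materialize_partitioned(rules))
print(len(triples))
```

Because each partition is independent, the duplicate-tracking set can be released as soon as a partition is finished, which is where the memory saving in this toy version comes from.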
Features :sparkles:
- Supports the R2RML and RML mapping languages.
- User-friendly mappings with YARRRML.
- Transformation functions with RML-FNML, including Python user-defined functions.
- RDF-star generation with RML-star.
- RML views over tabular data sources and JSON files.
- Integration with RDFLib, Oxigraph and Kafka.
- Optimized to materialize large knowledge graphs.
- Remote data and mapping files.
- Input data formats:
  - Relational databases: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, SQLite.
  - Tabular files: CSV, TSV, Excel, Parquet, Feather, ORC, Stata, SAS, SPSS, ODS.
  - Hierarchical files: JSON, XML.
  - In-memory data structures: Python Dictionaries, DataFrames.
  - Cloud data lake solutions: Databricks.
  - Property graph databases: Neo4j, Kùzu.
Documentation :bookmark_tabs:
Tutorial :woman_teacher:
Learn quickly with the tutorial in Google Colaboratory!
Getting Started :rocket:
PyPI is the fastest way to install Morph-KGC:
pip install morph-kgc
We recommend installing Morph-KGC in a virtual environment.
To run the engine from the command line, execute the following:
python3 -m morph_kgc config.ini
Check the documentation to see how to generate the configuration INI file. An example INI file is also available.
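As a minimal sketch, the configuration can also be produced programmatically with Python's standard `configparser` module. The section name and keys (`DataSource1`, `mappings`, `db_url`) are the ones used in this README; the helper name `build_config` is hypothetical, and the sketch assumes Morph-KGC reads INI files following `configparser` conventions, where both `=` and `:` are valid key-value delimiters.

```python
import configparser
import io


def build_config(mappings: str, db_url: str, section: str = "DataSource1") -> str:
    """Return INI text containing a single data source section."""
    # interpolation=None writes values verbatim, so characters such as '%'
    # in a URL-encoded password are not treated specially when writing
    parser = configparser.ConfigParser(interpolation=None)
    parser[section] = {"mappings": mappings, "db_url": db_url}

    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


config_text = build_config(
    mappings="/path/to/mapping/mapping_file.rml.ttl",
    db_url="mysql+pymysql://user:password@localhost:3306/db_name",
)
print(config_text)
```

The resulting text can be saved as `config.ini` for the command-line call, or passed directly to `morph_kgc.materialize`, which also accepts the config as a string.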
It is also possible to run Morph-KGC as a library with RDFLib and Oxigraph:
import morph_kgc
# generate the triples and load them to an RDFLib graph
g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# work with the RDFLib graph
q_res = g_rdflib.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')
# generate the triples and load them to Oxigraph
g_oxigraph = morph_kgc.materialize_oxigraph('/path/to/config.ini')
# work with Oxigraph
q_res = g_oxigraph.query('SELECT DISTINCT ?classes WHERE { ?s a ?classes }')
# the methods above also accept the config as a string
config = """
[DataSource1]
mappings: /path/to/mapping/mapping_file.rml.ttl
db_url: mysql+pymysql://user:password@localhost:3306/db_name
"""
g_rdflib = morph_kgc.materialize(config)
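The query results above can be consumed row by row. The following is a small sketch of a helper (the name `distinct_classes` is hypothetical) that collects the class IRIs returned by the query used in this README. It assumes the graph's `query` method returns an iterable of rows that can be indexed by position, as RDFLib result rows can.

```python
def distinct_classes(graph):
    """Return the sorted class IRIs found in a graph, as strings."""
    query = "SELECT DISTINCT ?classes WHERE { ?s a ?classes }"
    # each row holds one binding per projected variable; ?classes is the first
    return sorted(str(row[0]) for row in graph.query(query))


# Usage with the RDFLib graph from the example above:
# g_rdflib = morph_kgc.materialize('/path/to/config.ini')
# for class_iri in distinct_classes(g_rdflib):
#     print(class_iri)
```

Converting each term with `str` keeps the helper independent of RDFLib's term classes, at the cost of losing the distinction between IRIs and other term types.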
License :unlock:
Morph-KGC is available under the Apache License 2.0.
Author & Contact :mailbox_with_mail:
Ontology Engineering Group, Universidad Politécnica de Madrid.
Citing :speech_balloon:
If you use Morph-KGC in your work, please cite the Semantic Web journal (SWJ) paper:
@article{arenas2024morph,
  title     = {{Morph-KGC: Scalable knowledge graph materialization with mapping partitions}},
  author    = {Arenas-Guerrero, Julián and Chaves-Fraga, David and Toledo, Jhon and Pérez, María S. and Corcho, Oscar},
  journal   = {Semantic Web},
  publisher = {IOS Press},
  issn      = {2210-4968},
  year      = {2024},
  doi       = {10.3233/SW-223135},
  volume    = {15},
  number    = {1},
  pages     = {1-20}
}
Sponsor :shield: