FAIRMeta

FAIRMeta is a Python package that aims to automatically publish FAIR-compliant metadata to a FAIR Data Point (FDP) by combining static configuration files with metadata harvested from external platforms. It was created as a pilot during an internship at Radboudumc and therefore focuses on the Radboudumc FDP and the metadata schema it follows, which is created by Health-RI.


Install

pip install fairmeta

Architecture

---
config:
  theme: neutral
---
flowchart LR
 subgraph s2["Gatherers"]
        n1(["Grand Challenge"])
        n2(["Other Platforms"])
  end
 subgraph s1["FAIRMeta"]
        s2
        n3["Metadata Schema"]
        n4["Transformation"]
        n5(["Configuration"])
  end
    n3 --> n4
    n5 --> n3
    n4 -- SeMPyRO --> FDP["FDP"]
    s2 --> n3

FAIRMeta uses gatherers to collect metadata from a platform. The gathered metadata is combined with a configuration that contains static values and the mapping from the platform's fields to the metadata schema. The result is transformed into Health-RI-compliant metadata and published to the FDP using SeMPyRO, which performs additional validation and the RDF transformation.
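
The combination step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FAIRMeta's actual implementation: the function name combine, the config keys "static" and "mapping", and all field names are hypothetical.

```python
# Hypothetical sketch: the config layout ("static", "mapping") and all field
# names are assumptions for illustration, not FAIRMeta's real format.

def combine(config: dict, harvested: dict) -> dict:
    """Build one schema-shaped record from static values and harvested data."""
    record = dict(config.get("static", {}))  # start from the static values
    for schema_field, platform_field in config.get("mapping", {}).items():
        if platform_field in harvested:
            # Copy the harvested value under the schema's field name.
            record[schema_field] = harvested[platform_field]
    return record


config = {
    "static": {"publisher": "Example Hospital"},
    "mapping": {"title": "name", "description": "summary"},
}
harvested = {"name": "Example dataset", "summary": "CT scans", "internal_id": 7}

record = combine(config, harvested)
print(record)
```

Harvested fields without a mapping entry (internal_id in this example) are dropped, so only fields that the schema knows about reach the later transformation and validation steps.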

Simple Demo

A single CLI command is enough, where platform is a key from the config file:

fairmeta config.yaml platform -s identifier [--test] [-v]

The same command works for multiple datasets from a single platform by repeating the -s option:

fairmeta config.yaml platform -s identifier1 -s identifier2
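
For readers wondering how a repeated -s option can be collected, the following is a minimal sketch using the standard library's argparse. It is not FAIRMeta's actual CLI code: the destination names and help texts are assumptions, and the real tool may be built with a different library.

```python
import argparse


# Hypothetical parser: destination names and help texts are assumptions;
# only the flags themselves appear in the commands above.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fairmeta")
    parser.add_argument("config", help="path to the YAML configuration file")
    parser.add_argument("platform", help="key of the platform in the config file")
    # action="append" collects every -s occurrence into one list.
    parser.add_argument("-s", dest="identifiers", action="append", required=True)
    parser.add_argument("--test", action="store_true")
    parser.add_argument("-v", dest="verbose", action="store_true")
    return parser


args = build_parser().parse_args(
    ["config.yaml", "platform", "-s", "identifier1", "-s", "identifier2"]
)
print(args.identifiers)
```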

Extra metadata can be supplied in additional YAML files, which are combined with the API data via a shared identifier:

fairmeta config.yaml platform -s identifier -c extra_data.yaml
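
Combining data via a shared identifier could look roughly like the sketch below. It is an illustration only: both inputs are plain dicts standing in for the parsed extra YAML file and the harvested API data, the function name merge_by_identifier is hypothetical, and the rule that extra values win on conflict is an assumption.

```python
# Hypothetical sketch: both inputs are plain dicts keyed by dataset identifier,
# standing in for the parsed extra YAML file and the harvested API data.

def merge_by_identifier(api_data: dict, extra_data: dict) -> dict:
    """Return per-identifier records where extra values are added to API values."""
    merged = {}
    for identifier, api_record in api_data.items():
        # Extra values win on conflict here; that precedence is an assumption.
        merged[identifier] = {**api_record, **extra_data.get(identifier, {})}
    return merged


api_data = {"identifier": {"title": "Example dataset"}}
extra_data = {"identifier": {"license": "CC-BY-4.0"}, "unused": {"license": "MIT"}}

merged = merge_by_identifier(api_data, extra_data)
print(merged)
```

Entries in the extra data whose identifier was not requested ("unused" in this example) are ignored.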

The functions can also be used in custom Python scripts. Many of them are static methods (@staticmethod) because the classes in the metadata schema do not use inheritance, since their properties differ between classes.

# Gather metadata
from fairmeta.gatherers import GrandChallenge

gc_client = GrandChallenge()
slugs = ["identifier"]
# Assumption: gather_data returns the harvested metadata that is used below.
api_data = gc_client.gather_data(slugs)

# Create metadata
from fairmeta.metadata_model import MetadataRecord

# `config` is the configuration loaded from config.yaml (loading not shown).
metadata_record = MetadataRecord.from_sources(
    config=config,
    api_data=api_data,
)

# Transform & validate metadata
MetadataRecord.transform_values(metadata_record)
metadata_record.validate()

# Publish metadata to FDP
from fairmeta.uploader_radboudfdp import RadboudFDPClient

fdp_client = RadboudFDPClient()
catalog_name = "doesn't matter"
fdp_client.create_and_publish(
    metadata_record,
    catalog_name,
)

Limitations

  • Not all properties in the Health-RI metadata schema are implemented.
  • Combining metadata from multiple platforms into a single dataset is not supported.
  • The metadata schema is defined manually, making it difficult to switch schemas or update the current one automatically.
  • Gathering metadata from Grand Challenge is unreliable because archives are not necessarily linked to challenges in a standard way. Currently, the gatherer attempts to access the archive associated with a challenge via the same identifier in lowercase.
  • The official FAIR Data Point client is not used due to compatibility issues with the testing environment at the Radboudumc FDP. If this issue is resolved, it would make sense to replace the RadboudFDPClient with the FAIR Data Point client.
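
The lowercase lookup mentioned in the Grand Challenge limitation can be illustrated with a small sketch. It is not the gatherer's real code: the function name guess_archive_slug is hypothetical, the slugs are made up, and known_archives stands in for whatever the Grand Challenge API exposes.

```python
from typing import Optional, Set


# Hypothetical sketch: `known_archives` stands in for the platform's archives;
# the real gatherer talks to the Grand Challenge API instead.
def guess_archive_slug(challenge_slug: str, known_archives: Set[str]) -> Optional[str]:
    """Guess the archive for a challenge by lowercasing the challenge identifier."""
    candidate = challenge_slug.lower()
    return candidate if candidate in known_archives else None


# Made-up slugs for illustration only.
known_archives = {"demo-challenge", "other-archive"}

# Found: the archive shares the challenge identifier, apart from case.
print(guess_archive_slug("Demo-Challenge", known_archives))
# Not found: the archive for this challenge has a different name.
print(guess_archive_slug("Other-Challenge", known_archives))
```

The second call shows why the heuristic is fragile: an archive that exists under a different name is never found.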

Development

Local testing requires setting up a FAIR Data Point locally.

Download files

Source Distribution

fairmeta-0.0.4.tar.gz (28.1 kB view details)

Uploaded Source

Built Distribution

fairmeta-0.0.4-py3-none-any.whl (21.2 kB view details)

Uploaded Python 3

File details

Details for the file fairmeta-0.0.4.tar.gz.

File metadata

  • Download URL: fairmeta-0.0.4.tar.gz
  • Upload date:
  • Size: 28.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for fairmeta-0.0.4.tar.gz
Algorithm Hash digest
SHA256 a99fb7ef5ca01f739f922ccc4b9aee57bf82b3068308342e58c1fee7bf04a6bf
MD5 fd1763d0f957debb03aa9239665a4277
BLAKE2b-256 d6581ba2a60c7f60a21f9b9b547e4ac8522339b8cc4d13e098e198d9a86e6c8b

File details

Details for the file fairmeta-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: fairmeta-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 21.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for fairmeta-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 72af32bdc980b9989e8334f419234172b4773638eed15e77dde91675d985cfe0
MD5 3e749ee26981850997e5dfadca570f5b
BLAKE2b-256 21b899615531eda3f3489363298c7034d0af618749493ef98a113445159fe91d
