Toolbox for creating and assessing EMSO-compliant NetCDF datasets and integrating them into ERDDAP services

Metadata Harmonizer Toolbox

This repository contains a set of tools for creating NetCDF files, integrating them into an ERDDAP server, and checking their compliance with the EMSO Metadata Specifications. The tools provided are:

  • emh.generate_dataset(): creates EMSO-compliant NetCDF files from .csv and .yaml files
  • emh.erddap_config(): integrates NetCDF files into an ERDDAP server
  • emh.metadata_report(): checks the compliance of a dataset with the specifications

In order to create and publish an EMSO-compliant dataset, the typical workflow is:

  1. Prepare CSV data and YAML metadata
  2. Generate EMSO-compliant NetCDF files using generate_dataset()
  3. Integrate datasets into your ERDDAP deployment using erddap_config()
  4. Validate metadata and operational compliance using metadata_report()
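
Steps 2 to 4 above can be chained in a small helper. This is a minimal sketch, not part of the toolbox: publish_dataset and its emh parameter are hypothetical names, and the three calls use only the positional arguments shown in the examples of this README.

```python
def publish_dataset(csv_files, yaml_files, nc_path, dataset_id, data_dir, emh=None):
    """Run steps 2-4 of the workflow in order (hypothetical helper)."""
    if emh is None:
        # Imported lazily so the helper can also be driven by a stand-in object.
        import emso_metadata_harmonizer as emh

    # Step 2: build the NetCDF file from CSV data and YAML metadata.
    emh.generate_dataset(csv_files, yaml_files, nc_path)
    # Step 3: prepare the ERDDAP dataset definition for the new file.
    emh.erddap_config(nc_path, dataset_id, data_dir)
    # Step 4: assess compliance with the EMSO Metadata Specifications.
    emh.metadata_report(nc_path)
    return nc_path
```

A call such as publish_dataset(["data.csv"], ["meta.yaml"], "dataset.nc", "MyDatasetIdentifier", "/path/to/dataset/files") would then run the whole sequence.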

Installation

To install the package from PyPI:

pip3 install emso_metadata_harmonizer
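
To confirm that the installation worked, the installed version can be read with the standard library. This is a small sketch; installed_version is a hypothetical helper name, and it assumes the distribution is registered under the same name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="emso_metadata_harmonizer"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string when the package is installed, otherwise None.
print(installed_version())
```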

🛠 NetCDF Generator

To generate a NetCDF dataset from data (csv) and metadata (yaml) files:

import emso_metadata_harmonizer as emh

emh.generate_dataset(["data.csv"], ["meta.yaml"], output="dataset.nc")

Full example using the data and metadata from example 2 of the metadata-harmonizer repository:

import emso_metadata_harmonizer as emh
import urllib.request

# Download data and metadata from example 2 in the metadata-harmonizer repository
data_url = "https://raw.githubusercontent.com/emso-eric/metadata-harmonizer/refs/heads/develop/examples/02/SBE16.csv"
meta_url = "https://raw.githubusercontent.com/emso-eric/metadata-harmonizer/refs/heads/develop/examples/02/meta.yaml"
urllib.request.urlretrieve(data_url, "data.csv")
urllib.request.urlretrieve(meta_url, "meta.yaml")

# Generate dataset from one data file
emh.generate_dataset(["data.csv"], ["meta.yaml"], "dataset.nc")
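
When the example is run repeatedly, downloading the same files each time is unnecessary. The sketch below wraps urllib.request.urlretrieve in a hypothetical helper, fetch_if_missing, that skips the download when the destination file already exists. It does not check that an existing file is complete or up to date.

```python
import urllib.request
from pathlib import Path


def fetch_if_missing(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    return dest
```

With this helper, fetch_if_missing(data_url, "data.csv") and fetch_if_missing(meta_url, "meta.yaml") can replace the two urlretrieve calls above.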

To generate a dataset from multiple data files:

import emso_metadata_harmonizer as emh
import urllib.request

# Generate dataset from multiple data files
data1_url = "https://raw.githubusercontent.com/emso-eric/metadata-harmonizer/refs/heads/develop/examples/02/SBE16.csv"
data2_url = "https://raw.githubusercontent.com/emso-eric/metadata-harmonizer/refs/heads/develop/examples/02/SBE37.csv"
meta_url = "https://raw.githubusercontent.com/emso-eric/metadata-harmonizer/refs/heads/develop/examples/02/meta.yaml"
urllib.request.urlretrieve(data1_url, "data1.csv")
urllib.request.urlretrieve(data2_url, "data2.csv")
urllib.request.urlretrieve(meta_url, "meta.yaml")

emh.generate_dataset(["data1.csv", "data2.csv"], ["meta.yaml"], "dataset2.nc")
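
With many data files, the list can be built from a directory instead of being typed by hand. This is a minimal sketch; collect_csv is a hypothetical helper, and sorting by file name is an assumption made only to keep the order reproducible.

```python
from pathlib import Path


def collect_csv(directory):
    """Return the .csv files in directory as a sorted list of path strings."""
    return sorted(str(p) for p in Path(directory).glob("*.csv"))
```

The result can be passed as the first argument, for example emh.generate_dataset(collect_csv("data"), ["meta.yaml"], "dataset.nc"), where "data" is a placeholder directory name.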

⚙️ ERDDAP Configurator

The ERDDAP Configurator (erddap_config()) prepares ERDDAP dataset definitions for NetCDF files, reducing the manual editing of ERDDAP’s XML configuration. It reads the NetCDF metadata and generates the XML chunk required to register a new dataset.

import emso_metadata_harmonizer as emh

emh.erddap_config("dataset.nc", "MyDatasetIdentifier", "/path/to/dataset/files")

To append the new dataset to an existing ERDDAP deployment automatically, pass the path to its datasets.xml file via the datasets_xml_file parameter:

import emso_metadata_harmonizer as emh

emh.erddap_config("dataset.nc", "MyDatasetIdentifier", "/path/to/dataset/files", datasets_xml_file="path/to/datasets.xml")
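
Because this call modifies an existing datasets.xml, it may be worth keeping a copy of the file first. The sketch below is a plain standard-library backup, not a feature of the toolbox; backup_file and the .bak suffix are hypothetical choices.

```python
import shutil
from pathlib import Path


def backup_file(path):
    """Copy path to a sibling '<name>.bak' file and return the backup path."""
    src = Path(path)
    backup = src.with_name(src.name + ".bak")
    shutil.copy2(src, backup)  # copy2 also preserves timestamps where possible
    return backup
```

Calling backup_file("path/to/datasets.xml") before erddap_config() leaves a datasets.xml.bak next to the original that can be copied back if the result is not as expected.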

📈 Metadata Report

The metadata reporting tool assesses the level of compliance of an ERDDAP or NetCDF dataset with the EMSO Metadata Specifications. To test a dataset, use the following syntax:

import emso_metadata_harmonizer as emh
emh.metadata_report("dataset.nc")

Logging

To control the verbosity of the logging messages:

import logging
logging.getLogger("emso_metadata_harmonizer").setLevel(logging.WARN)

Here WARN is the minimum level of the messages that are shown (logging.WARN is an alias of logging.WARNING). See the Python logging documentation for the other levels.
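
The effect of the level can be seen with a short standard-library demonstration. In this sketch the two messages are emitted by the example itself on the emso_metadata_harmonizer logger, not by the toolbox, and ListHandler is a hypothetical helper that stores messages in a list.

```python
import logging


class ListHandler(logging.Handler):
    """Collect the text of each handled log record in a list."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("emso_metadata_harmonizer")
handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.WARNING)

logger.info("hidden: below the configured level")
logger.warning("shown: at or above the configured level")

print(handler.messages)  # ['shown: at or above the configured level']
```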

Contact info

  • author: Enoc Martínez
  • version: v1.0.4
  • organization: Universitat Politècnica de Catalunya (UPC)
  • contact: enoc.martinez@upc.edu
