DRB Metadata Extractor

This package is an application layer built on DRB that extracts metadata from data according to its topic.

Metadata

How to extract metadata?

from drb.factory import DrbFactoryResolver
from drb_metadata import DrbMetadataResolver


if __name__ == '__main__':
    node = DrbFactoryResolver().create('<my_resource_url>')
    metadata = DrbMetadataResolver().get_metadata(node)
    for md_name, md in metadata.items():
        print(md_name, ' -- ', md.extract(node))
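
If an extractor raises, the loop above stops at the first failure. The sketch below is not part of drb_metadata: `collect_metadata` is a hypothetical helper that relies only on the `items()` and `extract(node)` calls shown above, and gathers results and failures into two dictionaries.

```python
def collect_metadata(node, metadata):
    """Evaluate every extractor against node and return (values, errors)."""
    values, errors = {}, {}
    for md_name, md in metadata.items():
        try:
            values[md_name] = md.extract(node)
        except Exception as exc:  # keep going, but remember what failed
            errors[md_name] = exc
    return values, errors
```

With the objects from the snippet above, it would be called as `values, errors = collect_metadata(node, metadata)`.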

How to define metadata?

Metadata are defined in a cortex.yaml file following the template:

drbItemClass: <topic_uuid>           # target topic
variables:                           # variable list
  - name: <var_name>                   # variable name
    <extractor>: <extractor_content>   # an extractor
metadata:                            # metadata list
  - name: my_metadata                  # metadata name
    <extractor>: <extractor_content>   # an extractor

Metadata apply to their target topic and to its derivatives.
Inherited metadata is overridden when a derivative topic redefines it.
Variables are not inherited: a variable defined for a topic is not visible to its derivatives.
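
These three rules can be modelled in a few lines of plain Python. This is only an illustrative model, not the library's implementation: `TopicDef`, `effective_metadata` and `visible_variables` are hypothetical names, and extractors are reduced to simple strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TopicDef:
    """Hypothetical stand-in for one cortex.yaml definition."""
    name: str
    parent: Optional["TopicDef"] = None
    variables: Dict[str, str] = field(default_factory=dict)
    metadata: Dict[str, str] = field(default_factory=dict)


def effective_metadata(topic: TopicDef) -> Dict[str, str]:
    """Merge metadata from the root ancestor down to the given topic."""
    chain = []
    current = topic
    while current is not None:
        chain.append(current)
        current = current.parent
    merged: Dict[str, str] = {}
    for definition in reversed(chain):  # ancestors first, so derivatives win
        merged.update(definition.metadata)
    return merged


def visible_variables(topic: TopicDef) -> Dict[str, str]:
    """Variables are not inherited: only the topic's own ones are visible."""
    return dict(topic.variables)


base = TopicDef("base", variables={"v": "x"},
                metadata={"platformName": "generic", "kind": "product"})
derived = TopicDef("derived", parent=base,
                   metadata={"platformName": "Sentinel-1"})
print(effective_metadata(derived))
print(visible_variables(derived))
```

In this model the derived topic keeps the inherited `kind`, replaces `platformName`, and sees none of the base topic's variables.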

Extractor

An extractor, as its name suggests, extracts information or data from a node. An extractor is defined by YAML content. Three extractor types currently exist:

Constant

This extractor reads nothing from the node and always returns the same value.

constant: 42

Some string values are automatically converted to a specific Python type:

Value	Python type
2022-01-01	datetime.date
2022-01-01T00:00:00.000Z	datetime.datetime
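
As a rough illustration of the conversion in this table, the sketch below maps the two string shapes to `datetime.date` and `datetime.datetime` using only the standard library. It is an assumption about the behaviour, not the package's actual parsing code; `convert_constant` is a hypothetical name, and the returned datetime is naive (no time zone attached).

```python
from datetime import date, datetime


def convert_constant(value):
    """Return a date or datetime for the two string shapes in the table,
    and any other value unchanged."""
    if not isinstance(value, str):
        return value
    try:
        # e.g. 2022-01-01T00:00:00.000Z
        return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        pass
    try:
        # e.g. 2022-01-01
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        return value  # plain string, left as-is
```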

XQuery

This extractor extracts data from the node via an XQuery script. See the XQuery documentation for more details.

xquery: |
  data(./manifest.safe/XFDU/metadataSection/
  metadataObject[@ID="generalProductInformation"]/metadataWrap/xmlData/
    *[matches(name(),"standAloneProductInformation|generalProductInformation")]/
    noiseCompressionType)

Python

The Python extractor extracts data from a node via a Python script, in which the node variable represents the current node.

python: |
  return (
      node['DATASTRIP'][0]['MTD_DS.xml']['Level-1C_DataStrip_ID']
      ['General_Info']['Datatake_Info'].get_attribute('datatakeIdentifier')
  )
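
The script uses `return` and a `node` name without defining a function, which suggests it is evaluated as a function body. One plausible way to do that is sketched below; this is an assumption for illustration, not the library's code. `run_python_extractor` is a hypothetical name, and `exec` should only ever be given trusted scripts.

```python
import textwrap


def run_python_extractor(script, node, **variables):
    """Wrap the script in a function whose parameters are `node` plus any
    named variables, then call it and return its result."""
    params = ", ".join(["node", *variables])
    source = f"def _extract({params}):\n" + textwrap.indent(script, "    ")
    namespace = {}
    exec(source, namespace)  # trusted input only
    return namespace["_extract"](node, **variables)


# A dict stands in for a real DRB node here.
print(run_python_extractor("return node['a'] + 1", {"a": 1}))
```

Passing extra keyword arguments makes them available to the script as plain names, in the same way the `node` variable is.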

Example:

drbItemClass: aff2191f-5b06-4121-a9fa-f3d93f6c6331
variables:
  - name: node_platform
    xquery: |
      ./manifest.safe/XFDU/metadataSection/metadataObject[@ID="platform"]/
        metadataWrap/xmlData/platform
metadata:
  - name: 'platformName'
    constant: 'Sentinel-1'
  - name: 'SatelliteNumber'
    xquery: |
      declare variable $node_platform external;
      data($node_platform/number)
  - name: 'platformIdentifier'
    python: |
      return node_platform['nssdcIdentifier'].value
  - name: 'resolutionDetail'
    python: |
      resolution = node.name[10:11]
      if resolution == 'F':
        return 'Full'
      elif resolution == 'H':
        return 'High'
      elif resolution == 'M':
        return 'Medium'
      return None
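
The `resolutionDetail` script can be tried outside the extractor as an ordinary function. The sketch below replaces the `if`/`elif` chain with a dictionary lookup; `resolution_detail` and `RESOLUTION_LABELS` are hypothetical names, and the product names in the usage lines are made-up prefixes used only to show which character is read.

```python
RESOLUTION_LABELS = {"F": "Full", "H": "High", "M": "Medium"}


def resolution_detail(product_name):
    """Map the character at index 10 of the name to a label, or None."""
    # Slicing with [10:11] yields '' for short names instead of raising.
    return RESOLUTION_LABELS.get(product_name[10:11])


print(resolution_detail("S1A_IW_GRDH_1SDV"))
print(resolution_detail("S1A_IW_SLC__1SDV"))
```

Using the slice `[10:11]` rather than the index `[10]` means a name shorter than eleven characters gives `None` rather than an `IndexError`.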

Packaging

The Python package containing the metadata of a DRB topic must provide the following:

a drb.metadata entry point whose value is the targeted Python package containing the cortex.yaml file
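
As a sketch of both sides of this requirement, the block below shows a mapping that could be passed as `entry_points=` to `setuptools.setup()`, and a standard-library helper that lists the entry points installed in the `drb.metadata` group. The names `my_topic`, `my_package` and `list_metadata_packages` are placeholders, and the helper is not necessarily how drb_metadata itself discovers packages.

```python
from importlib.metadata import entry_points

# Declaration side: mapping to pass as `entry_points=` to setuptools.setup().
# "my_topic" and "my_package" are placeholder names.
ENTRY_POINTS = {"drb.metadata": ["my_topic = my_package"]}


def list_metadata_packages(group="drb.metadata"):
    """Return {entry point name: target package} for installed distributions."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10 and later
        selected = eps.select(group=group)
    else:                       # Python 3.8 and 3.9 return a dict of groups
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}
```

Calling `list_metadata_packages()` after installing a metadata package is a quick way to check that its entry point was registered.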

Release history

Version	Release date
1.3.2	Oct 9, 2023
1.3.1	Sep 29, 2023
1.3.0	Sep 28, 2023
1.2.1	Jun 30, 2023
1.2.0	May 16, 2023
1.1.2	Mar 9, 2023
1.1.1	Feb 2, 2023
1.1.0	Jan 3, 2023
1.0.1 (this version)	Sep 19, 2022
1.0.0	Jun 13, 2022

Download files

File	Type	Size	Uploaded
drb-metadata-1.0.1.tar.gz	Source distribution	24.9 kB	Sep 19, 2022
drb_metadata-1.0.1-py3-none-any.whl	Built distribution (Python 3)	7.6 kB	Sep 19, 2022

Hashes for drb-metadata-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`3e59e4362274a7cbdf73d85bf7a88ad9fb6df3d6caa06cb51740b4d5e83eca5a`
MD5	`6f7c0da27ac88acf1d08ff583b576af5`
BLAKE2b-256	`c2deef11f2d6d07e125158bc5bf857d44f4f367bbaf042f2231262f5364357aa`

Hashes for drb_metadata-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a9075522e6f74e0f9ffc6fcf54fa11eb54766b1e2559d101fbbdfc536ca3e59`
MD5	`d634ae25337709a963d36c3c1bb53fd6`
BLAKE2b-256	`6f296d74d179784357e381c7e33647661474854e51a365d9b98428044dd89031`