metadata creation for geospatial data

Introduction

GeoMetaMaker is a Python library for creating human- and machine-readable metadata for geospatial, tabular, and other data formats.

Supported datatypes include:

  • everything supported by GDAL
  • tabular formats supported by frictionless
  • compressed formats supported by frictionless

Installation

mamba install -c conda-forge geometamaker

Basic Usage

This library comes with a command-line interface (CLI) called geometamaker. Many of the examples below show how to use the Python interface, and then how to do the same thing, if possible, using the CLI.

Creating and adding metadata to a file:

Python
import geometamaker

# For a vector:
data_path = 'data/watershed_gura.shp'
vector_resource = geometamaker.describe(data_path)

vector_resource.set_title('My Dataset')
vector_resource.set_description('all about my dataset')
vector_resource.set_keywords(['hydrology', 'watersheds'])

vector_resource.set_field_description(
    'field_name',  # the name of an actual field in the vector's table
    description='something about the field',
    units='mm')
vector_resource.write()

# For a raster:
data_path = 'data/dem.tif'
raster_resource = geometamaker.describe(data_path)
raster_resource.set_band_description(
    1,  # a raster band index, starting at 1
    description='something about the band',
    units='mm')
raster_resource.write()

# For a CSV:
data_path = 'data/table.csv'
table_resource = geometamaker.describe(data_path)
table_resource.set_field_description(
    'field_name',  # the name of an actual field in the table
    description='something about the field',
    units='mm')
# A table does not have inherent spatial information, but the
# property may be set manually:
table_resource.set_spatial(raster_resource.spatial)
table_resource.write()

For a complete list of methods and attributes: https://geometamaker.readthedocs.io/en/latest/index.html

CLI
geometamaker describe data/watershed_gura.shp

The CLI does not provide options for setting metadata properties that require user input, such as keywords or field and band descriptions. If you create a metadata document with the CLI, you may wish to add these values manually by editing the watershed_gura.shp.yml file in a text editor.
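In the example above, the metadata document's name is the dataset's filename with .yml appended. The following is a minimal sketch for locating that document from a script, assuming the same naming pattern holds for other datasets. The helper sidecar_path is hypothetical and is not part of geometamaker.

```python
from pathlib import Path


def sidecar_path(data_path):
    """Return the expected metadata document path for a dataset.

    Assumption: the document sits next to the dataset and is named
    '<dataset filename>.yml', as in the watershed_gura.shp example.
    This helper is illustrative only; geometamaker does not provide it.
    """
    path = Path(data_path)
    # Append '.yml' to the full filename rather than replacing '.shp'.
    return path.with_name(path.name + '.yml')


document = sidecar_path('data/watershed_gura.shp')
print(document.as_posix())
print('exists' if document.exists() else 'not created yet')
```

A path built this way can be opened in a text editor, or passed to the validation functions described later in this document.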

Creating metadata for a collection of files:

Users can create a single metadata document to describe a directory of files. Files can be excluded with a regular expression, and the number of subdirectory levels to traverse can be limited with the depth argument (the -d flag on the CLI).

Python

import geometamaker

collection_path = 'data/invest-sample-data'
metadata = geometamaker.describe_collection(collection_path,
                                            depth=2,
                                            exclude_regex=r'.*\.json$',
                                            describe_files=True)
metadata.write()

CLI

geometamaker describe -d 2 --exclude '.*\.json$' data/invest-sample-data

These examples will create data/invest-sample-data/invest-sample-data-metadata.yml as well as create individual .yml documents for each dataset within the directory.
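To preview what an exclusion pattern will filter out before describing a collection, the pattern can be tried against a list of names with Python's re module. This is a sketch only: the filenames are made up, and whether geometamaker applies the pattern to a filename or a full path, or with match or search, is not stated here. For this particular pattern the result is the same in each case.

```python
import re

# The same pattern used for exclude_regex / --exclude above.
exclude_regex = r'.*\.json$'

# Hypothetical filenames, for illustration only.
candidates = [
    'watershed_gura.shp',
    'dem.tif',
    'table.csv',
    'datastack.json',
    'notes.json.bak',
]


def is_excluded(path, pattern=exclude_regex):
    """Return True if the pattern matches the given path."""
    return re.search(pattern, path) is not None


# Only names ending in '.json' are dropped; 'notes.json.bak' is kept
# because the pattern is anchored to the end of the string with '$'.
kept = [path for path in candidates if not is_excluded(path)]
print(kept)
```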

Overriding the default filename of the collection's YML document:

geometamaker.describe_collection(collection_path, target_filename='README.yml')

or

geometamaker describe data/invest-sample-data -o README.yml

These examples will create data/invest-sample-data/README.yml.

Validating a metadata document:

If you have manually edited a .yml metadata document, it is a good idea to validate it for correct syntax, properties, and types.

Python
import geometamaker

document_path = 'data/watershed_gura.shp.yml'
error = geometamaker.validate(document_path)
print(error)
CLI
geometamaker validate data/watershed_gura.shp.yml

Validating all metadata documents in a directory:

Python
import geometamaker

directory_path = 'data/'
yaml_files, messages = geometamaker.validate_dir(directory_path)
for filepath, msg in zip(yaml_files, messages):
    print(f'{filepath}: {msg}')
CLI
geometamaker validate data

Configuring default values for metadata properties:

Users can create a "profile" that will apply some common properties to all datasets they describe. Profiles can include contact information and/or license information.

A profile can be saved to a configuration file so that it will be reused every time you use geometamaker.

Python
import geometamaker
from geometamaker import models

contact = {
    'individual_name': 'bob'
}
license = {
    'title': 'CC-BY-4'
}

# Two different ways for setting profile attributes:
profile = models.Profile(contact=contact)  # keyword arguments
profile.set_license(**license)             # `set_*` methods

config = geometamaker.Config()
config.save(profile)

# The saved profile will automatically be applied during `describe`:
resource = geometamaker.describe('data/watershed_gura.shp')
CLI
geometamaker config

This will prompt the user to enter their profile information.
Also see geometamaker config --help.
