metadata creation for geospatial data

Introduction

GeoMetaMaker is a Python library for creating human- and machine-readable metadata for geospatial, tabular, and other data formats.

Supported datatypes include:

  • everything supported by GDAL
  • tabular formats supported by frictionless
  • compressed formats supported by frictionless

Installation

mamba install -c conda-forge geometamaker

Basic Usage

This library comes with a command-line interface (CLI) called geometamaker. Most of the examples below show the Python interface first, followed by the equivalent CLI command where one exists.

Creating and adding metadata to a file:

Python
import geometamaker

# For a vector:
data_path = 'data/watershed_gura.shp'
vector_resource = geometamaker.describe(data_path)

vector_resource.set_title('My Dataset')
vector_resource.set_description('all about my dataset')
vector_resource.set_keywords(['hydrology', 'watersheds'])

vector_resource.set_field_description(
    'field_name',  # the name of an actual field in the vector's table
    description='something about the field',
    units='mm')
vector_resource.write()

# For a raster:
data_path = 'data/dem.tif'
raster_resource = geometamaker.describe(data_path)
raster_resource.set_band_description(
    1,  # a raster band index, starting at 1
    description='something about the band',
    units='mm')
raster_resource.write()

# For a CSV:
data_path = 'data/table.csv'
table_resource = geometamaker.describe(data_path)
table_resource.set_field_description(
    'field_name',  # the name of an actual field in the table
    description='something about the field',
    units='mm')
# A table does not have inherent spatial information, but the
# property may be set manually:
table_resource.set_spatial(raster_resource.spatial)
table_resource.write()

For a complete list of methods and attributes: https://geometamaker.readthedocs.io/en/latest/index.html

CLI
geometamaker describe data/watershed_gura.shp

The CLI does not provide options for setting metadata properties that require user input, such as keywords or field and band descriptions. If you create a metadata document with the CLI, you can add these values manually by editing the watershed_gura.shp.yml file in a text editor.
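The naming seen here, where the document for watershed_gura.shp is watershed_gura.shp.yml, can be sketched with a small helper. `sidecar_path` is a hypothetical name used only for illustration and is not part of geometamaker. The sketch assumes the document is always written next to the data file and that `.yml` is appended to the full filename, as in the example above.

```python
def sidecar_path(data_path):
    """Return the expected metadata document path for a data file.

    Hypothetical helper, not part of geometamaker. Assumes the document
    sits next to the data file and appends '.yml' to the full filename,
    keeping the original extension.
    """
    return f'{data_path}.yml'


# The file to open in a text editor after running `geometamaker describe`:
print(sidecar_path('data/watershed_gura.shp'))
```

Because the original extension is kept, a raster and a vector that share a base name would get separate documents under this convention.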

Creating metadata for a collection of files:

Users can create a single metadata document to describe a directory of files. Files can be excluded with a regular expression, and the number of subdirectory levels to traverse can be limited with the depth argument (Python) or the -d flag (CLI).

Python

import geometamaker

collection_path = 'data/invest-sample-data'
metadata = geometamaker.describe_collection(collection_path,
                                            depth=2,
                                            exclude_regex=r'.*\.json$',
                                            describe_files=True)
metadata.write()

CLI

geometamaker describe -d 2 --exclude '.*\.json$' data/invest-sample-data

These examples will create data/invest-sample-data/invest-sample-data-metadata.yml as well as create individual .yml documents for each dataset within the directory.
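To preview what the exclusion pattern used above would filter out, it can be tried against a few names with Python's standard `re` module. This is only a sketch: the file names are made up, and it assumes a file is skipped whenever the pattern is found in its name. How geometamaker applies the pattern internally (for example, against the file name or the full path) is not stated here, so treat the result as an approximation.

```python
import re

# The same pattern passed as exclude_regex / --exclude above.
EXCLUDE_PATTERN = re.compile(r'.*\.json$')

# Hypothetical file names, for illustration only.
candidates = [
    'dem.tif',
    'watershed_gura.shp',
    'params.json',
    'notes.json.bak',
]

# Assumption: a file is skipped when the pattern is found in its name.
kept = [name for name in candidates if not EXCLUDE_PATTERN.search(name)]
print(kept)
```

The trailing `$` anchors the pattern to the end of the name, so `notes.json.bak` is kept while `params.json` is excluded.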

Overriding the default filename of the collection's YML document:

geometamaker.describe_collection(collection_path, target_filename='README.yml')

or

geometamaker describe data/invest-sample-data -o README.yml

These examples will create data/invest-sample-data/README.yml.
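The two output locations described in this section follow a simple pattern, sketched below. `collection_document_path` is a hypothetical helper for illustration, not a geometamaker function. It assumes the default name is the directory's own name followed by `-metadata.yml`, as in the invest-sample-data example, and that the document is always written inside the collection directory.

```python
import posixpath


def collection_document_path(collection_path, target_filename=None):
    """Return where a collection's metadata document is expected to land.

    Hypothetical helper, not part of geometamaker. Assumes the default
    name is '<directory name>-metadata.yml' and that the document is
    written inside the collection directory itself.
    """
    if target_filename is None:
        # normpath drops any trailing slash so basename is never empty.
        dirname = posixpath.basename(posixpath.normpath(collection_path))
        target_filename = f'{dirname}-metadata.yml'
    return posixpath.join(collection_path, target_filename)


print(collection_document_path('data/invest-sample-data'))
print(collection_document_path('data/invest-sample-data', 'README.yml'))
```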

Validating a metadata document:

If you have manually edited a .yml metadata document, it is a good idea to validate it for correct syntax, properties, and types.

Python
import geometamaker

document_path = 'data/watershed_gura.shp.yml'
error = geometamaker.validate(document_path)
print(error)
CLI
geometamaker validate data/watershed_gura.shp.yml

Validating all metadata documents in a directory:

Python
import geometamaker

directory_path = 'data/'
yaml_files, messages = geometamaker.validate_dir(directory_path)
for filepath, msg in zip(yaml_files, messages):
    print(f'{filepath}: {msg}')
CLI
geometamaker validate data

Configuring default values for metadata properties:

Users can create a "profile" that will apply some common properties to all datasets they describe. Profiles can include contact information and/or license information.

A profile can be saved to a configuration file so that it is reused every time you use geometamaker.

Python
import geometamaker
from geometamaker import models

contact = {
    'individual_name': 'bob'
}
license = {
    'title': 'CC-BY-4'
}

# Two different ways for setting profile attributes:
profile = models.Profile(contact=contact)  # keyword arguments
profile.set_license(**license)             # `set_*` methods

config = geometamaker.Config()
config.save(profile)

# The saved profile will automatically be applied during `describe`:
resource = geometamaker.describe('data/watershed_gura.shp')
CLI
geometamaker config

This will prompt the user to enter their profile information.
Also see geometamaker config --help.
