metadata creation for geospatial data

Introduction

GeoMetaMaker is a Python library for creating human- and machine-readable metadata for geospatial, tabular, and other data formats.

Supported datatypes include:

  • everything supported by GDAL
  • tabular formats supported by frictionless
  • compressed formats supported by frictionless

Installation

mamba install -c conda-forge geometamaker

Basic Usage

This library comes with a command-line interface (CLI) called geometamaker. Many of the examples below show how to use the Python interface, and then how to do the same thing, if possible, using the CLI.

Creating and adding metadata to a file:

Python
import geometamaker

# For a vector:
data_path = 'data/watershed_gura.shp'
vector_resource = geometamaker.describe(data_path)

vector_resource.set_title('My Dataset')
vector_resource.set_description('all about my dataset')
vector_resource.set_keywords(['hydrology', 'watersheds'])

vector_resource.set_field_description(
    'field_name',  # the name of an actual field in the vector's table
    description='something about the field',
    units='mm')
vector_resource.write()

# For a raster:
data_path = 'data/dem.tif'
raster_resource = geometamaker.describe(data_path)
raster_resource.set_band_description(
    1,  # a raster band index, starting at 1
    description='something about the band',
    units='mm')
raster_resource.write()

# For a CSV:
data_path = 'data/table.csv'
table_resource = geometamaker.describe(data_path)
table_resource.set_field_description(
    'field_name',  # the name of an actual field in the table
    description='something about the field',
    units='mm')
# A table does not have inherent spatial information, but the
# property may be set manually:
table_resource.set_spatial(raster_resource.spatial)
table_resource.write()

For a complete list of methods and attributes: https://geometamaker.readthedocs.io/en/latest/index.html

CLI
geometamaker describe data/watershed_gura.shp

The CLI does not provide options for setting metadata properties that require user input, such as keywords or field and band descriptions. If you create a metadata document with the CLI, you may wish to add these values manually by editing the watershed_gura.shp.yml file in a text editor.

Creating metadata for a collection of files:

Users can create a single metadata document to describe a directory of files. Some files can be excluded using a regular expression, and the number of subdirectory levels to traverse can be limited using the depth argument (Python) or the -d flag (CLI).
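Before describing a large directory, it can help to check which names an exclusion pattern would catch. The following is a minimal sketch using only the standard re module. The file names are hypothetical, and it assumes the pattern is searched against each file's path, which may differ from how geometamaker applies it internally.

```python
import re

# The exclusion pattern used in this section's examples.
exclude = re.compile(r'.*\.json$')

# Hypothetical file names, for illustration only.
candidates = [
    'watershed_gura.shp',
    'dem.tif',
    'args.json',
    'nested/params.json',
    'table.json.csv',
]

# Assumption: a file is skipped when the pattern matches its path.
excluded = [name for name in candidates if exclude.search(name)]
kept = [name for name in candidates if not exclude.search(name)]

print('excluded:', excluded)
print('kept:', kept)
```

Because the pattern is anchored with $, a name such as table.json.csv is kept: it merely contains ".json" rather than ending with it.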

Python

import geometamaker

collection_path = 'data/invest-sample-data'
metadata = geometamaker.describe_collection(collection_path,
                                            depth=2,
                                            exclude_regex=r'.*\.json$',
                                            describe_files=True)
metadata.write()

CLI

geometamaker describe -d 2 --exclude '.*\.json$' data/invest-sample-data

These examples will create data/invest-sample-data/invest-sample-data-metadata.yml as well as create individual .yml documents for each dataset within the directory.

Override the default filename of the collection's YML document

geometamaker.describe_collection(collection_path, target_filename='README.yml')

or

geometamaker describe data/invest-sample-data -o README.yml

These examples will create data/invest-sample-data/README.yml.
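The naming conventions described in this section can be sketched with pathlib. The two helper functions below are hypothetical and are not part of geometamaker. They only reproduce the document paths mentioned in this README, and the library's actual behavior may differ in edge cases.

```python
from pathlib import Path


def collection_document_path(collection_dir, target_filename=None):
    """Hypothetical helper: where a collection's metadata document lands."""
    collection_dir = Path(collection_dir)
    if target_filename is None:
        # Default seen in this README: <directory name>-metadata.yml
        target_filename = f'{collection_dir.name}-metadata.yml'
    return collection_dir / target_filename


def file_document_path(data_path):
    """Hypothetical helper: a dataset's document is its path plus '.yml'."""
    data_path = Path(data_path)
    return data_path.with_name(data_path.name + '.yml')


default_doc = collection_document_path('data/invest-sample-data')
readme_doc = collection_document_path(
    'data/invest-sample-data', target_filename='README.yml')
vector_doc = file_document_path('data/watershed_gura.shp')

print(default_doc.as_posix())
print(readme_doc.as_posix())
print(vector_doc.as_posix())
```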

Validating a metadata document:

If you have manually edited a .yml metadata document, it is a good idea to validate it for correct syntax, properties, and types.

Python
import geometamaker

document_path = 'data/watershed_gura.shp.yml'
error = geometamaker.validate(document_path)
print(error)

CLI
geometamaker validate data/watershed_gura.shp.yml

Validating all metadata documents in a directory:

Python
import geometamaker

directory_path = 'data/'
yaml_files, messages = geometamaker.validate_dir(directory_path)
for filepath, msg in zip(yaml_files, messages):
    print(f'{filepath}: {msg}')

CLI
geometamaker validate data
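When a directory holds many documents, it may be convenient to separate the ones that need attention from the rest. The sketch below works on the two parallel lists described above. The helper name and sample data are hypothetical, and it assumes that an empty message means no problem was reported, which should be checked against the geometamaker documentation.

```python
def summarize_validation(yaml_files, messages):
    """Hypothetical helper: split results into clean and flagged documents."""
    clean = []
    flagged = {}
    for filepath, msg in zip(yaml_files, messages):
        if msg:
            # Assumption: any non-empty message signals something to review.
            flagged[filepath] = str(msg)
        else:
            clean.append(filepath)
    return clean, flagged


# Hypothetical results standing in for the output of validate_dir.
yaml_files = ['data/dem.tif.yml', 'data/table.csv.yml']
messages = ['', 'hypothetical validation message']

clean, flagged = summarize_validation(yaml_files, messages)
print('clean:', clean)
for filepath, msg in flagged.items():
    print(f'needs review: {filepath}: {msg}')
```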

Configuring default values for metadata properties:

Users can create a "profile" that will apply some common properties to all datasets they describe. Profiles can include contact information and/or license information.

A profile can be saved to a configuration file so that it will be reused every time you use geometamaker.

Python
import geometamaker
from geometamaker import models

contact = {
    'individual_name': 'bob'
}
license = {
    'title': 'CC-BY-4'
}

# Two different ways for setting profile attributes:
profile = models.Profile(contact=contact)  # keyword arguments
profile.set_license(**license)             # `set_*` methods

config = geometamaker.Config()
config.save(profile)

# The saved profile will automatically be applied during `describe`:
resource = geometamaker.describe('data/watershed_gura.shp')

CLI
geometamaker config

This will prompt the user to enter their profile information.
Also see geometamaker config --help.
