Helper package for irv-datapkg workflow

Infrastructure Resilience Assessment Data Packages

Standalone workflow to create national scale open-data packages from global open datasets.

Setup

Get the latest code by cloning this repository:

git clone git@github.com:nismod/irv-datapkg.git

or

git clone https://github.com/nismod/irv-datapkg.git

Install Python and the required packages. We suggest using micromamba:

micromamba create -f environment.yml

Activate the environment:

micromamba activate datapkg

Run

The data packages are produced using a snakemake workflow.

The workflow expects ZENODO_TOKEN, CDSAPI_KEY and CDSAPI_URL to be set as environment variables before any workflow step is run.

If you are not interacting with Zenodo or the Copernicus Climate Data Store, these can be dummy strings, saved to files named after each variable:

echo "placeholder" > ZENODO_TOKEN
echo "https://cds-beta.climate.copernicus.eu/api" > CDSAPI_URL
echo "test" > CDSAPI_KEY

See Climate Data Store API docs and Zenodo API docs for access details.

Export the values from these files to the environment:

export ZENODO_TOKEN=$(cat ZENODO_TOKEN)
export CDSAPI_KEY=$(cat CDSAPI_KEY)
export CDSAPI_URL=$(cat CDSAPI_URL)
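As a quick sanity check before invoking snakemake, the sketch below reports which of the three variables are unset or empty. The variable names come from the text above; the helper `missing_env_vars` is hypothetical and not part of irv-datapkg.

```python
import os

# Variable names are taken from this README; the helper is illustrative only.
REQUIRED_VARS = ("ZENODO_TOKEN", "CDSAPI_KEY", "CDSAPI_URL")


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
example = {"ZENODO_TOKEN": "placeholder", "CDSAPI_KEY": "test"}
print(missing_env_vars(example))  # ['CDSAPI_URL']
```

Calling `missing_env_vars()` with no arguments checks the current process environment, so it only sees variables that were exported in the shell that launched Python.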

Before running the workflow for real, do a dry run to check what would be run to produce everything requested by the rule all:

snakemake --dry-run all

Run the workflow, asking for all, using 8 cores, with verbose log messages:

snakemake --cores 8 --verbose all

Upload and publish

To publish, first create a Zenodo token, save it and export it as the ZENODO_TOKEN environment variable.

Upload a single data package:

snakemake --cores 1 zenodo/GBR.deposited

Publish (this cannot be undone), either programmatically:

snakemake --cores 1 zenodo/GBR.published

or, after review online, through the Zenodo website (sandbox, live).
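To handle more than one package at a time, the targets can be assembled in a short script. This is a minimal sketch that assumes other packages follow the same `zenodo/<ISO3>.deposited` and `zenodo/<ISO3>.published` pattern shown for GBR; the helper names are hypothetical, and the command defaults to a dry run because publishing cannot be undone.

```python
def zenodo_targets(iso3_codes, stage="deposited"):
    """Build snakemake targets, assuming the zenodo/<ISO3>.<stage> pattern."""
    if stage not in ("deposited", "published"):
        raise ValueError(f"unknown stage: {stage!r}")
    return [f"zenodo/{code.upper()}.{stage}" for code in iso3_codes]


def snakemake_command(targets, cores=1, dry_run=True):
    """Return the argument list for a snakemake call (dry run by default)."""
    command = ["snakemake", "--cores", str(cores)]
    if dry_run:
        command.append("--dry-run")
    return command + list(targets)


print(snakemake_command(zenodo_targets(["GBR"])))
# ['snakemake', '--cores', '1', '--dry-run', 'zenodo/GBR.deposited']
```

The returned list can be passed to `subprocess.run(command, check=True)` once the dry-run output looks right.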

Post-publication

To get a quick list of DOIs from the Zenodo deposition JSON files:

cat zenodo/*.deposition.json | jq '.metadata.prereserve_doi.doi'
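Where jq is not available, the same lookup can be sketched in Python. The key path `metadata.prereserve_doi.doi` and the `zenodo/*.deposition.json` file pattern come from the command above; the function name `prereserved_dois` is hypothetical.

```python
import json
from pathlib import Path


def prereserved_dois(directory="zenodo"):
    """Map each *.deposition.json file name to its pre-reserved DOI (or None)."""
    dois = {}
    for path in sorted(Path(directory).glob("*.deposition.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        # Like the jq filter, a missing key gives None instead of an error.
        metadata = record.get("metadata") or {}
        dois[path.name] = (metadata.get("prereserve_doi") or {}).get("doi")
    return dois
```

Iterating over `prereserved_dois().items()` and printing each pair gives roughly the same list as the jq command, with the file name alongside each DOI.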

To generate records.csv with details of published packages:

python scripts/published_metadata.py

Development Notes

In case of warnings about GDAL_DATA not being set, try running:

export GDAL_DATA=$(gdal-config --datadir)

To format the workflow definition Snakefile:

snakefmt Snakefile

To format the Python helper scripts:

black scripts

Related work

These Python libraries may be a useful place to start analysis of the data in the packages produced by this workflow:

  • snkit helps clean network data
  • nismod-snail is designed to help implement infrastructure exposure, damage and risk calculations

The open-gira repository contains a larger workflow for global-scale open-data infrastructure risk and resilience analysis.

Acknowledgments

MIT License, Copyright (c) 2023 Tom Russell and irv-datapkg contributors

This research received funding from the FCDO Climate Compatible Growth Programme. The views expressed here do not necessarily reflect the UK government's official policies.
