Infrastructure Resilience Assessment Data Packages

Standalone workflow to create national scale open-data packages from global open datasets.

Setup

Get the latest code by cloning this repository:

git clone git@github.com:nismod/irv-datapkg.git

or

git clone https://github.com/nismod/irv-datapkg.git

Install Python and the required packages; we suggest using micromamba:

micromamba create -f environment.yml

Activate the environment:

micromamba activate datapkg

Run

The data packages are produced using a Snakemake workflow.
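A Snakemake workflow organises the build into rules with file targets. The fragment below is an illustrative sketch only, not the project's actual Snakefile (ISO_CODES is a hypothetical name; the real workflow defines many more rules), using the zenodo/GBR.deposited target name that appears later in this README:

```python
# Illustrative Snakefile fragment -- not the project's real workflow definition.
ISO_CODES = ["GBR"]

rule all:
    input:
        expand("zenodo/{iso}.deposited", iso=ISO_CODES)
```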

The workflow expects the ZENODO_TOKEN environment variable to be set before running any workflow steps.

If you are not interacting with Zenodo, this can be a dummy string:

echo "placeholder" > ZENODO_TOKEN

Export the token from the file to the environment:

export ZENODO_TOKEN=$(cat ZENODO_TOKEN)
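A script that talks to Zenodo can fail fast with a clear message when the token is missing. A minimal sketch (require_zenodo_token is a hypothetical helper, not part of this repository):

```python
import os


def require_zenodo_token() -> str:
    """Return ZENODO_TOKEN from the environment, or raise a clear error.

    Hypothetical helper: illustrates the check, not the repository's code.
    """
    token = os.environ.get("ZENODO_TOKEN")
    if not token:
        raise RuntimeError(
            "ZENODO_TOKEN must be set before running any workflow steps"
        )
    return token
```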

Before running the workflow for real, do a dry run to check which jobs would be executed to produce everything required by the rule all:

snakemake --dry-run all

Run the workflow, asking for all, using 8 cores, with verbose log messages:

snakemake --cores 8 --verbose all

Upload and publish

To publish, first create a Zenodo token, save it and export it as the ZENODO_TOKEN environment variable.

Upload a single data package:

snakemake --cores 1 zenodo/GBR.deposited

Publish (cannot be undone) either programmatically:

snakemake --cores 1 zenodo/GBR.published

Or, after review, through the Zenodo website (sandbox or live).

Post-publication

To get a quick list of DOIs from the Zenodo deposition JSON files:

cat zenodo/*.deposition.json | jq '.metadata.prereserve_doi.doi'
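If jq is not available, the same list can be produced in Python. A sketch assuming the deposition files follow the layout above (list_dois is a hypothetical helper, not part of this repository):

```python
import glob
import json


def list_dois(pattern: str = "zenodo/*.deposition.json") -> list[tuple[str, str]]:
    """Return (filename, DOI) pairs read from Zenodo deposition JSON files.

    Hypothetical helper: assumes each file has metadata.prereserve_doi.doi,
    as in the jq one-liner above.
    """
    pairs = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            deposition = json.load(f)
        pairs.append((path, deposition["metadata"]["prereserve_doi"]["doi"]))
    return pairs
```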

To generate records.csv with details of published packages:

python scripts/published_metadata.py
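The resulting records.csv can then be loaded for further analysis. A minimal sketch (read_records is a hypothetical helper; column names come from whatever header the script writes, which this sketch does not assume):

```python
import csv


def read_records(path: str = "records.csv") -> list[dict]:
    """Load published-package records as dicts keyed by the CSV header.

    Hypothetical helper: makes no assumption about the column names.
    """
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```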

Development Notes

In case of warnings about GDAL_DATA not being set, try running:

export GDAL_DATA=$(gdal-config --datadir)

To format the workflow definition Snakefile:

snakefmt Snakefile

To format the Python helper scripts:

black scripts

Related work

These Python libraries may be a useful place to start analysis of the data in the packages produced by this workflow:

  • snkit helps clean network data
  • nismod-snail is designed to help implement infrastructure exposure, damage and risk calculations

The open-gira repository contains a larger workflow for global-scale open-data infrastructure risk and resilience analysis.

Acknowledgments

MIT License, Copyright (c) 2023 Tom Russell and irv-datapkg contributors

This research received funding from the FCDO Climate Compatible Growth Programme. The views expressed here do not necessarily reflect the UK government's official policies.
