
Data management for Bioplatforms Australia projects

Project description

Bioplatforms Australia: CKAN ingest and sync

Usage

Primary usage information is contained in the comments at the top of the ingest/ingest.sh script, which is the gateway to synchronising the archive.

Generating CKAN schemas

bpa-ingest can generate ckanext-scheming schemas.

Usage:

$ bpa-ingest -p /tmp/ingest/ makeschema

Tracking metadata

Two types of tracking metadata are stored within this repository.

Google Drive metadata

The source of truth is "BPA Projects Data Transfer Summary", shared with BPA in Google Drive. This is maintained by the various project managers.

To update, use "File", "Download as", "CSV" within Google Sheets for each sheet, then replace the corresponding CSV files in track-metadata/google-drive.

BPAM metadata

The source of truth is the BPA Metadata app.

To update, export each of the tracking datasets as CSV using the export button, then replace the files in track-metadata/bpam.
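
After replacing the CSV files, a quick sanity check can catch an empty or truncated export before it is committed. The following is a minimal sketch, not part of bpa-ingest: the function name summarise_csvs is hypothetical, and it assumes the first row of each export is a header row.

```python
import csv
from pathlib import Path


def summarise_csvs(directory):
    """Return {filename: data_row_count} for every CSV file in `directory`."""
    summary = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # utf-8-sig tolerates the byte-order mark that some exports include.
        with path.open(newline="", encoding="utf-8-sig") as fh:
            rows = list(csv.reader(fh))
        # Assumption: the first row is a header. An empty file counts as zero.
        summary[path.name] = max(len(rows) - 1, 0)
    return summary
```

Calling summarise_csvs("track-metadata/google-drive") and summarise_csvs("track-metadata/bpam") from the repository root and comparing the counts with the previous export gives a rough indication that nothing was lost.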

AWS Lambda

We are gradually adding AWS Lambda functions to this project.

Each Lambda function will have a handler() function which acts as its entrypoint. These are being collected in bpaingest/handlers/.

Lambda functions should load their configuration from S3, from a bucket and key set via environment variables. This configuration should be encrypted using AWS KMS. The Lambda function can then be granted permission to decrypt the configuration once it has been read from S3.

To store encrypted data at a key, this pattern works:

$ aws kms encrypt --key-id <key> --plaintext fileb://config.json --output text --query CiphertextBlob | base64 --decode > config.enc
$ aws s3 cp config.enc s3://bucket/key
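
On the reading side, a handler could fetch and decrypt that object roughly as follows. This is a minimal sketch under stated assumptions, not the code in bpaingest/handlers/: the environment variable names CONFIG_BUCKET and CONFIG_KEY and the helper load_config are hypothetical, and the configuration is assumed to be a JSON object encrypted with a symmetric KMS key.

```python
import json
import os


def load_config(s3_client, kms_client, bucket, key):
    """Fetch a KMS-encrypted JSON object from S3 and return it decoded."""
    # Read the raw ciphertext that was uploaded with `aws s3 cp`.
    ciphertext = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    # With a symmetric key, KMS identifies the key from the ciphertext,
    # so no key id is passed here.
    plaintext = kms_client.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
    return json.loads(plaintext)


def handler(event, context):
    """Lambda entrypoint: load the configuration, then do the real work."""
    # boto3 is imported here so that load_config can be tested without it.
    import boto3

    config = load_config(
        boto3.client("s3"),
        boto3.client("kms"),
        os.environ["CONFIG_BUCKET"],  # hypothetical variable name
        os.environ["CONFIG_KEY"],  # hypothetical variable name
    )
    # Placeholder: a real handler would act on `config` and `event` here.
    return {"config_keys": sorted(config)}
```

Passing the clients in as arguments keeps load_config easy to exercise with stand-in objects. The Lambda execution role would need s3:GetObject on the object and kms:Decrypt on the key for this to succeed.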

Local development:

For development, you will need a local dev environment for bpa-ckan (consider using dockercompose-bpa-ckan to set this up).

Before you start, ensure you have installed Python 3.7.

bpa-ingest currently runs from a Python virtualenv on the command line. To initialise a dev working environment:

cd bpa-ingest
git checkout next_release
git pull origin next_release
python3 -m venv ~/.virtual/bpa-ingest
. ~/.virtual/bpa-ingest/bin/activate
pip install -r requirements.txt
python setup.py install
python setup.py develop

Then, with the virtualenv still active, source the environment variables (including the API key) before running the ingest:

# if not already in virtual env:
. ~/.virtual/bpa-ingest/bin/activate

# source the environment variables
. /path/to/your/bpa.env

# dump the target state of the data portal to a JSON file for one data type
bpa-ingest -p /tmp/dump-metadata/ dumpstate test.json --dump-re 'omg-genomics-ddrad'

Look in /tmp/dump-metadata/ and you will see the working set of metadata sources used by the tool. Remember to delete the contents of the directory you are dumping to (here /tmp/dump-metadata/) before re-running the command:

rm -Rf /tmp/dump-metadata/
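
To get a quick overview of the JSON file written by dumpstate, something like the following can help. It is a minimal sketch that makes no assumption about the dump's internal structure; the helper name describe_dump is hypothetical, and the path should point at wherever test.json was written.

```python
import json


def describe_dump(path):
    """Return a one-line description of the top-level JSON value in `path`."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return f"object with keys: {', '.join(sorted(data))}"
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return f"scalar of type {type(data).__name__}"
```

For example, print(describe_dump("test.json")) reports whether the dump is an object or an array and how large it is, which is a reasonable starting point before inspecting it in detail.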

Download files


Source Distribution

bpaingest-6.4.2.tar.gz (121.0 kB)

Uploaded Source

Built Distribution

bpaingest-6.4.2-py3-none-any.whl (147.7 kB)

Uploaded Python 3

File details

Details for the file bpaingest-6.4.2.tar.gz.

File metadata

  • Download URL: bpaingest-6.4.2.tar.gz
  • Upload date:
  • Size: 121.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.5 CPython/3.8.2 Darwin/19.4.0

File hashes

Hashes for bpaingest-6.4.2.tar.gz
Algorithm Hash digest
SHA256 04bc1d7d2e994808e3b5bd0c3f00cdc5ce81ec14a2d39dbbbb0a4c76b4d62ee4
MD5 cc2ad7a60ba7de006bc20605be6930f0
BLAKE2b-256 79855a8553f7f7eff7347ff1ffe03a3c7e80e50a0bc91231e93110852daaa529


File details

Details for the file bpaingest-6.4.2-py3-none-any.whl.

File metadata

  • Download URL: bpaingest-6.4.2-py3-none-any.whl
  • Upload date:
  • Size: 147.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.0.5 CPython/3.8.2 Darwin/19.4.0

File hashes

Hashes for bpaingest-6.4.2-py3-none-any.whl
Algorithm Hash digest
SHA256 548cdcc887c69399570781fd1d8b9a8179349f5a4a56ba84e27c7f9128da1092
MD5 cd08ee4227db9df045d20ed11c11a79a
BLAKE2b-256 4cd651b6b21741dbf7399a8341ab8501cbd687500d7276c217ce2f471713411f
