Data management for Bioplatforms Australia projects
Project description
Bioplatforms Australia: CKAN ingest and sync
Usage
Primary usage information is contained in the comments at the top of the ingest/ingest.sh script, which is the gateway to synchronising the archive.
Generating CKAN schemas
bpa-ingest can generate ckanext-scheming schemas.
Usage:
$ bpa-ingest -p /tmp/ingest/ makeschema
Tracking metadata
Two types of tracking metadata are stored within this repository.
Google Drive metadata
The source of truth is "BPA Projects Data Transfer Summary", shared with BPA in Google Drive. This is maintained by the various project managers.
To update, use "File", "Download as", "CSV" within Google Sheets, and replace the CSV sheets in track-metadata/google-drive.
BPAM metadata
The source of truth is the BPA Metadata app.
To update, export each of the tracking datasets listed below as CSV using the export button, then replace the files in track-metadata/bpam:
- https://data.bioplatforms.com/bpa/adminsepsis/genomicsmiseqtrack/
- https://data.bioplatforms.com/bpa/adminsepsis/genomicspacbiotrack/
- https://data.bioplatforms.com/bpa/adminsepsis/metabolomicslcmstrack/
- https://data.bioplatforms.com/bpa/adminsepsis/proteomicsms1quantificationtrack/
- https://data.bioplatforms.com/bpa/adminsepsis/proteomicsswathmstrack/
- https://data.bioplatforms.com/bpa/adminsepsis/transcriptomicshiseqtrack/
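After replacing the CSV files in either tracking directory, a quick sanity check can catch an empty or truncated export before it is committed. The following is a minimal sketch, not part of bpa-ingest: the function name csv_row_counts is hypothetical, and it assumes the exports are UTF-8 encoded and carry a .csv extension.

```python
import csv
from pathlib import Path


def csv_row_counts(directory):
    """Map each CSV file name in `directory` to its row count (header included)."""
    counts = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # newline="" lets the csv module handle quoted fields containing newlines.
        # UTF-8 is an assumption about how the exports are encoded.
        with path.open(newline="", encoding="utf-8") as fh:
            counts[path.name] = sum(1 for _ in csv.reader(fh))
    return counts
```

Calling csv_row_counts("track-metadata/bpam") or csv_row_counts("track-metadata/google-drive") from the repository root returns a dictionary of file names and row counts; a count of 0 or 1 suggests an export that needs to be repeated.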
AWS Lambda
We are gradually adding AWS Lambda functions to this project.
Each Lambda function will have a handler() function which acts as an entrypoint. These are being collected in bpaingest/handlers/
Lambda functions should load their configuration from S3, from a bucket and key configured via environment variables. This configuration should be encrypted using AWS KMS. The Lambda function can be granted privileges to decrypt the configuration once it has been read from S3.
To store the encrypted configuration at an S3 key, this pattern works:
$ aws kms encrypt --key-id <key> --plaintext fileb://config.json --output text --query CiphertextBlob | base64 --decode > config.enc
$ aws s3 cp config.enc s3://bucket/key
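The Lambda side of this pattern can be sketched as follows. This is an illustration under stated assumptions, not the code in bpaingest/handlers/: the environment variable names CONFIG_BUCKET and CONFIG_KEY and the helper load_config are hypothetical, and the configuration is assumed to be JSON, as in the config.json example above. The boto3 calls used are get_object on the S3 client and decrypt on the KMS client.

```python
import json
import os


def load_config(s3_client, kms_client, bucket, key):
    """Read the encrypted config object from S3 and decrypt it with KMS."""
    obj = s3_client.get_object(Bucket=bucket, Key=key)
    ciphertext = obj["Body"].read()
    # With a symmetric KMS key, the key ID is carried inside the ciphertext
    # blob, so it does not need to be passed to decrypt().
    plaintext = kms_client.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
    return json.loads(plaintext)


def handler(event, context):
    """Hypothetical Lambda entrypoint showing where the config is loaded."""
    # Imported here so load_config can be exercised without boto3 installed.
    import boto3

    config = load_config(
        boto3.client("s3"),
        boto3.client("kms"),
        os.environ["CONFIG_BUCKET"],  # assumed variable name
        os.environ["CONFIG_KEY"],  # assumed variable name
    )
    # Return only the key names so secrets never reach the invocation result.
    return {"config_keys": sorted(config)}
```

Passing the clients in as arguments keeps load_config testable with stand-in objects. The function's execution role would need s3:GetObject on the object and kms:Decrypt on the key.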
Local development:
For development, you will need a local dev environment for bpa-ckan (consider dockercompose-bpa-ckan for this).
Before you start, ensure you have installed Python 3.7.
bpa-ingest is currently just a Python virtualenv used from the command line, so to initialise a working dev environment:
cd bpa-ingest
git checkout next_release
git pull origin next_release
python3 -m venv ~/.virtual/bpa-ingest
. ~/.virtual/bpa-ingest/bin/activate
pip install -r requirements.txt
python setup.py install
python setup.py develop
Then (ensuring that you are still in the Python virtual env) source the environment variables (including the API key) before running the ingest:
# if not already in virtual env:
. ~/.virtual/bpa-ingest/bin/activate
# source the environment variables
. /path/to/your/bpa.env
# dump the target state of the data portal to a JSON file for one data type
bpa-ingest -p /tmp/dump-metadata/ dumpstate test.json --dump-re 'omg-genomics-ddrad'
Look in /tmp/dump-metadata/ and you will see the working set of metadata sources used by the tool. Remember to delete the contents of the directory you are dumping to before re-running the command:
rm -Rf /tmp/dump-metadata/
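The clean-then-dump cycle can also be scripted so the working directory is always emptied before a re-run. This is a minimal sketch rather than part of bpa-ingest: the helper names are hypothetical, the command line mirrors the dumpstate invocation shown above, and recreating the empty directory is an assumption, since the README does not say whether bpa-ingest creates it itself.

```python
import shutil
import subprocess
from pathlib import Path


def reset_work_dir(work_dir):
    """Remove `work_dir` and everything under it, then recreate it empty."""
    path = Path(work_dir)
    shutil.rmtree(path, ignore_errors=True)
    path.mkdir(parents=True, exist_ok=True)
    return path


def dumpstate_command(work_dir, out_file, dump_re):
    """Build the argument list for the dumpstate invocation."""
    return ["bpa-ingest", "-p", str(work_dir), "dumpstate", out_file,
            "--dump-re", dump_re]


def run_dumpstate(work_dir, out_file, dump_re):
    """Empty the working directory, then run dumpstate inside the active virtualenv."""
    reset_work_dir(work_dir)
    # check=True raises CalledProcessError if bpa-ingest exits non-zero.
    subprocess.run(dumpstate_command(work_dir, out_file, dump_re), check=True)
```

For example, run_dumpstate("/tmp/dump-metadata/", "test.json", "omg-genomics-ddrad") reproduces the command above after clearing the directory. The environment variables still need to be sourced in the shell that launches Python.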