
A helper to ingest data into SDAP


SDAP manager for ingestion of datasets

Prerequisites

Python 3

Install Anaconda for Python 3, for example using the graphical installer for macOS:

https://www.anaconda.com/distribution/#macos

Git LFS (for development)

Git LFS is required when deploying from the git repository, see https://git-lfs.github.com/

If it is not available, you will need to obtain the netCDF test files by other means, should you want to run the tests.
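With Git LFS installed, the test files can usually be fetched right after cloning. A minimal sketch, assuming the netCDF test files are tracked with LFS in this repository:

$ git lfs install
$ git lfs pull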

Nexus deployed on a Kubernetes cluster

See project https://github.com/apache/incubator-sdap-nexus

$ helm install nexus .  --namespace=sdap --dependency-update -f ~/overridden-nexus-values.yml 
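Before running the ingestion you can verify that the nexus pods are up; a quick check, assuming the sdap namespace used in the helm command above:

$ kubectl get pods -n sdap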

Install, Configure and run

Install

Stay logged in as a regular user:

$ pip install sdap-ingest-manager

Configure the ingestion system

Note the message at the end of the installation output:

--------------------------------------------------------------
Now, create configuration files in
***/<some path>/.sdap_ingest_manager***
 Use templates and examples provided there
--------------------------------------------------------------

If the path does not appear in the installation stdout, you can find it with the command:

python -c "import sys; print(f'{sys.prefix}/.sdap_ingest_manager')"

Use the path shown in the message and create your own configuration files:

$ cd /<some path>/.sdap_ingest_manager
$ cp sdap_ingest_manager.ini.default sdap_ingest_manager.ini

Edit and update the newly created files by following instructions in the comments.

Note that the .ini.default file is used for any value not set in the .ini file, so your .ini file can be simplified to contain only your specific configuration. Do not put your specific configuration in the .ini.default file, as it will be overwritten when you upgrade the package.

Example of a simplified .ini file:

[COLLECTIONS_YAML_CONFIG]
yaml_file = collections.yml

[OPTIONS]
# set to False to actually call the ingestion command for each granule
dry_run = False
# set to True to automatically list the granules as seen on the NFS server when the NFS volumes are mounted on the local file system.
deconstruct_nfs = True
# number of parallel ingestion pods on kubernetes (1 per granule)
parallel_pods = 2

[INGEST]
# kubernetes namespace where the sdap cluster is deployed
kubernetes_namespace = nexus-dev

Configure the collections

The collections can be configured in a local YAML file referenced in the sdap_ingest_manager.ini file.

They can also be configured in a Google spreadsheet.

If both are configured, the local YAML file is used.
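The templates installed under the .sdap_ingest_manager directory show the expected fields for collections.yml. As a purely illustrative sketch (the keys below are assumptions for illustration, not the confirmed schema), an entry could look like:

collections:
  - id: avhrr-oi-analysed-sst     # collection identifier in SDAP (assumed key)
    path: /data/avhrr_oi/*.nc     # glob matching the granule files (assumed key)
    variable: analysed_sst        # netCDF variable to ingest (assumed key)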

Run the ingestion

To run the ingestion on the list of configured collections:

$ run_collections

The number of parallel jobs can be updated while the process runs by editing the sdap_ingest_manager.ini file.

If interrupted (killed), the process will resume where it left off when restarted.
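To follow the parallel ingestion pods while the process runs, you can watch the namespace configured in the [INGEST] section (nexus-dev in the example above); a minimal sketch:

$ kubectl get pods -n nexus-dev --watch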

For developers

Deploy the project

$ bash
$ git clone ...
$ cd sdap_ingest_manager
$ python -m venv venv
$ source ./venv/bin/activate
$ pip install .

Note that the command pip install -e . does not work, as it does not deploy the configuration files.

Update the project

Update the code and the tests with your favorite IDE (e.g. PyCharm).

Test and create the package

Change the version in the setup.py file, then:

$ python setup.py test
$ git tag <version>
$ git push origin <version>

The release will be automatically pushed to PyPI through a GitHub action.
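If you want to inspect the artifacts locally before tagging, they can be built with the standard setuptools commands (this is only a local check; the actual release is produced by the GitHub action):

$ python setup.py sdist bdist_wheel
$ ls dist/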
