A package to manage Google Cloud Data Catalog custom entries

datacatalog-custom-entries-manager

Python package to manage Google Cloud Data Catalog custom entries, loading metadata from external sources. Currently supports the CSV and JSON file formats.

1. Environment setup

1.1. Python + virtualenv

Using virtualenv is optional, but strongly recommended unless you use Docker.

1.1.1. Install Python 3.6+

1.1.2. Create a folder

This is recommended so that all related files reside in one place, which makes the instructions below easier to follow.

mkdir ./datacatalog-custom-entries-manager
cd ./datacatalog-custom-entries-manager

All paths starting with ./ in the next steps are relative to the datacatalog-custom-entries-manager folder.

1.1.3. Create and activate an isolated Python environment

pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate
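
To confirm that the environment is active before installing the package, a quick check along these lines may help. It is a minimal sketch, not part of the package: the function names are hypothetical, and the 3.6 minimum simply mirrors step 1.1.1.

```python
import sys


def python_version_ok(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the interpreter meets the minimum version (3.6 per step 1.1.1)."""
    return tuple(version_info[:2]) >= minimum


def in_virtual_environment():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("Python version OK:", python_version_ok())
    print("Virtual environment active:", in_virtual_environment())
```

If the second line reports False after running the activate command, the shell is probably still using the system interpreter.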

1.1.4. Install the package

pip install --upgrade datacatalog-custom-entries-manager

1.2. Docker

Docker can be used as an alternative way to run datacatalog-custom-entries-manager. In that case, skip the virtualenv setup instructions above.

1.2.1. Get the source code

git clone https://github.com/ricardolsmendes/datacatalog-custom-entries-manager
cd ./datacatalog-custom-entries-manager

1.3. Auth credentials

1.3.1. Create a service account and grant it the roles below

  • DataCatalog entryGroup Owner
  • DataCatalog entry Owner
  • Data Catalog Viewer

1.3.2. Download a JSON key and save it as

  • ./credentials/datacatalog-custom-entries-manager.json

1.3.3. Set the environment variables

This step can be skipped if you're using Docker.

export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-custom-entries-manager.json
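
Before running a sync, it can be useful to verify that the variable points to a readable service account key. The sketch below is an illustration only: check_credentials is a hypothetical helper, and it assumes the usual layout of a service account JSON key (a "type" of "service_account" plus "client_email" and "private_key" fields). It does not contact Google Cloud or verify the granted roles.

```python
import json
import os


def check_credentials(path=None):
    """Return a list of problems found with the key file (empty if none)."""
    # Fall back to the environment variable set in step 1.3.3.
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except (OSError, ValueError) as error:
        return [f"cannot read JSON: {error}"]
    if not isinstance(key, dict):
        return ["key file does not contain a JSON object"]

    problems = []
    # Assumption: service account keys carry these three fields.
    if key.get("type") != "service_account":
        problems.append("'type' is not 'service_account'")
    for field in ("client_email", "private_key"):
        if not key.get(field):
            problems.append(f"missing field: {field}")
    return problems


if __name__ == "__main__":
    for problem in check_credentials():
        print("Problem:", problem)
```

An empty result only means the file looks like a service account key; permission errors can still surface when the sync command runs.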

2. Manage Custom Entries

2.1. Synchronize Data Catalog

2.1.1. To a CSV file

  • SAMPLE INPUT
  1. sample-input/csv for reference;
  2. Data Catalog Sample Custom Entries (Google Sheets) might help to create/export a CSV file.
  • COMMANDS

Python + virtualenv

datacatalog-custom-entries sync \
  --csv-file <CSV-FILE-PATH> \
  --project-id <YOUR-PROJECT-ID> --location-id <YOUR-LOCATION-ID>

Docker

docker build --rm --tag datacatalog-custom-entries-manager .
docker run --rm --tty \
  --volume <CREDENTIALS-FILE-FOLDER>:/credentials --volume <CSV-FILE-FOLDER>:/data \
  datacatalog-custom-entries-manager sync \
  --csv-file /data/<CSV-FILE-PATH> \
  --project-id <YOUR-PROJECT-ID> --location-id <YOUR-LOCATION-ID>

2.1.2. To a JSON file

  • SAMPLE INPUT
  1. sample-input/json for reference;
  • COMMANDS

Python + virtualenv

datacatalog-custom-entries sync \
  --json-file <JSON-FILE-PATH> \
  --project-id <YOUR-PROJECT-ID> --location-id <YOUR-LOCATION-ID>
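
When the sync is triggered from a script rather than typed by hand, the command can be assembled programmatically. The following is a rough sketch that uses only the flags documented in this section (--csv-file, --json-file, --project-id, --location-id). The helper names are hypothetical, and choosing the flag from the file extension is an assumption of this example, not a behavior of the package.

```python
import os
import subprocess

# Flags documented in this README, keyed by file extension (assumed mapping).
FLAGS = {".csv": "--csv-file", ".json": "--json-file"}


def build_sync_command(input_file, project_id, location_id):
    """Build the argument list for a datacatalog-custom-entries sync call."""
    extension = os.path.splitext(input_file)[1].lower()
    if extension not in FLAGS:
        raise ValueError(f"unsupported file type: {input_file}")
    return [
        "datacatalog-custom-entries", "sync",
        FLAGS[extension], input_file,
        "--project-id", project_id,
        "--location-id", location_id,
    ]


def run_sync(input_file, project_id, location_id):
    """Run the sync command, failing early if the input file is missing."""
    if not os.path.isfile(input_file):
        raise FileNotFoundError(input_file)
    # check=True raises CalledProcessError when the CLI exits with an error.
    return subprocess.run(
        build_sync_command(input_file, project_id, location_id), check=True
    )
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.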

Docker

docker build --rm --tag datacatalog-custom-entries-manager .
docker run --rm --tty \
  --volume <CREDENTIALS-FILE-FOLDER>:/credentials --volume <JSON-FILE-FOLDER>:/data \
  datacatalog-custom-entries-manager sync \
  --json-file /data/<JSON-FILE-PATH> \
  --project-id <YOUR-PROJECT-ID> --location-id <YOUR-LOCATION-ID>
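
The Docker commands can be assembled in a similar way. This sketch is an illustration under stated assumptions: build_docker_sync_command is a hypothetical helper, it mounts the input file's parent folder at /data (a simplification of the <CSV-FILE-FOLDER> / <JSON-FILE-FOLDER> placeholders above), and it converts host folders to absolute paths, which bind mounts accept across Docker versions. The image tag matches the one used in the docker build command.

```python
import os

# Image tag used by the "docker build" command in this README.
IMAGE = "datacatalog-custom-entries-manager"


def build_docker_sync_command(credentials_folder, input_file, project_id, location_id):
    """Build the argument list for a "docker run ... sync" call."""
    input_file = os.path.abspath(input_file)
    data_folder, file_name = os.path.split(input_file)
    # Assumption: anything that is not a .json file is treated as CSV.
    flag = "--json-file" if file_name.lower().endswith(".json") else "--csv-file"
    return [
        "docker", "run", "--rm", "--tty",
        "--volume", f"{os.path.abspath(credentials_folder)}:/credentials",
        "--volume", f"{data_folder}:/data",
        IMAGE, "sync",
        # Inside the container the file is visible under the /data mount.
        flag, f"/data/{file_name}",
        "--project-id", project_id,
        "--location-id", location_id,
    ]
```

The resulting list can be passed to subprocess.run once the image has been built.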
