A package to manage Google Cloud Data Catalog tags, loading metadata from external sources

datacatalog-tag-manager

A Python package to manage Google Cloud Data Catalog tags, loading metadata from external sources.

1. Environment setup

1.1. Python + virtualenv

Using virtualenv is optional, but strongly recommended unless you use Docker.

1.1.1. Install Python 3.6+

1.1.2. Create a folder

This is recommended so that all related files reside in the same place, which makes the instructions below easier to follow.

mkdir ./datacatalog-tag-manager
cd ./datacatalog-tag-manager

All paths starting with ./ in the next steps are relative to the datacatalog-tag-manager folder.

1.1.3. Create and activate an isolated Python environment

pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate

1.1.4. Install the package

pip install --upgrade datacatalog-tag-manager
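
After the install, a quick sanity check of the environment can save some debugging later. The snippet below is only a sketch: the helper names are made up for illustration, the virtual environment detection relies on the common `sys.base_prefix` / `sys.real_prefix` idiom, and the installed version lookup assumes `importlib.metadata` is available (Python 3.8+), returning `None` otherwise.

```python
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.6 or newer."""
    return tuple(version_info[:2]) >= (3, 6)


def in_virtual_environment():
    """Best-effort check for an active virtualenv/venv (common idiom, not official API)."""
    # Older virtualenv releases set sys.real_prefix; newer ones and venv set sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def installed_version(dist_name="datacatalog-tag-manager"):
    """Return the installed version of dist_name, or None if it cannot be determined."""
    try:
        from importlib import metadata  # standard library from Python 3.8 on
    except ImportError:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.6+:", python_is_supported())
    print("Virtual environment active:", in_virtual_environment())
    print("datacatalog-tag-manager version:", installed_version())
```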

1.2. Docker

Docker may be used as an alternative way to run the script. In that case, disregard the virtualenv setup instructions above.

1.2.1. Get the source code

git clone https://github.com/ricardolsmendes/datacatalog-tag-manager
cd ./datacatalog-tag-manager

1.3. Auth credentials

1.3.1. Create a service account and grant it the roles below

  • BigQuery Metadata Viewer
  • Data Catalog TagTemplate User
  • A custom role with bigquery.datasets.updateTag and bigquery.tables.updateTag permissions

1.3.2. Download a JSON key and save it as

  • ./credentials/datacatalog-tag-manager.json

1.3.3. Set the environment variables

This step may be skipped if you're using Docker.

export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-tag-manager.json
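
A misplaced key file is an easy mistake to make at this point, so a small pre-flight check can be useful. The sketch below is not part of the package; `check_credentials` is a hypothetical helper that only confirms the variable is set, the file exists, and it parses as a service account key (such keys carry `"type": "service_account"`). It does not verify the granted roles.

```python
import json
import os


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return None if the key file looks usable, otherwise a short problem description."""
    path = os.environ.get(env_var)
    if not path:
        return f"{env_var} is not set"
    # A relative path such as ./credentials/... is resolved against the current directory.
    if not os.path.isfile(path):
        return f"{path} does not exist or is not a file"
    try:
        with open(path, encoding="utf-8") as key_file:
            key = json.load(key_file)
    except ValueError:  # covers json.JSONDecodeError and decoding errors
        return f"{path} is not valid JSON"
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return f"{path} does not look like a service account key"
    return None


if __name__ == "__main__":
    problem = check_credentials()
    print(problem or "Credentials file looks usable")
```

Because the path in the export above is relative, run the check from the datacatalog-tag-manager folder.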

2. Load Tags from CSV file

2.1. Create a CSV file representing the Tags to be created

Each Tag spans as many lines as required to represent all of its fields, one line per field. The columns are described as follows:

Column          | Description                                            | Mandatory
linked_resource | Full name of the asset the Entry refers to.            | Y
template_name   | Resource name of the Tag Template for the Tag.         | Y
column          | Attach Tags to a column belonging to the Entry schema. | N
field_id        | Id of the Tag field.                                   | Y
field_value     | Value of the Tag field.                                | Y
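
To make the layout concrete, here is a minimal sketch that renders such a file with Python's `csv` module. It assumes the file starts with a header row using the column names above. Every value is a placeholder: `PROJECT_ID`, `DATASET_ID`, `TABLE_ID`, `LOCATION`, `TEMPLATE_ID`, and the field ids are hypothetical and must be replaced with names from your own project and Tag Template. The first two rows share the same resource and template, so they are meant to describe two fields of one Tag; the third attaches a Tag to a column.

```python
import csv
import io

COLUMNS = ["linked_resource", "template_name", "column", "field_id", "field_value"]

# Placeholder names only; none of these identifiers come from a real project.
RESOURCE = "//bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_ID"
TEMPLATE = "projects/PROJECT_ID/locations/LOCATION/tagTemplates/TEMPLATE_ID"

SAMPLE_ROWS = [
    # Two lines for the same resource and template: two fields of a single Tag.
    {"linked_resource": RESOURCE, "template_name": TEMPLATE, "column": "",
     "field_id": "FIELD_ID_1", "field_value": "VALUE_1"},
    {"linked_resource": RESOURCE, "template_name": TEMPLATE, "column": "",
     "field_id": "FIELD_ID_2", "field_value": "VALUE_2"},
    # The optional 'column' value attaches the Tag to a column of the Entry schema.
    {"linked_resource": RESOURCE, "template_name": TEMPLATE, "column": "COLUMN_NAME",
     "field_id": "FIELD_ID_1", "field_value": "VALUE_3"},
]


def render_csv(rows):
    """Render rows as CSV text with a header line, in the column order of the table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_csv(SAMPLE_ROWS), end="")
```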

TIPS

2.2. Run the datacatalog-tag-manager script

  • Python + virtualenv
datacatalog-tag-manager create-tags --csv-file CSV_FILE_PATH
  • Docker
docker build --rm --tag datacatalog-tag-manager .
docker run --rm --tty \
  --volume CREDENTIALS_FILE_FOLDER:/credentials --volume CSV_FILE_FOLDER:/data \
  datacatalog-tag-manager create-tags --csv-file /data/CSV_FILE_NAME
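
Before running either command, it may help to check the CSV locally. The sketch below is a hypothetical helper, not part of the package: it assumes a header row with the column names from section 2.1 and flags lines where a column marked mandatory there is empty. How the script itself reacts to such lines is not covered here.

```python
import csv
import io

MANDATORY = ("linked_resource", "template_name", "field_id", "field_value")


def find_problems(csv_text):
    """Return a list of human-readable problems found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing_columns = [name for name in MANDATORY if name not in header]
    if missing_columns:
        return ["missing column(s): " + ", ".join(missing_columns)]

    problems = []
    for row in reader:
        for name in MANDATORY:
            # Short rows yield None for absent cells, hence the 'or ""'.
            if not (row[name] or "").strip():
                problems.append(f"line {reader.line_num}: '{name}' is empty")
    return problems


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        with open(sys.argv[1], newline="", encoding="utf-8") as csv_file:
            for problem in find_problems(csv_file.read()) or ["no problems found"]:
                print(problem)
```

Saved under a name of your choice, it takes the CSV path as its only argument.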

Download files

Source Distribution

datacatalog-tag-manager-0.1.4.tar.gz (6.7 kB)

Built Distribution

datacatalog_tag_manager-0.1.4-py2.py3-none-any.whl (9.2 kB, Python 2 and Python 3)

File details

Details for the file datacatalog-tag-manager-0.1.4.tar.gz.

File metadata

  • Download URL: datacatalog-tag-manager-0.1.4.tar.gz
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for datacatalog-tag-manager-0.1.4.tar.gz

Algorithm   | Hash digest
SHA256      | e6615512607301886e5f73b7801afb50f4921dd3bf862f9b012e9324cb2b0939
MD5         | 53e64a45cec02a0d8f107a03ef1f085a
BLAKE2b-256 | a61d0f0fae852567e6fefd33646aad590e2cf12bf21f3e1300653ed5e4c0e30e

File details

Details for the file datacatalog_tag_manager-0.1.4-py2.py3-none-any.whl.

File metadata

  • Download URL: datacatalog_tag_manager-0.1.4-py2.py3-none-any.whl
  • Size: 9.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.6.9

File hashes

Hashes for datacatalog_tag_manager-0.1.4-py2.py3-none-any.whl

Algorithm   | Hash digest
SHA256      | 4e6ac7323823b5c6f5d3b647dce20e6930afea55b7eb29a132b9319fef71a8d5
MD5         | 911b4cbe5ae388c697f0258b6dbc8ea3
BLAKE2b-256 | 81051c60bee1baad952e8f1b463faa05146c8dda643d83ed3cd1be8688a51211
