A package to manage Google Cloud Data Catalog Tag export scripts

Datacatalog Tag Exporter

A Python package to manage Google Cloud Data Catalog Tag export scripts.

Disclaimer: This is not an officially supported Google product.

Executing in Cloud Shell

# Set your service account credentials. For instructions, see 1.3. Auth credentials.
# The file name is only a suggestion; name it according to your own conventions.
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-tag-exporter-sa.json

# Install datacatalog-tag-exporter 
pip3 install datacatalog-tag-exporter --user

# Add to your PATH
export PATH=~/.local/bin:$PATH

# Look for available commands
datacatalog-tag-exporter --help

1. Environment setup

1.1. Python + virtualenv

Using virtualenv is optional, but strongly recommended unless you use Docker.

1.1.1. Install Python 3.6+

1.1.2. Get the source code

git clone https://github.com/mesmacosta/datacatalog-tag-exporter
cd ./datacatalog-tag-exporter

All paths starting with ./ in the next steps are relative to the datacatalog-tag-exporter folder.

1.1.3. Create and activate an isolated Python environment

pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate

1.1.4. Install the package

pip install --upgrade .

1.2. Docker

Docker may be used as an alternative way to run the script. In that case, disregard the virtualenv setup instructions.

1.3. Auth credentials

1.3.1. Create a service account and grant it the roles below

  • Data Catalog Admin

1.3.2. Download a JSON key and save it as

  • ./credentials/datacatalog-tag-exporter-sa.json

This name is only a suggestion; feel free to follow your own naming conventions.

1.3.3. Set the environment variables

This step may be skipped if you're using Docker.

export GOOGLE_APPLICATION_CREDENTIALS=~/credentials/datacatalog-tag-exporter-sa.json
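
Before running the exporter, it can help to confirm that the variable points at a usable key file. The following is a minimal sketch, not part of the package: the `check_credentials` helper is hypothetical, and it only assumes that a downloaded service account key is a JSON object whose `type` field is `service_account`.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return (ok, message) describing whether the key file looks usable.

    Hypothetical helper for illustration; it is not provided by
    datacatalog-tag-exporter.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    # Expand a leading ~ in case the value was set without shell expansion.
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return False, f"key file not found: {path}"
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except (OSError, ValueError) as error:
        return False, f"key file is not readable JSON: {error}"
    # Service account keys carry a "type" field set to "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return False, "key file is not a service account key"
    return True, f"service account key found at {path}"


if __name__ == "__main__":
    ok, message = check_credentials()
    print(("OK: " if ok else "ERROR: ") + message)
```

The check only inspects the local file; it does not verify that the account has the Data Catalog Admin role.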

2. Export Tags to CSV file

2.1. A set of CSV files is created, one per Tag Template.

A summary file with statistics about each template is also created in the same directory.

The columns for the summary file are described as follows:

Column                      Description
template_name               Resource name of the Tag Template for the Tag.
tags_count                  Number of tags created from the template.
tagged_entries_count        Number of entries tagged with the template.
tagged_columns_count        Number of columns tagged with the template.
tag_string_fields_count     Number of String fields used in tags of the template.
tag_bool_fields_count       Number of Bool fields used in tags of the template.
tag_double_fields_count     Number of Double fields used in tags of the template.
tag_timestamp_fields_count  Number of Timestamp fields used in tags of the template.
tag_enum_fields_count       Number of Enum fields used in tags of the template.
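
As an illustration of how the summary file might be consumed, the sketch below totals every count column across templates. It assumes the file is comma-delimited with a header row that matches the column names listed above; the sample rows and the `summarize` helper are hypothetical and only stand in for a real export.

```python
import csv
import io

# Hypothetical sample data; only the column names come from the summary
# file description. A real export would be opened with open(path, newline="").
SAMPLE_SUMMARY = """\
template_name,tags_count,tagged_entries_count,tagged_columns_count,tag_string_fields_count,tag_bool_fields_count,tag_double_fields_count,tag_timestamp_fields_count,tag_enum_fields_count
projects/my-project/locations/us-central1/tagTemplates/my-template,3,2,1,4,1,0,0,2
projects/my-project/locations/us-central1/tagTemplates/my-template-2,5,5,0,5,0,5,0,0
"""


def summarize(csv_file):
    """Sum every *_count column across all templates in the summary file."""
    totals = {}
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            if column.endswith("_count"):
                totals[column] = totals.get(column, 0) + int(value)
    return totals


totals = summarize(io.StringIO(SAMPLE_SUMMARY))
for column, total in totals.items():
    print(f"{column}: {total}")
```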

The columns for each template file are described as follows:

Column                  Description
relative_resource_name  Full resource name of the asset the Entry refers to.
linked_resource         Full name of the asset the Entry refers to.
template_name           Resource name of the Tag Template for the Tag.
tag_name                Resource name of the Tag.
column                  Column of the Entry schema that the Tag is attached to, if any.
field_id                ID of the Tag field.
field_type              Type of the Tag field.
field_value             Value of the Tag field.
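
The field_id, field_type and field_value columns suggest that each row holds a single Tag field, so a Tag with several fields would span several rows. Under that assumption, the sketch below regroups rows into one record per Tag. The sample rows, the field type spellings and the `group_fields_by_tag` helper are hypothetical; only the column names come from the description above.

```python
import csv
import io

# Hypothetical sample rows, assuming one row per Tag field and a
# comma-delimited file with a header row.
SAMPLE_TEMPLATE_FILE = """\
relative_resource_name,linked_resource,template_name,tag_name,column,field_id,field_type,field_value
entry-1,resource-1,template-1,tag-1,,owner,STRING,data-team
entry-1,resource-1,template-1,tag-1,,has_pii,BOOL,True
entry-1,resource-1,template-1,tag-2,email,has_pii,BOOL,True
"""


def group_fields_by_tag(csv_file):
    """Collect the rows of a template file into one record per tag_name."""
    tags = {}
    for row in csv.DictReader(csv_file):
        tag = tags.setdefault(
            row["tag_name"], {"column": row["column"], "fields": {}}
        )
        # Values are kept as strings; converting them by field_type is left out.
        tag["fields"][row["field_id"]] = row["field_value"]
    return tags


tags = group_fields_by_tag(io.StringIO(SAMPLE_TEMPLATE_FILE))
for tag_name, tag in tags.items():
    print(tag_name, tag["column"] or "(entry level)", tag["fields"])
```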

2.2. Run the datacatalog-tag-exporter script

  • Python + virtualenv
datacatalog-tag-exporter tags export --project-ids my-project --dir-path DIR_PATH

2.2.1. Run the datacatalog-tag-exporter script, filtering by Tag Templates

  • Python + virtualenv
datacatalog-tag-exporter tags export --project-ids my-project \
--dir-path DIR_PATH \
--tag-templates-names projects/my-project/locations/us-central1/tagTemplates/my-template,\
projects/my-project/locations/us-central1/tagTemplates/my-template-2 
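
When the list of Tag Templates is long, assembling the comma-separated value by hand gets error-prone. The sketch below builds the same command line from Python, using only the subcommand and flags shown above. The `build_export_command` helper is hypothetical, and the project ID value is passed through unchanged because its accepted format is not described here.

```python
import shlex


def build_export_command(project_ids, dir_path, tag_template_names=()):
    """Build the argument list for a `tags export` run.

    Hypothetical helper; it only assembles the flags shown in this section.
    """
    command = [
        "datacatalog-tag-exporter", "tags", "export",
        "--project-ids", project_ids,
        "--dir-path", dir_path,
    ]
    if tag_template_names:
        # Template names are joined with commas and no surrounding spaces,
        # matching the shell example.
        command += ["--tag-templates-names", ",".join(tag_template_names)]
    return command


command = build_export_command(
    "my-project",
    "DIR_PATH",
    [
        "projects/my-project/locations/us-central1/tagTemplates/my-template",
        "projects/my-project/locations/us-central1/tagTemplates/my-template-2",
    ],
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once the package is installed and credentials are set.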

History

0.1.0 (2020-04-15)

  • First release on PyPI.

0.2.0 (2020-05-08)

  • Added option to export tags after creation date.
