Datacatalog Tag Exporter

A Python package to manage Google Cloud Data Catalog Tag export scripts.

Disclaimer: This is not an officially supported Google product.

Executing in Cloud Shell

# Set your service account credentials; for instructions, see 1.3. Auth credentials
# This file name is just a suggestion; feel free to follow your own naming conventions
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-tag-exporter-sa.json

# Install datacatalog-tag-exporter 
pip3 install datacatalog-tag-exporter --user

# Add to your PATH
export PATH=~/.local/bin:$PATH

# Look for available commands
datacatalog-tag-exporter --help

1. Environment setup

1.1. Python + virtualenv

Using virtualenv is optional, but strongly recommended unless you use Docker.

1.1.1. Install Python 3.6+

1.1.2. Get the source code

git clone https://github.com/mesmacosta/datacatalog-tag-exporter
cd ./datacatalog-tag-exporter

All paths starting with ./ in the next steps are relative to the datacatalog-tag-exporter folder.

1.1.3. Create and activate an isolated Python environment

pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate

1.1.4. Install the package

pip install --upgrade .
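
As an optional sanity check for steps 1.1.1 and 1.1.3, the short sketch below reports whether the running interpreter is Python 3.6+ and whether it appears to be inside an isolated environment. The function name `environment_report` is hypothetical and not part of this package.

```python
import sys


def environment_report(min_version=(3, 6)):
    """Report whether the interpreter meets the minimum version and runs in a virtualenv."""
    # Older virtualenv releases set sys.real_prefix; newer virtualenv and venv
    # make sys.prefix differ from sys.base_prefix instead.
    in_virtualenv = hasattr(sys, "real_prefix") or (
        sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    )
    return {
        "python_ok": sys.version_info >= min_version,
        "in_virtualenv": in_virtualenv,
    }


print(environment_report())
```

Run it with the same `python3` you used to create the environment, after `source ./env/bin/activate`.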

1.2. Docker

Docker may be used as an alternative to run the script. In this case, please disregard the Virtualenv setup instructions.

1.3. Auth credentials

1.3.1. Create a service account and grant it the roles below

  • Data Catalog Admin

1.3.2. Download a JSON key and save it as

  • ./credentials/datacatalog-tag-exporter-sa.json

This name is just a suggestion; feel free to follow your own naming conventions.

1.3.3. Set the environment variables

This step may be skipped if you're using Docker.

export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-tag-exporter-sa.json
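
Before running the exporter, it can help to confirm that the variable points at a readable service account key. The sketch below is only an illustration: the helper name `check_credentials` is hypothetical. It relies on the `type` and `client_email` fields that Google service account JSON keys contain.

```python
import json
import os


def check_credentials(path=None):
    """Return the service account email from a JSON key file, or raise ValueError."""
    # Fall back to the environment variable set in step 1.3.3.
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise ValueError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        raise ValueError("Key file not found: " + path)
    with open(path) as key_file:
        key = json.load(key_file)
    # Service account keys carry "type": "service_account".
    if key.get("type") != "service_account":
        raise ValueError("Not a service account key: " + path)
    return key.get("client_email")
```

Calling `check_credentials()` with no argument reads the path from the environment. It does not verify that the account has the Data Catalog Admin role.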

2. Export Tags to CSV file

2.1. A list of CSV files, each representing one Template, will be created.

A summary file with stats about each template will also be created in the same directory.

The columns for the summary file are described as follows:

Column                      Description
template_name               Resource name of the Tag Template for the Tag.
tags_count                  Number of tags that use the template.
tagged_entries_count        Number of entries tagged with the template.
tagged_columns_count        Number of columns tagged with the template.
tag_string_fields_count     Number of String fields used on tags of the template.
tag_bool_fields_count       Number of Bool fields used on tags of the template.
tag_double_fields_count     Number of Double fields used on tags of the template.
tag_timestamp_fields_count  Number of Timestamp fields used on tags of the template.
tag_enum_fields_count       Number of Enum fields used on tags of the template.
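
The summary columns above can be read with Python's standard `csv` module. This is a minimal sketch, assuming the summary file is comma-delimited with a header row that matches the column names listed here; the sample rows are made up for illustration and `load_summary` is a hypothetical helper.

```python
import csv
import io

# Every summary column except template_name holds a count.
COUNT_COLUMNS = [
    "tags_count",
    "tagged_entries_count",
    "tagged_columns_count",
    "tag_string_fields_count",
    "tag_bool_fields_count",
    "tag_double_fields_count",
    "tag_timestamp_fields_count",
    "tag_enum_fields_count",
]


def load_summary(csv_file):
    """Parse summary rows, converting the count columns to int."""
    rows = []
    for row in csv.DictReader(csv_file):
        for column in COUNT_COLUMNS:
            # Treat an empty cell as zero (an assumption about the file).
            row[column] = int(row[column] or 0)
        rows.append(row)
    return rows


# Made-up sample data; a real run would open the exported summary file instead.
SAMPLE_SUMMARY = (
    "template_name,tags_count,tagged_entries_count,tagged_columns_count,"
    "tag_string_fields_count,tag_bool_fields_count,tag_double_fields_count,"
    "tag_timestamp_fields_count,tag_enum_fields_count\n"
    "projects/my-project/locations/us-central1/tagTemplates/template_a,3,2,1,4,1,0,0,2\n"
    "projects/my-project/locations/us-central1/tagTemplates/template_b,2,2,0,1,0,1,0,0\n"
)

summary = load_summary(io.StringIO(SAMPLE_SUMMARY))
total_tags = sum(row["tags_count"] for row in summary)
print(total_tags)
```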

The columns for each template file are described as follows:

Column                  Description
relative_resource_name  Full resource name of the asset the Entry refers to.
linked_resource         Full name of the asset the Entry refers to.
template_name           Resource name of the Tag Template for the Tag.
tag_name                Resource name of the Tag.
column                  Column of the Entry schema that the Tag is attached to, if any.
field_id                Id of the Tag field.
field_type              Type of the Tag field.
field_value             Value of the Tag field.
2.2. Run the datacatalog-tag-exporter script

  • Python + virtualenv
datacatalog-tag-exporter tags export --project-ids my-project --dir-path DIR_PATH
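
To call the same command from a Python script, one option is to wrap it with `subprocess`. This sketch only uses the `tags export`, `--project-ids`, and `--dir-path` arguments shown above; the wrapper function names are hypothetical, and it assumes `datacatalog-tag-exporter` is on your `PATH`.

```python
import subprocess


def build_export_command(project_ids, dir_path):
    """Build the argument list for the export command shown above."""
    return [
        "datacatalog-tag-exporter",
        "tags",
        "export",
        "--project-ids",
        project_ids,
        "--dir-path",
        dir_path,
    ]


def run_export(project_ids, dir_path):
    """Run the exporter; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_export_command(project_ids, dir_path), check=True)


print(" ".join(build_export_command("my-project", "DIR_PATH")))
```

Passing the arguments as a list avoids shell quoting issues when the directory path contains spaces.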

History

0.1.0 (2020-04-15)

  • First release on PyPI.
