Datacatalog Tag Exporter
A Python package to manage Google Cloud Data Catalog Tag export scripts.
Disclaimer: This is not an officially supported Google product.
Table of Contents
- Executing in Cloud Shell
- 1. Environment setup
- 2. Export Tags to CSV file
Executing in Cloud Shell
# Point to your service account key; for instructions, see 1.3. Auth credentials
# This file name is just a suggestion; feel free to follow your own naming conventions
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-tag-exporter-sa.json
# Install datacatalog-tag-exporter
pip3 install datacatalog-tag-exporter --user
# Add to your PATH
export PATH=~/.local/bin:$PATH
# Look for available commands
datacatalog-tag-exporter --help
1. Environment setup
1.1. Python + virtualenv
Using virtualenv is optional, but strongly recommended unless you use Docker.
1.1.1. Install Python 3.6+
1.1.2. Get the source code
git clone https://github.com/mesmacosta/datacatalog-tag-exporter
cd ./datacatalog-tag-exporter
All paths starting with ./ in the next steps are relative to the datacatalog-tag-exporter folder.
1.1.3. Create and activate an isolated Python environment
pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate
1.1.4. Install the package
pip install --upgrade .
1.2. Docker
Docker may be used as an alternative way to run the script. In this case, please disregard the virtualenv setup instructions.
1.3. Auth credentials
1.3.1. Create a service account and grant it the roles below
- Data Catalog Admin
1.3.2. Download a JSON key and save it as
./credentials/datacatalog-tag-exporter-sa.json
This name is just a suggestion; feel free to name it following your own naming conventions.
1.3.3. Set the environment variables
This step may be skipped if you're using Docker.
export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-tag-exporter-sa.json
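Before running an export, it can help to confirm that the variable points at a usable key file. The following is a minimal sketch, not part of this package: the helper name `check_service_account_key` is hypothetical, and it relies only on the `type` and `client_email` fields that Google Cloud service account JSON keys contain.

```python
import json
import os
from pathlib import Path


def check_service_account_key(path):
    """Return the service account e-mail if `path` looks like a JSON key file.

    Hypothetical helper for illustration; it is not provided by
    datacatalog-tag-exporter.
    """
    key_path = Path(path).expanduser()
    if not key_path.is_file():
        raise FileNotFoundError(f"No key file at {key_path}")
    with key_path.open(encoding="utf-8") as handle:
        key = json.load(handle)
    # Service account keys declare their type explicitly.
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} is not a service account key")
    return key["client_email"]


if __name__ == "__main__":
    configured = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if configured is None:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        try:
            print("Using service account:", check_service_account_key(configured))
        except (OSError, ValueError, KeyError) as error:
            print("Credentials problem:", error)
```

The check only inspects the file locally; it does not verify that the account holds the Data Catalog Admin role.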
2. Export Tags to CSV file
2.1. One CSV file is created for each Tag Template.
A summary file with statistics about each template is also created in the same directory.
The columns for the summary file are described as follows:
Column | Description |
---|---|
template_name | Resource name of the Tag Template for the Tag. |
tags_count | Number of tags found from the template. |
tagged_entries_count | Number of tagged entries with the template. |
tagged_columns_count | Number of tagged columns with the template. |
tag_string_fields_count | Number of used String fields on tags of the template. |
tag_bool_fields_count | Number of used Bool fields on tags of the template. |
tag_double_fields_count | Number of used Double fields on tags of the template. |
tag_timestamp_fields_count | Number of used Timestamp fields on tags of the template. |
tag_enum_fields_count | Number of used Enum fields on tags of the template. |
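As an illustration of how the summary file could be consumed, the sketch below sums a few of the count columns across all templates. It assumes the file is comma-separated with a header row matching the column names above; the function name `total_counts` and the sample rows are made up for the example.

```python
import csv
import io

# Columns from the summary table that hold integer counts.
COUNT_COLUMNS = ["tags_count", "tagged_entries_count", "tagged_columns_count"]


def total_counts(csv_lines):
    """Sum the selected count columns over every template row.

    `csv_lines` is any iterable of text lines, such as an open file.
    """
    totals = dict.fromkeys(COUNT_COLUMNS, 0)
    for row in csv.DictReader(csv_lines):
        for column in COUNT_COLUMNS:
            totals[column] += int(row[column])
    return totals


# Illustrative data only; a real summary file has more columns.
sample = io.StringIO(
    "template_name,tags_count,tagged_entries_count,tagged_columns_count\n"
    "projects/my-project/locations/us-central1/tagTemplates/my-template,3,2,1\n"
    "projects/my-project/locations/us-central1/tagTemplates/my-template-2,5,4,0\n"
)
print(total_counts(sample))
```

Note that an entry tagged with two templates is counted once per template, so the summed tagged_entries_count is not a count of distinct entries.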
The columns for each template file are described as follows:
Column | Description |
---|---|
relative_resource_name | Full resource name of the asset the Entry refers to. |
linked_resource | Full name of the asset the Entry refers to. |
template_name | Resource name of the Tag Template for the Tag. |
tag_name | Resource name of the Tag. |
column | Name of the column in the Entry schema that the Tag is attached to, if any. |
field_id | Id of the Tag field. |
field_type | Type of the Tag field. |
field_value | Value of the Tag field. |
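The columns above suggest that a template file holds one row per Tag field. Under that assumption (and assuming a comma-separated file with a matching header row), the sketch below regroups rows into one dictionary per Tag. The function name `group_fields_by_tag` and the sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_fields_by_tag(csv_lines):
    """Map each tag_name to a dict of field_id -> field_value.

    Assumes one CSV row per Tag field; values are kept as strings.
    """
    tags = defaultdict(dict)
    for row in csv.DictReader(csv_lines):
        tags[row["tag_name"]][row["field_id"]] = row["field_value"]
    return dict(tags)


# Illustrative rows only; the identifiers are placeholders, not real resource names.
sample = io.StringIO(
    "tag_name,column,field_id,field_type,field_value\n"
    "tag-1,,owner,STRING,data-team\n"
    "tag-1,,has_pii,BOOL,True\n"
    "tag-2,email,has_pii,BOOL,False\n"
)
for tag_name, fields in group_fields_by_tag(sample).items():
    print(tag_name, fields)
```

Values are left as strings here; converting them according to field_type would be a natural next step once the exact value formats are known.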
2.2. Run the datacatalog-tag-exporter script
- Python + virtualenv
datacatalog-tag-exporter tags export --project-ids my-project --dir-path DIR_PATH
2.2.1. Run the datacatalog-tag-exporter script, filtering by Tag Templates
- Python + virtualenv
datacatalog-tag-exporter tags export --project-ids my-project \
--dir-path DIR_PATH \
--tag-templates-names projects/my-project/locations/us-central1/tagTemplates/my-template,\
projects/my-project/locations/us-central1/tagTemplates/my-template-2
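When the list of templates is long, assembling the command in a script avoids line-continuation mistakes. This is a rough sketch that only reuses the flags shown above; the function name `build_export_command` is hypothetical and the script does not call the tool itself.

```python
import shlex


def build_export_command(project_id, dir_path, template_names=()):
    """Build the argument list for a `tags export` run.

    Template names are joined with commas, as in the command above.
    """
    command = [
        "datacatalog-tag-exporter", "tags", "export",
        "--project-ids", project_id,
        "--dir-path", dir_path,
    ]
    if template_names:
        command += ["--tag-templates-names", ",".join(template_names)]
    return command


templates = [
    "projects/my-project/locations/us-central1/tagTemplates/my-template",
    "projects/my-project/locations/us-central1/tagTemplates/my-template-2",
]
command = build_export_command("my-project", "DIR_PATH", templates)
# Print a shell-safe version of the command for review.
print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed to `subprocess.run(command, check=True)` to run the export from Python.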
History
0.1.0 (2020-04-15)
- First release on PyPI.
0.2.0 (2020-05-08)
- Added option to export tags after creation date.