Datacatalog Fileset Exporter

A Python package to manage Google Cloud Data Catalog Fileset export scripts.

Disclaimer: This is not an officially supported Google product.

Executing in Cloud Shell

# Set your service account credentials; for instructions, see 1.3. Auth credentials
# This file name is only a suggestion; feel free to follow your own naming conventions
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-fileset-exporter-sa.json

# Install datacatalog-fileset-exporter
pip3 install datacatalog-fileset-exporter --user

# Add to your PATH
export PATH=~/.local/bin:$PATH

# Look for available commands
datacatalog-fileset-exporter --help

1. Environment setup

1.1. Python + virtualenv

Using virtualenv is optional, but strongly recommended unless you use Docker.

1.1.1. Install Python 3.6+
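
A quick way to confirm that the interpreter meets the 3.6+ requirement is sketched below. The helper name `meets_minimum` is made up for this illustration and is not part of the package.

```python
import sys

# Minimum version stated in this README.
MIN_VERSION = (3, 6)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if meets_minimum():
    print("Python version is sufficient")
else:
    print("Python 3.6 or newer is required")
```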

1.1.2. Get the source code

git clone https://github.com/mesmacosta/datacatalog-fileset-exporter
cd ./datacatalog-fileset-exporter

All paths starting with ./ in the next steps are relative to the datacatalog-fileset-exporter folder.

1.1.3. Create and activate an isolated Python environment

pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate

1.1.4. Install the package

pip install --upgrade .

1.2. Docker

Docker may be used as an alternative way to run the script. In this case, disregard the virtualenv setup instructions.

1.3. Auth credentials

1.3.1. Create a service account and grant it the roles below

  • Data Catalog Admin

1.3.2. Download a JSON key and save it as

  • ./credentials/datacatalog-fileset-exporter-sa.json

This name is only a suggestion; feel free to follow your own naming conventions.

1.3.3. Set the environment variables

This step may be skipped if you're using Docker.

export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-fileset-exporter-sa.json
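
Before running the exporter, it can help to confirm that the variable points at a usable key file. The following is a minimal sketch, not part of the package: the function name `check_credentials_env` is hypothetical, and the check only relies on the fact that service account key files are JSON documents whose `type` field is `service_account`.

```python
import json
import os


def check_credentials_env(env=None):
    """Return None if the key file looks usable, else a short problem description."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    # Expand a leading "~" in case the shell did not already do so.
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return "key file not found: " + path
    try:
        with open(path) as handle:
            key = json.load(handle)
    except ValueError:
        return "key file is not valid JSON: " + path
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return "key file does not look like a service account key"
    return None


if __name__ == "__main__":
    print(check_credentials_env() or "credentials look usable")
```

This only inspects the file locally; it does not verify that the service account actually holds the Data Catalog Admin role.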

5. Export Filesets to CSV file

5.1. A CSV file representing the Filesets will be created

Each Fileset spans as many lines as required to represent all of its fields. The columns are described as follows:

| Column                    | Description                | Mandatory |
| ------------------------- | -------------------------- | --------- |
| entry_group_name          | Entry Group Name.          | Y         |
| entry_group_display_name  | Entry Group Display Name.  | Y         |
| entry_group_description   | Entry Group Description.   | Y         |
| entry_id                  | Entry ID.                  | Y         |
| entry_display_name        | Entry Display Name.        | Y         |
| entry_description         | Entry Description.         | Y         |
| entry_file_patterns       | Entry File Patterns.       | Y         |
| schema_column_name        | Schema column name.        | N         |
| schema_column_type        | Schema column type.        | N         |
| schema_column_description | Schema column description. | N         |
| schema_column_mode        | Schema column mode.        | N         |
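
Because one Fileset can occupy several lines, a consumer of the exported file usually needs to regroup the rows. The sketch below shows one way to do that with the standard `csv` module. It rests on assumptions the README does not confirm: that the file has a header row using the column names above, and that a blank `entry_group_name` or `entry_id` cell means "same as the previous line". The function name and the sample data are made up for illustration, and the sample uses only a subset of the columns.

```python
import csv
import io

KEY_COLUMNS = ("entry_group_name", "entry_id")


def group_rows_by_fileset(csv_text):
    """Group CSV rows into one record per (entry_group_name, entry_id)."""
    filesets = {}
    last_key = {column: "" for column in KEY_COLUMNS}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: a blank key cell repeats the value of the previous row.
        for column in KEY_COLUMNS:
            value = (row.get(column) or "").strip()
            if value:
                last_key[column] = value
        key = (last_key["entry_group_name"], last_key["entry_id"])
        fileset = filesets.setdefault(
            key,
            {"file_patterns": row.get("entry_file_patterns") or "", "schema": []},
        )
        # Schema columns are optional, so only record rows that carry one.
        if row.get("schema_column_name"):
            fileset["schema"].append(
                (row["schema_column_name"], row.get("schema_column_type") or "")
            )
    return filesets


# Made-up sample data, not real exporter output.
SAMPLE = (
    "entry_group_name,entry_id,entry_file_patterns,schema_column_name,schema_column_type\n"
    "sales_group,orders,gs://example-bucket/orders/*.csv,order_id,STRING\n"
    "sales_group,orders,gs://example-bucket/orders/*.csv,amount,FLOAT\n"
    "sales_group,refunds,gs://example-bucket/refunds/*.csv,,\n"
)

result = group_rows_by_fileset(SAMPLE)
for (group_name, entry_id), fileset in result.items():
    print(group_name, entry_id, len(fileset["schema"]), "schema column(s)")
```

For a real export, read the file from disk and pass its text to the function, adjusting the grouping rule if the actual layout differs from the assumptions above.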

5.2. Run the datacatalog-fileset-exporter script

  • Python + virtualenv
datacatalog-fileset-exporter filesets export --project-ids my-project --file-path CSV_FILE_PATH
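
When the export needs to run from a Python script (for example, a scheduled job), the same command can be assembled as an argument list. This is a sketch under stated assumptions: it uses only the subcommand and flags shown above, the helper names `build_export_command` and `run_export` are hypothetical, and `filesets.csv` is a placeholder output path.

```python
import shlex
import subprocess


def build_export_command(project_ids, file_path):
    """Build the CLI invocation shown above as an argument list."""
    return [
        "datacatalog-fileset-exporter",
        "filesets",
        "export",
        "--project-ids",
        project_ids,
        "--file-path",
        file_path,
    ]


def run_export(command):
    """Run the exporter; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


command = build_export_command("my-project", "filesets.csv")
# Print the command for review; call run_export(command) to execute it.
print(" ".join(shlex.quote(part) for part in command))
```

Passing a list instead of a single shell string avoids quoting problems when the file path contains spaces.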

History

0.1.0 (2020-04-28)

  • First release on PyPI.

0.2.0 (2020-05-08)

  • Add option to use filesets creation date.
