Datacatalog Fileset Exporter
A Python package to manage Google Cloud Data Catalog Fileset export scripts.
Disclaimer: This is not an officially supported Google product.
Table of Contents
- Executing in Cloud Shell
- 1. Environment setup
- 2. Export Filesets to CSV file
Executing in Cloud Shell
# Set your service account credentials; for instructions, see 1.3. Auth credentials
# This file name is just a suggestion; feel free to follow your own naming conventions
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-fileset-exporter-sa.json
# Install datacatalog-fileset-exporter
pip3 install datacatalog-fileset-exporter --user
# Add to your PATH
export PATH=~/.local/bin:$PATH
# Look for available commands
datacatalog-fileset-exporter --help
1. Environment setup
1.1. Python + virtualenv
Using virtualenv is optional, but strongly recommended unless you use Docker.
1.1.1. Install Python 3.6+
1.1.2. Get the source code
git clone https://github.com/mesmacosta/datacatalog-fileset-exporter
cd ./datacatalog-fileset-exporter
All paths starting with ./ in the next steps are relative to the datacatalog-fileset-exporter folder.
1.1.3. Create and activate an isolated Python environment
pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate
1.1.4. Install the package
pip install --upgrade .
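As a quick sanity check of the setup above, the following sketch reports whether the interpreter meets the Python 3.6+ requirement and whether it is running inside an isolated environment. The helper names `python_ok` and `in_virtualenv` are illustrative and not part of this package.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in step 1.1.1


def python_ok(version_info=sys.version_info):
    """Return True if the given version tuple meets the 3.6+ requirement."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def in_virtualenv():
    """Return True when running inside a virtualenv or venv environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("Inside virtualenv:", in_virtualenv())
```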
1.2. Docker
Docker may be used as an alternative way to run the script. In this case, disregard the virtualenv setup instructions.
1.3. Auth credentials
1.3.1. Create a service account and grant it the roles below
- Data Catalog Admin
1.3.2. Download a JSON key and save it as
./credentials/datacatalog-fileset-exporter-sa.json
This name is just a suggestion; feel free to follow your own naming conventions.
1.3.3. Set the environment variables
This step may be skipped if you're using Docker.
export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-fileset-exporter-sa.json
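Before running the exporter, it can help to confirm that the variable points at a readable service account key. The sketch below uses only the standard library and checks for the `type`, `client_email`, and `private_key` fields that service account key files normally contain. The function name `check_service_account_key` is hypothetical and not part of this package, and the check does not verify that the account actually holds the Data Catalog Admin role.

```python
import json
import os

# Fields normally present in a service account JSON key file.
REQUIRED_KEYS = ("type", "client_email", "private_key")


def check_service_account_key(path):
    """Return a list of problems found in the key file (empty if none)."""
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return ["file not found: " + path]
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except ValueError as error:  # covers invalid JSON and decoding errors
        return ["not valid JSON: " + str(error)]
    if not isinstance(data, dict):
        return ["key file does not contain a JSON object"]
    problems = ["missing key: " + key for key in REQUIRED_KEYS if key not in data]
    if data.get("type") not in (None, "service_account"):
        problems.append("unexpected type: " + str(data.get("type")))
    return problems


if __name__ == "__main__":
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not key_path:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        print(check_service_account_key(key_path) or "key file looks fine")
```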
2. Export Filesets to CSV file
2.1. A CSV file representing the Filesets will be created
Each Fileset spans as many lines as needed to represent all of its fields. The columns are described as follows:
Column | Description | Mandatory |
---|---|---|
entry_group_name | Entry Group Name. | Y |
entry_group_display_name | Entry Group Display Name. | Y |
entry_group_description | Entry Group Description. | Y |
entry_id | Entry ID. | Y |
entry_display_name | Entry Display Name. | Y |
entry_description | Entry Description. | Y |
entry_file_patterns | Entry File Patterns. | Y |
schema_column_name | Schema column name. | N |
schema_column_type | Schema column type. | N |
schema_column_description | Schema column description. | N |
schema_column_mode | Schema column mode. | N |
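Because one Fileset can span several lines, a consumer of the exported file typically has to regroup the rows. The following is a minimal sketch of that regrouping with the standard `csv` module, using the column names from the table above. It assumes that rows belonging to the same Fileset either repeat `entry_group_name` and `entry_id` or leave them blank, in which case the values from the previous row are carried forward; the exporter's actual layout may differ. The function name `group_filesets` is hypothetical.

```python
import csv
import io

ENTRY_COLUMNS = (
    "entry_group_name", "entry_group_display_name", "entry_group_description",
    "entry_id", "entry_display_name", "entry_description", "entry_file_patterns",
)
SCHEMA_COLUMNS = (
    "schema_column_name", "schema_column_type",
    "schema_column_description", "schema_column_mode",
)


def group_filesets(csv_file):
    """Group CSV rows into one record per (entry_group_name, entry_id)."""
    filesets = {}
    last_key = {"entry_group_name": "", "entry_id": ""}
    for row in csv.DictReader(csv_file):
        # Assumption: blank key cells inherit the value from the previous row.
        for column in last_key:
            value = (row.get(column) or "").strip()
            if value:
                last_key[column] = value
        key = (last_key["entry_group_name"], last_key["entry_id"])
        fileset = filesets.setdefault(key, {"fields": {}, "schema": []})
        # Keep the first non-empty value seen for each entry-level column.
        for column in ENTRY_COLUMNS:
            value = (row.get(column) or "").strip()
            if value:
                fileset["fields"].setdefault(column, value)
        # Schema columns are optional; record a row only when it names a column.
        schema = {c: (row.get(c) or "").strip() for c in SCHEMA_COLUMNS}
        if schema["schema_column_name"]:
            fileset["schema"].append(schema)
    return filesets
```

To use it on an exported file, open the file with `newline=""` as the `csv` module recommends and pass the file object to `group_filesets`.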
2.2. Run the datacatalog-fileset-exporter script
- Python + virtualenv
datacatalog-fileset-exporter filesets export --project-ids my-project --file-path CSV_FILE_PATH
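The same command can be driven from a Python script, for example as part of a scheduled job. The sketch below only wraps the command line shown above with `subprocess`; it does not call any internal API of the package. The helper names `build_export_command` and `run_export` are hypothetical, and the project ID is passed through as a single string exactly as on the command line.

```python
import shutil
import subprocess

CLI_NAME = "datacatalog-fileset-exporter"


def build_export_command(project_ids, file_path):
    """Build the argument list for the export command shown above."""
    return [
        CLI_NAME, "filesets", "export",
        "--project-ids", project_ids,
        "--file-path", file_path,
    ]


def run_export(project_ids, file_path):
    """Run the export; raises if the CLI is missing or exits non-zero."""
    if shutil.which(CLI_NAME) is None:
        raise FileNotFoundError(CLI_NAME + " was not found on PATH")
    subprocess.run(build_export_command(project_ids, file_path), check=True)
```

Calling `run_export("my-project", "filesets.csv")` would then be equivalent to the shell command, with `filesets.csv` standing in for CSV_FILE_PATH.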
History
0.1.0 (2020-04-28)
- First release on PyPI.
0.2.0 (2020-05-08)
- Add option to use filesets creation date.