CUREd+ metadata tool: generates a list of all the columns in every table in the database.
CUREd+ metadata generator
The CUREd+ metadata generator lists every column of every table in the database by inspecting the Parquet files stored in a bucket on AWS S3 object storage.
The data in the target bucket must be arranged in the following directory structure: <data_set_id>/<table_id>/data/*.parquet
This script will generate a CSV file with the following columns:
data_set_id, table_id, column_name, data_type
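The directory layout and the output columns fit together: the first two components of each object key give data_set_id and table_id, and each column found in a table contributes one row. The following is a minimal sketch of that mapping using only the Python standard library. The helper names `parse_key` and `write_rows` are hypothetical and are not part of curedcolumns, the header row is an assumption about the output, and the column name and type in the usage lines are made-up sample values.

```python
import csv
import io
from pathlib import PurePosixPath

FIELDNAMES = ["data_set_id", "table_id", "column_name", "data_type"]


def parse_key(key):
    """Split an object key laid out as <data_set_id>/<table_id>/data/<file>.parquet.

    Returns (data_set_id, table_id), or None if the key does not match the layout.
    """
    parts = PurePosixPath(key).parts
    if len(parts) == 4 and parts[2] == "data" and parts[3].endswith(".parquet"):
        return parts[0], parts[1]
    return None


def write_rows(rows, stream, delimiter=","):
    """Write (data_set_id, table_id, column_name, data_type) tuples as CSV.

    Writing a header row first is an assumption made for this sketch.
    """
    writer = csv.writer(stream, delimiter=delimiter, lineterminator="\n")
    writer.writerow(FIELDNAMES)
    writer.writerows(rows)


# Usage with made-up sample values.
ids = parse_key("my_data_set/my_table/data/part-0.parquet")
buffer = io.StringIO()
write_rows([ids + ("patient_id", "int64")], buffer)
print(buffer.getvalue())
```

Keys that do not follow the layout return None, so they can be skipped rather than producing malformed rows.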
Installation
Ensure Python is installed.
Install the AWS command-line interface (CLI).
Configure your access key using the aws configure command.
Install this package using the Python package manager:
pip install curedcolumns
Usage
In the basic case, specify the AWS CLI profile to use and the bucket you want to inspect.
curedcolumns --profile $AWS_PROFILE $AWS_BUCKET --output $OUTPUT_FILE
You should create an AWS profile using the aws configure command.
aws configure --profile $AWS_PROFILE
To view the command line options:
$ curedcolumns --help
usage: curedcolumns [-h] [-v] [--version] [-l LOGLEVEL] [--prefix PREFIX] [--no-sign-request] [--profile PROFILE] [-d DELIMITER] [-o OUTPUT] [-f] bucket
List all the field names for all the data sets in a bucket on AWS S3 object storage and display the metadata in CSV format. This assumes a folder structure in this layout: <data_set_id>/<table_id>/data/*.parquet
positional arguments:
  bucket                S3 bucket location URI

options:
  -h, --help            show this help message and exit
  -v, --verbose
  --version             Show the version number of this tool
  -l LOGLEVEL, --loglevel LOGLEVEL
  --prefix PREFIX       Limits the response to keys that begin with the specified prefix.
  --no-sign-request
  --profile PROFILE     AWS profile to use
  -d DELIMITER, --delimiter DELIMITER
                        Column separator character
  -o OUTPUT, --output OUTPUT
                        Output file path. Default: screen
  -f, --force           Overwrite output file if it already exists
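When the tool is called from a script, the options above can be assembled programmatically. The sketch below builds an argument list using only flags that appear in the help text. The helper `build_command` is hypothetical and is not shipped with curedcolumns, and the output file name `columns.csv` is a made-up example.

```python
import shlex


def build_command(bucket, profile=None, prefix=None, output=None,
                  delimiter=None, force=False, no_sign_request=False):
    """Assemble a curedcolumns argument list from the documented options."""
    args = ["curedcolumns"]
    if profile:
        args += ["--profile", profile]
    if prefix:
        args += ["--prefix", prefix]
    if no_sign_request:
        args.append("--no-sign-request")
    if delimiter:
        args += ["--delimiter", delimiter]
    if output:
        args += ["--output", output]
    if force:
        args.append("--force")
    # The bucket is the single positional argument and goes last.
    args.append(bucket)
    return args


command = build_command("s3://my_bucket.aws.com", profile="clean",
                        output="columns.csv", force=True)
# Show the command as it would be typed in a shell.
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a prefix or file name contains spaces.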
Example
Use the AWS CLI profile named "clean"
curedcolumns --profile clean s3://my_bucket.aws.com
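Once the output has been saved with --output, it can be read back with any CSV reader. The following is a minimal sketch that groups column names by table. It assumes the file starts with a header row naming the four columns listed earlier, which this document does not confirm; the function name `columns_by_table` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def columns_by_table(stream, delimiter=","):
    """Group column names by (data_set_id, table_id).

    Assumes the first row is a header naming the four documented columns.
    """
    tables = defaultdict(list)
    for row in csv.DictReader(stream, delimiter=delimiter):
        tables[(row["data_set_id"], row["table_id"])].append(row["column_name"])
    return dict(tables)


# Made-up sample standing in for a real output file.
sample = io.StringIO(
    "data_set_id,table_id,column_name,data_type\n"
    "ds1,visits,visit_id,int64\n"
    "ds1,visits,arrival_time,timestamp\n"
)
print(columns_by_table(sample))
```

To read a real file, open it with `open(path, newline="")` and pass the file object in place of `sample`. If a different separator was chosen with --delimiter, pass the same character as `delimiter`.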
Development
See CONTRIBUTING.md.