Auto-generate Redshift schemas from flat files
Redshift Auto Schema
Redshift Auto Schema is a Python library that takes a delimited flat file or Parquet file as input, parses it, and provides functions for creating and validating tables in Amazon Redshift. For each field, the appropriate Redshift data type is inferred from the contents of the file.
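To give a feel for what inferring a data type from file contents can involve, here is a minimal sketch. It is not the library's implementation: `infer_redshift_type` is a hypothetical helper, and the candidate types, date formats, and the `VARCHAR(256)` default are assumptions chosen for illustration.

```python
from datetime import datetime


def infer_redshift_type(values):
    """Guess a Redshift column type from a list of string values.

    Hypothetical helper for illustration; not part of redshift-auto-schema.
    """
    # Ignore missing and empty values, which usually stand for NULLs.
    non_empty = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not non_empty:
        return "VARCHAR(256)"  # arbitrary default when there is no evidence

    def all_match(parser):
        # True only if every value parses without raising ValueError.
        for v in non_empty:
            try:
                parser(v)
            except ValueError:
                return False
        return True

    # Try the narrowest types first and widen step by step.
    if all(v.lower() in ("true", "false") for v in non_empty):
        return "BOOLEAN"
    if all_match(int):
        return "BIGINT"
    if all_match(float):
        return "FLOAT8"
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "DATE"
    if all_match(lambda v: datetime.strptime(v, "%Y-%m-%d %H:%M:%S")):
        return "TIMESTAMP"

    # Fall back to a VARCHAR sized to the longest value, measured in bytes.
    longest = max(len(v.encode("utf-8")) for v in non_empty)
    return f"VARCHAR({longest})"
```

Calling `infer_redshift_type(["1", "2", ""])` yields `BIGINT`, while a column mixing numbers and text falls through to a `VARCHAR`. A real implementation would likely sample rows and handle more formats.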
Installation
Use the package manager pip to install Redshift Auto Schema.
pip install redshift-auto-schema
Usage
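from redshift_auto_schema import RedshiftAutoSchema
import psycopg2 as pg

# Connection parameters are omitted here; supply your own.
redshift_conn = pg.connect()

new_table = RedshiftAutoSchema(file='sample_file.parquet',
                               schema='test_schema',
                               table='test_table',
                               conn=redshift_conn)

# Create the table only if it does not already exist.
if not new_table.check_table_existence():
    ddl = new_table.generate_table_ddl()
    with redshift_conn.cursor() as redshift_cursor:
        redshift_cursor.execute(ddl)
    # psycopg2 does not autocommit by default, so commit the DDL.
    redshift_conn.commit()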
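The usage example above can be extended to create the schema as well. The sketch below assumes that `check_schema_existence` and `generate_schema_ddl` can be called without arguments, in the same way `check_table_existence` and `generate_table_ddl` are called above; that is an assumption, not something this README states. `ensure_schema_and_table` is a hypothetical wrapper, not part of the library.

```python
def ensure_schema_and_table(auto_schema, conn):
    """Create the schema and table if they are missing.

    `auto_schema` is expected to behave like a RedshiftAutoSchema instance
    and `conn` like a psycopg2 connection. Returns the DDL statements run.
    Assumes the schema methods take no arguments (not confirmed here).
    """
    executed = []
    with conn.cursor() as cursor:
        # The schema must exist before a table can be created inside it.
        if not auto_schema.check_schema_existence():
            ddl = auto_schema.generate_schema_ddl()
            cursor.execute(ddl)
            executed.append(ddl)
        if not auto_schema.check_table_existence():
            ddl = auto_schema.generate_table_ddl()
            cursor.execute(ddl)
            executed.append(ddl)
    # Commit once so both statements take effect together.
    conn.commit()
    return executed
```

With the objects from the usage example, this would be called as `ensure_schema_and_table(new_table, redshift_conn)`.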
Methods
NAME | DESCRIPTION |
---|---|
get_column_list | Returns column list based on header of file. |
check_schema_existence | Checks Redshift for the existence of a schema. |
check_table_existence | Checks Redshift for the existence of a table. |
generate_schema_ddl | Returns a SQL statement that creates a Redshift schema. |
generate_schema_permissions | Returns a SQL statement that grants schema usage to the default group. |
generate_table_ddl | Returns a SQL statement that creates a Redshift table. |
generate_column_ddl | Returns SQL statement(s) that add missing column(s) to a Redshift table. |
generate_table_permissions | Returns a SQL statement that grants table read access to the default group. |
evaluate_table_ddl_diffs | Returns a dataframe containing differences between generated and existing table DDL. |
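To illustrate the idea behind adding missing columns, the following sketch builds one `ALTER TABLE ... ADD COLUMN` statement per column that the file has but the table lacks. It does not reproduce the output of `generate_column_ddl`: `add_column_statements`, its parameters, and the exact statement format are hypothetical.

```python
def add_column_statements(schema, table, desired, existing):
    """Build ALTER TABLE statements for columns missing from the table.

    `desired` maps column name -> Redshift type (as inferred from a file);
    `existing` is an iterable of column names already in the table.
    Hypothetical helper for illustration; not the library's API.
    """
    # Compare names case-insensitively.
    present = {name.lower() for name in existing}
    return [
        f'ALTER TABLE "{schema}"."{table}" ADD COLUMN "{name}" {col_type};'
        for name, col_type in desired.items()
        if name.lower() not in present
    ]
```

For example, with `desired={"id": "BIGINT", "name": "VARCHAR(32)"}` and `existing=["ID"]`, only a statement for `name` is produced.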
Contributing
Pull requests are welcome.
License