
DDL parse and convert to BigQuery JSON schema

Project description

DDL Parse


A Python module that parses DDL and converts it to a BigQuery JSON schema and BigQuery DDL statements.


Features

  • Parses DDL and extracts table schema information.
  • Currently, only the CREATE TABLE statement is supported.
  • Converts the parsed schema to a BigQuery JSON schema and BigQuery DDL statements (a minimal sketch follows this list).
  • Supported source databases: MySQL/MariaDB, PostgreSQL, Oracle, Redshift.
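
A minimal sketch of the conversion workflow, using the same API documented under Usage below:

from ddlparse import DdlParse

# Parse a CREATE TABLE statement and print the BigQuery JSON schema
table = DdlParse().parse("CREATE TABLE Sample (Id integer PRIMARY KEY, Name varchar(100) NOT NULL)")
print(table.to_bigquery_fields())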

Requirements

  1. Python >= 3.5
  2. pyparsing

Installation

Install

pip install:

$ pip install ddlparse

or install from source:

$ python setup.py install

Update

pip update:

$ pip install ddlparse --upgrade

Usage

Example

import json

from ddlparse import DdlParse

sample_ddl = """
CREATE TABLE My_Schema.Sample_Table (
  Id integer PRIMARY KEY COMMENT 'User ID',
  Name varchar(100) NOT NULL COMMENT 'User name',
  Total bigint NOT NULL,
  Avg decimal(5,1) NOT NULL,
  Point int(10) unsigned,
  Zerofill_Id integer unsigned zerofill NOT NULL,
  Created_At date, -- Oracle 'DATE' -> BigQuery 'DATETIME'
  UNIQUE (NAME)
);
"""


# parse pattern (1-1)
table = DdlParse().parse(sample_ddl)

# parse pattern (1-2) : Specify source database
table = DdlParse().parse(ddl=sample_ddl, source_database=DdlParse.DATABASE.oracle)


# parse pattern (2-1)
parser = DdlParse(sample_ddl)
table = parser.parse()

print("* BigQuery Fields * : normal")
print(table.to_bigquery_fields())


# parse pattern (2-2) : Specify source database
parser = DdlParse(ddl=sample_ddl, source_database=DdlParse.DATABASE.oracle)
table = parser.parse()


# parse pattern (3-1)
parser = DdlParse()
parser.ddl = sample_ddl
table = parser.parse()

# parse pattern (3-2) : Specify source database
parser = DdlParse()
parser.source_database = DdlParse.DATABASE.oracle
parser.ddl = sample_ddl
table = parser.parse()

print("* BigQuery Fields * : Oracle")
print(table.to_bigquery_fields())


print("* TABLE *")
print("schema = {} : name = {} : is_temp = {}".format(table.schema, table.name, table.is_temp))

print("* BigQuery Fields *")
print(table.to_bigquery_fields())

print("* BigQuery Fields - column name to lower case / upper case *")
print(table.to_bigquery_fields(DdlParse.NAME_CASE.lower))
print(table.to_bigquery_fields(DdlParse.NAME_CASE.upper))

print("* COLUMN *")
for col in table.columns.values():
    col_info = {}

    col_info["name"]                  = col.name
    col_info["data_type"]             = col.data_type
    col_info["length"]                = col.length
    col_info["precision(=length)"]    = col.precision
    col_info["scale"]                 = col.scale
    col_info["is_unsigned"]           = col.is_unsigned
    col_info["is_zerofill"]           = col.is_zerofill
    col_info["constraint"]            = col.constraint
    col_info["not_null"]              = col.not_null
    col_info["PK"]                    = col.primary_key
    col_info["unique"]                = col.unique
    col_info["auto_increment"]        = col.auto_increment
    col_info["distkey"]               = col.distkey
    col_info["sortkey"]               = col.sortkey
    col_info["encode"]                = col.encode
    col_info["default"]               = col.default
    col_info["character_set"]         = col.character_set
    col_info["bq_legacy_data_type"]   = col.bigquery_legacy_data_type
    col_info["bq_standard_data_type"] = col.bigquery_standard_data_type
    col_info["comment"]               = col.comment
    col_info["description(=comment)"] = col.description
    col_info["bigquery_field"]        = json.loads(col.to_bigquery_field())

    print(json.dumps(col_info, indent=2, ensure_ascii=False))

print("* DDL (CREATE TABLE) statements *")
print(table.to_bigquery_ddl())

print("* DDL (CREATE TABLE) statements - dataset name, table name and column name to lower case / upper case *")
print(table.to_bigquery_ddl(DdlParse.NAME_CASE.lower))
print(table.to_bigquery_ddl(DdlParse.NAME_CASE.upper))

print("* Get Column object (case insensitive) *")
print(table.columns["total"])
print(table.columns["total"].data_type)
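
Because to_bigquery_fields() returns the schema as a JSON string, it can be written straight to a schema file and handed to the bq command-line tool. A minimal sketch (the dataset, table, and file names here are hypothetical):

# Write the generated BigQuery JSON schema to a file
with open("schema.json", "w") as f:
    f.write(table.to_bigquery_fields())

# Then create the table with the bq CLI, for example:
#   $ bq mk --table my_dataset.sample_table schema.json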

License

BSD 3-Clause License

Author

Shinichi Takii shinichi.takii@shaketh.com

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ddlparse-1.10.0.tar.gz (21.4 kB)


Built Distribution

ddlparse-1.10.0-py3-none-any.whl (10.4 kB)


File details

Details for the file ddlparse-1.10.0.tar.gz.

File metadata

  • Download URL: ddlparse-1.10.0.tar.gz
  • Upload date:
  • Size: 21.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.3

File hashes

Hashes for ddlparse-1.10.0.tar.gz

Algorithm    Hash digest
SHA256       6418681baa848eb01251ab79eb3d0ad7e140e6ab1deaae5a019353ddb3a908da
MD5          fc58e30162d5540e0a696094227487bb
BLAKE2b-256  1b3ca9ece2adfd4ee8f5030669804df58532d5b8934e955e1399e11379af535e

See more details on using hashes in the pip documentation.
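
To verify a downloaded file against these digests yourself, Python's standard hashlib is enough; a small sketch checking the published SHA256 value above:

import hashlib

# Hash the downloaded sdist and compare with the published SHA256 digest
with open("ddlparse-1.10.0.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print(digest == "6418681baa848eb01251ab79eb3d0ad7e140e6ab1deaae5a019353ddb3a908da")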

File details

Details for the file ddlparse-1.10.0-py3-none-any.whl.

File metadata

  • Download URL: ddlparse-1.10.0-py3-none-any.whl
  • Upload date:
  • Size: 10.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.3

File hashes

Hashes for ddlparse-1.10.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       71761b3457c8720853af3aeef266e2da1b6edef50936969492d586d7046a2ac2
MD5          87803d242f4a4789b583c5b401d14827
BLAKE2b-256  bc62dd55bb59bacfb1f6551e04568863492be6b7a079c707fccbfde82de97f95

See more details on using hashes in the pip documentation.
