Simple DDL Parser
Parser tested on DDL for PostgreSQL & Hive.
If you have samples that cause an error, please open an issue and I will be glad to fix it.
This parser takes SQL DDL statements or files as input, for example:
create table prod.super_table
(
data_sync_id bigint not null,
sync_count bigint not null,
sync_mark timestamp not null,
sync_start timestamp not null,
sync_end timestamp not null,
message varchar(2000) null,
primary key (data_sync_id, sync_start)
);
and produces output like this (information about the table name, schema, columns, types and properties):
[{
'columns': [
{'name': 'data_sync_id', 'type': 'bigint', 'mode': False, 'size': None},
{'name': 'sync_count', 'type': 'bigint', 'mode': False, 'size': None},
{'name': 'sync_mark', 'type': 'timestamp', 'mode': False, 'size': None},
{'name': 'sync_start', 'type': 'timestamp', 'mode': False, 'size': None},
{'name': 'sync_end', 'type': 'timestamp', 'mode': False, 'size': None},
{'name': 'message', 'type': 'varchar', 'mode': False, 'size': 2000}],
'table_name': 'super_table', 'schema': 'prod',
'primary_key': ['data_sync_id', 'sync_start']
}]
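The parse result is a plain list of dicts, so it can be consumed directly. A minimal sketch (the `describe` helper is illustrative and not part of the library; the `result` literal is an abridged copy of the output above):

```python
# Abridged parse result, copied from the example output above.
result = [{
    'columns': [
        {'name': 'data_sync_id', 'type': 'bigint', 'mode': False, 'size': None},
        {'name': 'message', 'type': 'varchar', 'mode': False, 'size': 2000},
    ],
    'table_name': 'super_table', 'schema': 'prod',
    'primary_key': ['data_sync_id', 'sync_start'],
}]

def describe(table: dict) -> str:
    """Render a one-line summary of a parsed table dict (illustrative helper)."""
    name = f"{table['schema']}.{table['table_name']}" if table['schema'] else table['table_name']
    pk = ", ".join(table['primary_key'])
    return f"{name} ({len(table['columns'])} columns, pk: {pk})"

print(describe(result[0]))
# -> prod.super_table (2 columns, pk: data_sync_id, sync_start)
```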
Or one more example:
CREATE TABLE "paths" (
"id" int PRIMARY KEY,
"title" varchar NOT NULL,
"description" varchar(160),
"created_at" timestamp,
"updated_at" timestamp
);
and the result:
[{
'columns': [
{'name': 'id', 'type': 'int', 'nullable': False, 'size': None},
{'name': 'title', 'type': 'varchar', 'nullable': False, 'size': None},
{'name': 'description', 'type': 'varchar', 'nullable': False, 'size': 160},
{'name': 'created_at', 'type': 'timestamp', 'nullable': False, 'size': None},
{'name': 'updated_at', 'type': 'timestamp', 'nullable': False, 'size': None}],
'primary_key': ['id'],
'table_name': 'paths', 'schema': ''
}]
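Because each column carries its SQL type, the result is convenient for code generation, e.g. mapping parsed columns onto Python type annotations. A sketch under the assumption of a small hand-written type table (the `SQL_TO_PY` mapping is a hypothetical example, not part of the library):

```python
# Hypothetical SQL-type -> Python-type mapping for code generation.
SQL_TO_PY = {'int': 'int', 'bigint': 'int', 'varchar': 'str', 'timestamp': 'datetime'}

# Two columns copied from the parse result shown above.
columns = [
    {'name': 'id', 'type': 'int', 'nullable': False, 'size': None},
    {'name': 'description', 'type': 'varchar', 'nullable': False, 'size': 160},
]

# Build annotation strings, falling back to 'str' for unknown SQL types.
fields = [f"{c['name']}: {SQL_TO_PY.get(c['type'], 'str')}" for c in columns]
print(fields)  # -> ['id: int', 'description: str']
```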
If you pass a file or text block with more than one CREATE TABLE statement, the result will be a list of such dicts. For example:
Input
CREATE TABLE "countries" (
"id" int PRIMARY KEY,
"code" varchar(4) NOT NULL,
"name" varchar NOT NULL
);
CREATE TABLE "path_owners" (
"user_id" int,
"path_id" int,
"type" int DEFAULT 1
);
Output:
[
{'columns': [
{'name': 'id', 'type': 'int', 'size': None, 'nullable': False, 'default': None},
{'name': 'code', 'type': 'varchar', 'size': 4, 'nullable': False, 'default': None},
{'name': 'name', 'type': 'varchar', 'size': None, 'nullable': False, 'default': None}],
'primary_key': ['id'],
'table_name': 'countries',
'schema': None},
{'columns': [
{'name': 'user_id', 'type': 'int', 'size': None, 'nullable': False, 'default': None},
{'name': 'path_id', 'type': 'int', 'size': None, 'nullable': False, 'default': None},
{'name': 'type', 'type': 'int', 'size': None, 'nullable': False, 'default': 1}],
'primary_key': [],
'table_name': 'path_owners',
'schema': None}
]
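Since a multi-statement parse yields one dict per table, it is easy to index the result by table name. A minimal sketch over an abridged copy of the output above (the indexing code is illustrative, not part of the library):

```python
# Abridged multi-table parse result, as shown above.
parsed = [
    {'columns': [{'name': 'id'}, {'name': 'code'}, {'name': 'name'}],
     'primary_key': ['id'], 'table_name': 'countries', 'schema': None},
    {'columns': [{'name': 'user_id'}, {'name': 'path_id'}, {'name': 'type'}],
     'primary_key': [], 'table_name': 'path_owners', 'schema': None},
]

# Build a {table_name: [column names]} index over the result list.
tables = {t['table_name']: [c['name'] for c in t['columns']] for t in parsed}
print(tables['path_owners'])  # -> ['user_id', 'path_id', 'type']
```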
How to use
From python code
from simple_ddl_parser import DDLParser
parse_results = DDLParser("""create table dev.data_sync_history(
data_sync_id bigint not null,
sync_count bigint not null,
sync_mark timestamp not null,
sync_start timestamp not null,
sync_end timestamp not null,
message varchar(2000) null,
primary key (data_sync_id, sync_start)
); """).run()
print(parse_results)
To parse from file
from simple_ddl_parser import parse_from_file
result = parse_from_file('tests/test_one_statement.sql')
print(result)
More examples & tests
You can find them in the tests/functional folder.
Dump result in json
To dump the result in JSON, use the argument .run(dump=True)
You can also provide a path where the schema dumps should be written, via a separate argument.
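A JSON dump is essentially just serialization of the result list. An equivalent manual sketch with the standard library (the `schemas/` directory layout and one-file-per-table convention here are illustrative assumptions, not the library's documented behavior):

```python
import json
from pathlib import Path

# Abridged parse result, as in the examples above.
parse_results = [{'table_name': 'paths', 'schema': '', 'primary_key': ['id'],
                  'columns': [{'name': 'id', 'type': 'int', 'nullable': False, 'size': None}]}]

# Write each table's schema to its own JSON file (illustrative layout).
out_dir = Path('schemas')
out_dir.mkdir(exist_ok=True)
for table in parse_results:
    (out_dir / f"{table['table_name']}.json").write_text(json.dumps(table, indent=2))
```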
TODO in next Releases
Support for references (foreign keys) in column definitions
Support for separate ALTER TABLE statements for foreign keys, like:
ALTER TABLE "material_attachments" ADD FOREIGN KEY ("material_id") REFERENCES "materials" ("id");
Support for parsing CREATE INDEX statements
Add a command-line interface
Historical context
This library is parser code extracted from https://github.com/xnuinside/fakeme (a library for fake relational data generation that I used in several work projects, but did not have time to turn into a proper open source library).
For one of my work projects I needed to convert SQL DDL to Python ORM models automatically, and I tried https://github.com/andialbrecht/sqlparse, but it did not work well enough with DDL for my case (for example, nothing worked if the DDL used lower case, and primary keys inside the DDL were mapped as column names rather than reserved words, etc.). So I remembered the parser in Fakeme and simply extracted and improved it.
How to contribute
Please describe the issue that you want to solve and open a PR; I will review it as soon as possible.
Any questions? Ping me in Telegram: https://t.me/xnuinside
Hashes for simple-ddl-parser-0.2.1a0.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8ad705ebf546a418af1c99be44746bb1b61a1ee6e97d50e07f233c156ab9937f |
| MD5 | d0c15e8592ce1c2bae29dc02e35ad737 |
| BLAKE2b-256 | 03a7ff4fe0237e6856ebd7feabec57da1c09b27659f9fc68df2d6534e8f3507b |
Hashes for simple_ddl_parser-0.2.1a0-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 345119acc67debf2149df9bfdba2f37cc8f7bf83c6f1020b4e20e8b5f9916fc0 |
| MD5 | d988ac4d664e71729c4435bff234a56b |
| BLAKE2b-256 | f530dcf14b7c89b2b29e90954d3f2c0a151fd3798476ada2c9ce59ed40573654 |