Skip to main content

Generates SQL DDL that will accept Python data

Project description

Infers SQL DDL (Data Definition Language) from table data.

Use at command line:

$ ddlgenerator -i postgresql '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

DROP TABLE generated_table;
CREATE TABLE generated_table (
        name VARCHAR(6) NOT NULL,
        kg INTEGER NOT NULL,
        species VARCHAR(8) NOT NULL
)
;
INSERT INTO generated_table (kg, Name, species) VALUES (22, 'Alfred', 'wart hog');

Reads data from files:

$ ddlgenerator postgresql mydata.yaml > mytable.sql

Enables one-line creation of tables with their data

$ ddlgenerator –inserts postgresql mydata.json | psql

To use in Python:

>>> from ddlgenerator.ddlgenerator import Table
>>> table = Table([{"Name": "Alfred", "species": "wart hog", "kg": 22}])
>>> sql = table.sql('postgresql', inserts=True)

Supported data formats

  • Pure Python

  • YAML

  • JSON

  • CSV

  • Pickle

  • HTML

Features

  • Supports all SQL dialects supported by SQLAlchemy

  • Coerces data into most specific data type valid on all column’s values

  • Takes table name from file name

  • Guesses format of input data if unspecified by file extension

  • with -i/--inserts flag, adds INSERT statements

  • with -u/--uniques flag, surmises UNIQUE constraints from data

  • Handles nested data, creating child tables as needed

  • Reads HTML tables, including those embedded in noisy websites

Options

-h, --help            show this help message and exit
-k KEY, --key KEY     Field to use as primary key
-r, --reorder         Reorder fields alphabetically, ``key`` first
-u, --uniques         Include UNIQUE constraints where data is unique
-t, --text            Use variable-length TEXT columns instead of VARCHAR
-d, --drops           Include DROP TABLE statements
-i, --inserts         Include INSERT statements
--no-creates          Do not include CREATE TABLE statements
--save-metadata-to FILENAME
                      Save table definition in FILENAME for later --use-
                      saved-metadata run
--use-metadata-from FILENAME
                      Use metadata saved in FROM for table definition, do
                      not re-analyze table structure
-l LOG, --log LOG     log level (CRITICAL, FATAL, ERROR, DEBUG, INFO, WARN)

Generate SQLAlchemy models

Use sqlalchemy as the model to generate Python for defining SQLAlchemy models:

$ ddlgenerator sqlalchemy '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

Table0 = Table('Table0', metadata,
  Column('species', Unicode(length=8), nullable=False),
  Column('kg', Integer(), nullable=False),
  Column('name', Unicode(length=6), nullable=False),
  schema=None)

Generate Django models

If Django is installed on the path then using django as the model will run the generated ddl through Django’s inspectdb management command to produce a model file:

$ ddlgenerator django '[{"Name": "Alfred", "species": "wart hog", "kg": 22}]'

# This is an auto-generated Django model module.
# You'll have to do the following manually to clean this up:
#   * Rearrange models' order
#   * Make sure each model has one field with primary_key=True
#   * Remove `managed = False` lines if you wish to allow Django to create and delete the table
# Feel free to rename the models, but don't rename db_table values or field names.
#
# Also note: You'll have to insert the output of 'django-admin.py sqlcustom [appname]'
# into your database.
from __future__ import unicode_literals

from django.db import models

class Table0(models.Model):
    species = models.CharField(max_length=8)
    kg = models.IntegerField()
    name = models.CharField(max_length=6)
    class Meta:
        managed = False
        db_table = 'Table0'

Large tables

As of now, ddlgenerator is not well-designed for table sizes approaching your system’s available memory.

One approach to save time and memory for large tables is to break your input data into multiple files, then run ddlgenerator with --save-metadata against a small but representative sample. Then run with --no-creates and -use-saved-metadata to generate INSERTs from the remaining files without needing to re-determine the column types each time.

Installing

Requires Python3.

From PyPI:

pip3 install ddlgenerator

From source:

git clone https://github.com/catherinedevlin/ddl-generator.git
cd ddl-generator
pip3 install .

Alternatives

Credits

  • Mike Bayer for sqlalchemy

  • coldfix and Mark Ransom for their StackOverflow answers

  • Audrey Roy for cookiecutter

  • Brandon Lorenz for Django model generation

History

0.1.0 (2014-03-22)

  • First release on PyPI.

0.1.2 (2014-07-15)

  • data_dispenser moved to separate module

0.1.3 (2014-07-16)

  • Bugfix for long integers found after short strings

0.1.4 (2014-07-25)

  • Fixed bug: external data_dispenser unused by 0.1.3!

0.1.5 (2014-07-25)

  • sqlalchemy pseudo-dialect added

0.1.6 (2014-07-25)

  • Generate sqlalchemy inserts

0.1.7 (2014-09-14)

  • Read via HTTP

  • Support HTML format

  • Generate Django models

0.1.7.1 (2014-09-14)

  • Require data-dispenser 0.2.3

0.1.7.3 (2014-10-19)

  • Require all formerly recommended dependencies, for simplicity

  • Several bugfixes for complex number-like fields

0.1.8 (2015-02-01)

  • UNIQUE contstraints handled properly in sqlalchemy output

0.1.8.2 (2015-02-05)

  • Cleaner SQLAlchemy generation for fixtures

0.1.9 (2015-02-10)

  • README fixes from Anatoly Technonik, Mikhail Podgurskiy

  • Parse args passed to generate(args, namespace) for non-command-line use

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ddlgenerator-0.1.9.tar.gz (26.1 kB view details)

Uploaded Source

Built Distribution

ddlgenerator-0.1.9-py3.4.egg (49.5 kB view details)

Uploaded Source

File details

Details for the file ddlgenerator-0.1.9.tar.gz.

File metadata

File hashes

Hashes for ddlgenerator-0.1.9.tar.gz
Algorithm Hash digest
SHA256 b34a571c5ef79ff27cf84205d8ef2252058d804142aa678a92b0ebe2e6f6dfc4
MD5 9f0bc3ebecbd7288b8e58a727c8c9413
BLAKE2b-256 d1be120e81fc9e5b5ec56dd8ab8a743d05937e2eddb3ec8445807cb1722094f1

See more details on using hashes here.

File details

Details for the file ddlgenerator-0.1.9-py3.4.egg.

File metadata

File hashes

Hashes for ddlgenerator-0.1.9-py3.4.egg
Algorithm Hash digest
SHA256 7cc9993b6884bc11d9d4a9db8bd25441f1f895e5733c8518e446dbbd19dd56ed
MD5 5723eb559c8fade14e5f01a3e9ed36eb
BLAKE2b-256 3a1533565e7436749e6dc6947b5d31c503b7be1d872d6ce840647d0771cc842b

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page