
Loader to import sqlite3 databases into Postgres

Project description

pgsqlite

Load SQLite3 databases into PostgreSQL.

Usage:

usage: pgsqlite.py [-h] -f SQLITE_FILENAME -p POSTGRES_CONNECT_URL [-d DEBUG] [--drop_tables DROP_TABLES] [--drop_everything DROP_EVERYTHING] [--drop_tables_after_import DROP_TABLES_AFTER_IMPORT]

optional arguments:
  -h, --help            show this help message and exit
  -f SQLITE_FILENAME, --sqlite_filename SQLITE_FILENAME
                        sqlite database to import
  -p POSTGRES_CONNECT_URL, --postgres_connect_url POSTGRES_CONNECT_URL
                        Postgres URL for the database to import into
  -d DEBUG, --debug DEBUG
                        Set log level to DEBUG
  --drop_tables DROP_TABLES
                        Prior to import, drop tables in the target database that have the same name as tables in the source database
  --drop_everything DROP_EVERYTHING
                        Prior to import, drop everything (tables, views, triggers, etc, etc) in the target database before the import
  --drop_tables_after_import DROP_TABLES_AFTER_IMPORT
                        Drop all tables in the target database after import; useful for testing

Examples:

Import into the bit.io database adam/AMEND, with DEBUG-level logging.

python pgsqlite.py  -f ../example_dbs/Chinook_Sqlite.sqlite -p postgresql://adam:<password>@db.bit.io/adam/AMEND --debug true

Import into the bit.io database adam/AMEND, dropping all tables in the target database that match tables in the source database:

python pgsqlite.py  -f ../example_dbs/Chinook_Sqlite.sqlite -p postgresql://adam:<password>@db.bit.io/adam/AMEND --drop_tables true

Most of the drop options are used for testing - be aware they are destructive operations!

Testing

There's a set of open-source databases in the example_dbs/ directory, and an ./import_examples.sh script that tests importing all of those databases. You'll need to set POSTGRES_CREDS_STRING to your connect string beforehand. Also be aware that this script will drop everything in the target database, so be careful!

How This Works

For more details, read: https://innerjoin.bit.io/introducing-pgsqlite-a-pure-python-module-to-import-sqlite-databases-into-postgres-bf3940cfa19f

SQLite is a far more forgiving database than Postgres. Look at this CREATE TABLE:

CREATE TABLE Customer_Ownership(
  customer_id INTEGER NOT NULL,
  vin INTEGER NOT NULL,
  purchase_date DATE NOT NULL,
  purchase_price INTEGER NOT NULL,
  warantee_expire_date DATE,
  dealer_id INTEGER NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES Customers(customer_id),
  FOREIGN KEY (vin) REFERENCES Car_Vins(vin),
  FOREIGN KEY (dealer_id) REFERENCES Dealers(dealer_id)
  PRIMARY KEY (customer_id, vin)
);

This is totally valid in SQLite, even though it is missing a comma on the second-to-last line. In fact, this is exactly what you'd get back from .schema in the sqlite3 command-line tool.
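You can confirm this quirk with the standard library alone; SQLite's parser has long accepted a missing comma between two table constraints. A minimal check, using an in-memory database:

```python
import sqlite3

# The exact schema from above: note the missing comma before PRIMARY KEY.
DDL = """
CREATE TABLE Customer_Ownership(
  customer_id INTEGER NOT NULL,
  vin INTEGER NOT NULL,
  purchase_date DATE NOT NULL,
  purchase_price INTEGER NOT NULL,
  warantee_expire_date DATE,
  dealer_id INTEGER NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES Customers(customer_id),
  FOREIGN KEY (vin) REFERENCES Car_Vins(vin),
  FOREIGN KEY (dealer_id) REFERENCES Dealers(dealer_id)
  PRIMARY KEY (customer_id, vin)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)  # SQLite accepts this; Postgres would raise a syntax error
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
print(tables)
```

Feeding the same DDL to Postgres fails immediately, which is why the schema has to be rebuilt rather than replayed verbatim.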

For pgsqlite, this means we cannot use the excellent sqlglot module to transpile the schema-creation SQL, as that module is too strict for some SQLite databases. Instead we use the (also excellent) sqlite-utils module. sqlite-utils gives us Python objects that represent the database entities, which lets us generate Postgres-valid SQL to recreate those entities.
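pgsqlite gets that metadata through sqlite-utils; to illustrate the kind of per-column information involved, the sketch below uses only the stdlib sqlite3 module and SQLite's own table_info PRAGMA (the Dealers table here is a hypothetical example, not from pgsqlite's code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Dealers(dealer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per
# column -- enough metadata to emit equivalent Postgres DDL.
columns = conn.execute("PRAGMA table_info(Dealers)").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PK" if pk else "")
```

Working from structured metadata like this, rather than from the raw CREATE TABLE text, sidesteps SQLite's syntax quirks entirely.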

We use psycopg (version 3) to gain access to Postgres's very fast COPY protocol. We filter the incoming data to make sure NULLs are set correctly, and to apply any transforms the literal values require (like the BOOLEAN example in Known Issues, below).
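That filtering step can be sketched as a plain function applied to each row before it is written. The column-type map and the transforms below are illustrative, not pgsqlite's actual code; with psycopg 3 the resulting rows would be streamed to the server via `with cursor.copy("COPY ... FROM STDIN") as copy: copy.write_row(row)`.

```python
def filter_row(row, column_types):
    """Convert SQLite literal values into Postgres-friendly Python values.

    Illustrative sketch only: the real type mapping is more involved.
    """
    out = []
    for value, col_type in zip(row, column_types):
        if value is None:
            out.append(None)         # psycopg sends None as NULL over COPY
        elif col_type == "BOOLEAN":
            out.append(bool(value))  # SQLite stores booleans as 0/1 integers
        else:
            out.append(value)
    return out

result = filter_row((1, 0, None, "Bob"),
                    ("INTEGER", "BOOLEAN", "DATE", "TEXT"))
print(result)
# → [1, False, None, 'Bob']
```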

Known Issues

Most of the issues are around constraints that involve SQL that requires literals. For example, a BOOLEAN column may have a CHECK constraint like IN (1, 0) which is valid in SQLite but not in Postgres (in SQLite the integers 1 and 0 are true/false, but not in Postgres). To fix this we'd need to parse the SQL, identify the literals and which columns they map to, then "fix" the literal's type. This also impacts views & triggers.
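For the simple IN (1, 0) case, that fix could be approximated with a regular expression; real constraints would need actual SQL parsing, and the set of boolean columns here is a hypothetical stand-in for information that would come from schema introspection.

```python
import re

def fix_boolean_check(check_sql, boolean_columns):
    """Rewrite integer literals to booleans in simple `col IN (...)` clauses.

    Naive sketch: handles only bare 0/1 literals, not nested expressions.
    """
    def replace(match):
        col, literals = match.group(1), match.group(2)
        if col not in boolean_columns:
            return match.group(0)  # leave non-boolean columns untouched
        fixed = ", ".join("true" if lit.strip() == "1" else "false"
                          for lit in literals.split(","))
        return f"{col} IN ({fixed})"
    return re.sub(r"(\w+)\s+IN\s+\(([^)]*)\)", replace, check_sql)

fixed = fix_boolean_check("CHECK (active IN (1, 0))", {"active"})
print(fixed)
# → CHECK (active IN (true, false))
```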

TODOS

  • Unit tests
  • Append mode
  • Async loading of data
    • With async, a status property that tells us, eg "x of y rows loaded in table z"



Download files

Download the file for your platform.

Source Distribution

pgsqlite-1.0.3.tar.gz (12.0 kB)

Uploaded Source

Built Distribution

pgsqlite-1.0.3-py3-none-any.whl (10.5 kB)

Uploaded Python 3

File details

Details for the file pgsqlite-1.0.3.tar.gz.

File metadata

  • Download URL: pgsqlite-1.0.3.tar.gz
  • Upload date:
  • Size: 12.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.11

File hashes

Hashes for pgsqlite-1.0.3.tar.gz
Algorithm Hash digest
SHA256 3e778c5266aec65e36fc67cf3dca1d84982941f58e9510a39e9d0188e9929f82
MD5 db9062907fcff3a6813b1fba02a1e973
BLAKE2b-256 968d30a3e25396777aaf63051e718755b9556413287a7dd6a283fb425e586b49


File details

Details for the file pgsqlite-1.0.3-py3-none-any.whl.

File metadata

  • Download URL: pgsqlite-1.0.3-py3-none-any.whl
  • Upload date:
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.11

File hashes

Hashes for pgsqlite-1.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 a2d1e87a0e6fd3a90c0894d2bd13d31b9b0f0bb53b1888fb81e4ecca95cd41d3
MD5 449f99cced47ebba7679caefb8daf1b2
BLAKE2b-256 93afe22ae25d94194fde43b4d18dc7090128f1ea147c6c9592659fb634051922

