
A Python library for upserting data into PostgreSQL.


pg_upsert


pg_upsert is a Python package that interactively updates and inserts (upserts) rows from staging tables into base tables of the same name. The package is designed to work exclusively with PostgreSQL databases.

The program first runs checks on the staging tables: not-null, primary key, foreign key, and check constraint checks. If any check fails, the program exits with an error message. If all checks pass, it displays the number of rows to be inserted and updated and, when the interactive flag is set to True, asks for confirmation before proceeding. On confirmation, it performs the upserts and reports the number of rows inserted and updated; otherwise it exits without making any changes.
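Conceptually, the split between inserts and updates comes down to whether a staging row's primary key already exists in the base table. The following is a minimal illustrative sketch of that partitioning logic, not the library's actual implementation (table data is represented here as plain dicts):

```python
def plan_upsert(base_rows, staging_rows, pk):
    """Partition staging rows into inserts (new primary keys) and
    updates (primary keys already present in the base table)."""
    base_keys = {row[pk] for row in base_rows}
    inserts = [row for row in staging_rows if row[pk] not in base_keys]
    updates = [row for row in staging_rows if row[pk] in base_keys]
    return inserts, updates

base = [{"id": 1, "title": "Dune"}]
staging = [{"id": 1, "title": "Dune (rev.)"}, {"id": 2, "title": "Hyperion"}]
ins, upd = plan_upsert(base, staging, "id")
# ins holds the id=2 row (new); upd holds the id=1 row (existing)
```

pg_upsert performs this comparison inside PostgreSQL rather than in Python, but the counts it reports before asking for confirmation correspond to these two sets.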

Credits

This project was inspired by ExecSQL and the example script pg_upsert.sql. Its goal is to provide a Python implementation of pg_upsert.sql that does not depend on ExecSQL.

Usage

Python

import logging

from pg_upsert import PgUpsert

logger = logging.getLogger("pg_upsert")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

PgUpsert(
    host="localhost",
    port=5432,
    database="postgres",
    user="<db_username>",
    tables=("genres", "books", "authors", "book_authors"),
    stg_schema="staging",
    base_schema="public",
    do_commit=True,
    upsert_method="upsert",
    interactive=True,
    exclude_cols=("rev_user", "rev_time", "created_at", "updated_at"),
    exclude_null_check_cols=("rev_user", "rev_time", "created_at", "updated_at", "alias"),
).run()

CLI

usage: pg_upsert [-h] [--version] [-q] [-d] [-l LOGFILE] [-e EXCLUDE_COLUMNS]
                 [-n NULL_COLUMNS] [-c] [-i] [-m UPSERT_METHOD]
                 HOST PORT DATABASE USER STAGING_SCHEMA BASE_SCHEMA TABLE
                 [TABLE ...]

Run not-NULL, Primary Key, Foreign Key, and Check Constraint checks on staging tables then update and insert (upsert) data from staging tables to base tables.

positional arguments:
  HOST                  database host
  PORT                  database port
  DATABASE              database name
  USER                  database user
  STAGING_SCHEMA        staging schema name
  BASE_SCHEMA           base schema name
  TABLE                 table name(s)

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -q, --quiet           suppress all console output
  -d, --debug           display debug output
  -l LOGFILE, --log LOGFILE
                        write log to LOGFILE
  -e EXCLUDE_COLUMNS, --exclude EXCLUDE_COLUMNS
                        comma-separated list of columns to exclude from the
                        upsert
  -n NULL_COLUMNS, --null NULL_COLUMNS
                        comma-separated list of columns to exclude from null
                        checks
  -c, --do-commit       commit changes to database
  -i, --interactive     display interactive GUI of important table information
  -m UPSERT_METHOD, --upsert-method UPSERT_METHOD
                        method to use for upsert
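As an illustration, the Python example above corresponds to a CLI invocation like the following. The argument values are examples only, and a reachable PostgreSQL server with matching staging and base schemas is assumed:

```shell
# Interactive upsert from the staging schema to the public schema
# (prompts for the password; add -c to actually commit changes).
pg_upsert -i -m upsert \
    localhost 5432 postgres db_user staging public \
    genres books authors book_authors
```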

Docker

docker pull ghcr.io/geocoug/pg_upsert:latest

Contributing

  1. Fork the repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Create a Python virtual environment (python -m venv .venv)
  4. Activate the virtual environment (source .venv/bin/activate)
  5. Install dependencies (pip install -r requirements.txt)
  6. Install pre-commit hooks (python -m pre-commit install --install-hooks)
  7. Make your changes and run tests (make test)
  8. Push your changes to the branch (git push origin feature-branch)
  9. Create a pull request

Running Tests Locally

Running tests locally requires a PostgreSQL database; the easiest way to set one up is with Docker. The following command creates a PostgreSQL database named dev with the user docker and password docker.

docker run --name postgres -e POSTGRES_USER=docker -e POSTGRES_PASSWORD=docker -e POSTGRES_DB=dev -p 5432:5432 -d postgres

Once the container is running, import the test data by running the following command.

docker exec -i postgres psql -U docker -d dev < tests/data.sql

Create a .env file in the root directory with the following content, modifying the values as needed.

POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=dev
POSTGRES_USER=docker
POSTGRES_PASSWORD=docker
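The test suite reads these values from the environment. If you prefer not to add a dependency for this, a .env file in this simple KEY=VALUE form can be loaded with a few lines of standard-library Python (a hypothetical helper, not part of pg_upsert):

```python
import os

def load_env(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ,
    skipping blank lines and comments. Existing variables win."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Tools like python-dotenv do the same thing with more edge-case handling (quoting, interpolation); this sketch only covers the plain format shown above.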

Now you can run the tests using make test.
