Skip to main content

A simple and fast cli application to load a csv into postgres

Project description

csv2pg

A simple and fast cli application to load a csv into postgres

  • low memory footprint
  • blazing fast (pg copy)
  • lines and fields validation (--skip-error)

Installation

https://pypi.org/project/csv2pg/

pip install --user csv2pg

Usage

$ csv2pg --help
Usage: csv2pg [OPTIONS] TABLE FILEPATH

  COPY FROM 'csv' TO 'postgres'

Options:
  -h, --host TEXT             database server host  [default: localhost]
  -p, --port INTEGER          database server port  [default: 5432]
  -d, --dbname TEXT           database user name  [default: $USER]
  -U, --username TEXT         database name to connect to  [default: $USER]
  -W, --password              force password prompt
  -v, --verbose
  --progress                  display progress bar
  --skip-error                detect, ignore and export errors to
                              <filepath>.err  [default: False]

  --header / --no-header      [default: True]
  --rownum / --no-rownum      include line number in a _rownum column
                              [default: False]

  --filename / --no-filename  include filename in a _filename column
                              [default: False]

  --delimiter TEXT            char separating the fields  [default: ,]
  --quotechar TEXT            char used to quote a field  [default: "]
  --doublequote               When True, escapechar is replaced by doubling
                              the quote char  [default: False]

  --escapechar TEXT           char used to esapce the quote char  [default: \]
  --lineterminator TEXT       line ending sequence  [default:  ]
  --null TEXT                 will be treated as NULL by postgres  [default: ]
  --encoding TEXT             [default: utf-8]
  --overwrite                 destroy table before inserting csv  [default:
                              False]

  --unlogged                  insert in an UNLOGGED table (faster)  [default:
                              False]

  --buffer INTEGER            size of the read buffer to be used by COPY FROM
                              [default: 8192]

  --version                   Show the version and exit.
  --help                      Show this message and exit.

Basic usage:

csv2pg -h localhost -p 5432 -U postgres -d postgres public.data data.csv --verbose

Basic usage with postgres environment variables:

PGHOST=localhost PGPORT=5432 PGDATABASE=postgres PGUSER=postgres PGPASSWORD= csv2pg public.data data.csv --verbose

Loading a tab delimited latin-1 encoded file in an unlogged table with _filename and _rownum columns, skipping errors and displaying progress bar:

PGPASSWORD= csv2pg -h localhost -p 25432 -d test -U test \
    --delimiter=$'\t' --encoding="iso-8859-1" \
    --overwrite --unlogged \
    --filename --rownum \
    --skip-error --progress \
public.data data.csv

Quick test

Start a postgres database:

docker run -d --rm \
    -p 25432:5432 \
    --name csv2pg-test \
    -e POSTGRES_DB=test \
    -e POSTGRES_USER=test \
    -e POSTGRES_PASSWORD=test \
postgres

Download and import a test asset file:

wget https://raw.githubusercontent.com/DavidLacroix/csv2pg/master/tests/assets/simple.csv .
PGPASSWORD=test csv2pg -h localhost -p 25432 -U test -d test public.data simple.csv --progress

From python

import csv2pg

HOST = "localhost"
PORT = 25432
DBNAME = "test"
USER = "test"
PASSWORD = "test"

csv2pg.copy_to(HOST, PORT, DBNAME, USER, PASSWORD, "public.data", "./simple.csv", verbose=True)

Precaution

  • the --overwrite option will drop the table before inserting the new records in.
  • the --rownum and --filename options will slightly increase the insertion time (increase the data to write on disk)
  • the --skip-error option will slightly increase the insertion time (fields and lines validation)
  • --verbose and --progress used together might spoil the console output

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

csv2pg-2.2.0.tar.gz (21.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

csv2pg-2.2.0-py2.py3-none-any.whl (21.2 kB view details)

Uploaded Python 2Python 3

File details

Details for the file csv2pg-2.2.0.tar.gz.

File metadata

  • Download URL: csv2pg-2.2.0.tar.gz
  • Upload date:
  • Size: 21.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.6

File hashes

Hashes for csv2pg-2.2.0.tar.gz
Algorithm Hash digest
SHA256 d0f94fa57a382c67bdf41d678334d877d4ad8a5a6f1e336e4502076dc9fe1548
MD5 2260f6163b93a9d4aeea92bd28011734
BLAKE2b-256 15879883d0b23a10e600bde7ee7ed75dfcd264c462c867ead2731b75e52c9a18

See more details on using hashes here.

File details

Details for the file csv2pg-2.2.0-py2.py3-none-any.whl.

File metadata

  • Download URL: csv2pg-2.2.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 21.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.6

File hashes

Hashes for csv2pg-2.2.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 5aef603586412ffe15d0bfb598612cbb73608deaf31f95cd85d6a1f52ae2f8c8
MD5 6e60cd225e3c454317868aa4c82383a2
BLAKE2b-256 50e28830f5135b2d0e8dd91349b5708ed1806dd6fcc42613cb38a626d4b0cf9f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page