A simple and fast cli application to load a csv into postgres
Project description
csv2pg
A simple and fast cli application to load a csv into postgres
- low memory footprint
- blazing fast (pg copy)
- lines and fields validation (
--skip-error
)
Installation
https://pypi.org/project/csv2pg/
pip install --user csv2pg
Usage
$ csv2pg --help
Usage: csv2pg [OPTIONS] TABLE FILEPATH
COPY FROM 'csv' TO 'postgres'
Options:
-h, --host TEXT database server host [default: localhost]
-p, --port INTEGER database server port [default: 5432]
-d, --dbname TEXT database user name [default: $USER]
-U, --username TEXT database name to connect to [default: $USER]
-W, --password force password prompt
-v, --verbose
--progress display progress bar
--skip-error detect, ignore and export errors to
<filepath>.err [default: False]
--header / --no-header [default: True]
--rownum / --no-rownum include line number in a _rownum column
[default: False]
--filename / --no-filename include filename in a _filename column
[default: False]
--delimiter TEXT char separating the fields [default: ,]
--quotechar TEXT char used to quote a field [default: "]
--doublequote When True, escapechar is replaced by doubling
the quote char [default: False]
--escapechar TEXT char used to esapce the quote char [default: \]
--lineterminator TEXT line ending sequence [default: ]
--null TEXT will be treated as NULL by postgres [default: ]
--encoding TEXT [default: utf-8]
--overwrite destroy table before inserting csv [default:
False]
--unlogged insert in an UNLOGGED table (faster) [default:
False]
--buffer INTEGER size of the read buffer to be used by COPY FROM
[default: 8192]
--version Show the version and exit.
--help Show this message and exit.
Basic usage:
csv2pg -h localhost -p 5432 -U postgres -d postgres public.data data.csv --verbose
Basic usage with postgres environment variables:
PGHOST=localhost PGPORT=5432 PGDATABASE=postgres PGUSER=postgres PGPASSWORD= csv2pg public.data data.csv --verbose
Loading a tab delimited latin-1 encoded file in an unlogged table with _filename
and _rownum
columns, skipping errors and displaying progress bar:
PGPASSWORD= csv2pg -h localhost -p 25432 -d test -U test \
--delimiter=$'\t' --encoding="iso-8859-1" \
--overwrite --unlogged \
--filename --rownum \
--skip-error --progress \
public.data data.csv
Quick test
Start a postgres database:
docker run -d --rm \
-p 25432:5432 \
--name csv2pg-test \
-e POSTGRES_DB=test \
-e POSTGRES_USER=test \
-e POSTGRES_PASSWORD=test \
postgres
Download and import a test asset file:
wget https://raw.githubusercontent.com/DavidLacroix/csv2pg/master/tests/assets/simple.csv .
PGPASSWORD=test csv2pg -h localhost -p 25432 -U test -d test public.data simple.csv --progress
From python
import csv2pg
HOST = "localhost"
PORT = 25432
DBNAME = "test"
USER = "test"
PASSWORD = "test"
csv2pg.copy_to(HOST, PORT, DBNAME, USER, PASSWORD, "public.data", "./simple.csv", verbose=True)
Precaution
- the
--overwrite
option will drop the table before inserting the new records in. - the
--rownum
and--filename
options will slightly increase the insertion time (increase the data to write on disk) - the
--skip-error
option will slightly increase the insertion time (fields and lines validation) --verbose
and--progress
used together might spoil the console output
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
csv2pg-2.2.0.tar.gz
(21.7 kB
view hashes)
Built Distribution
csv2pg-2.2.0-py2.py3-none-any.whl
(21.2 kB
view hashes)
Close
Hashes for csv2pg-2.2.0-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5aef603586412ffe15d0bfb598612cbb73608deaf31f95cd85d6a1f52ae2f8c8 |
|
MD5 | 6e60cd225e3c454317868aa4c82383a2 |
|
BLAKE2b-256 | 50e28830f5135b2d0e8dd91349b5708ed1806dd6fcc42613cb38a626d4b0cf9f |