This library contains several functions that allow you to migrate data from a CSV file or Pandas Dataframe into a PostgreSQL database using the libraries Psycopg2 and Pandas
Project description
Description
This library contains several functions that allow you to migrate data from a CSV file or Pandas Dataframe into a PostgreSQL database using the libraries Psycopg2 and Pandas. Specifically, it includes functions for:
- Loading data from a CSV file to a Pandas dataframe
- Mapping Pandas Dataframe columns to their datatypes
- Mapping Pandas Dataframe columns to suitable PostgreSQL datatypes
- Connect to a PostgreSQL database
- Creating new tables on a PostgreSQL database
- Inserting data from a Pandas DataFrame into a table in a PostgreSQL database
Installation
Install The Required Dependencies
pip install pandas psycopg2
Install The pandaspg
Package
pip install pandaspg
Usage
I will walk you through a step-by-step example of how to migrate data from a CSV
file into a PostgreSQL
database. For demonstration purposes, download the exoplanets_07-04-28.csv
file from my [exoplanets](https://github.com/eadwulf/exoplanets)
repo.
Download The CSV file
wget https://raw.githubusercontent.com/Eadwulf/exoplanets/main/exoplanets_07-04-2023.csv
Import The Library
import pandaspg
Create a Pandas dataframe
with the CSV
file data
dataframe = pandaspg.csv_to_dataframe('exoplanets_04-07-2023.csv')
Generate a dictionary mapping the dataframe columns to their datatype
column_datatypes_dict = pandaspg.get_dataframe_column_dtypes_dict(dataframe)
Generate a dictionary mapping the dataframe columns to a suitable PostgreSQL datatype
pg_column_datatypes_dict = pandaspg.map_pandas_to_postgresql_datatypes(
column_datatypes_dict)
Connect to an existing and running PostgreSQL database
connection = pandaspg.connect_to_postgresql(database='analysis',
user='postgres',
password='postgres',
host='localhost',
port=5432)
Create a PostgreSQL table
pandaspg.create_postgresql_table(
connection, 'exoplanets_csv', pg_column_datatypes_dict)
Insert the from the dataframe to the recently created table
pandaspg.insert_dataframe_into_postgresql(
connection, 'exoplanets_csv', dataframe)
Close the connection with the database
connection.close()
The Full Example
import pandaspg
dataframe = pandaspg.csv_to_dataframe('exoplanets_04-07-2023.csv')
column_datatypes_dict = pandaspg.get_dataframe_column_dtypes_dict(dataframe)
pg_column_datatypes_dict = pandaspg.map_pandas_to_postgresql_datatypes(
column_datatypes_dict)
connection = pandaspg.connect_to_postgresql(database='analysis',
user='postgres',
password='postgres',
host='localhost',
port=5432)
pandaspg.create_postgresql_table(
connection, 'exoplanets_csv', pg_column_datatypes_dict)
pandaspg.insert_dataframe_into_postgresql(
connection, 'exoplanets_csv', dataframe)
connection.close()
Inspect The Results
Enter the PostgreSQL prompt
psql -U postgres -d analysis
List the tables in the analysis
database
\dt
Retrieve the data from the exoplanets_csv
table
SELECT * FROM exoplanets_csv;
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for pandaspg-0.0.3.3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | b112cf0e001b5285e56d0f70ddecf82d8ed3a0cb95dee0a1cf01d0351767baad |
|
MD5 | bb0d5c9de43b1b258b88d45dba3de1c0 |
|
BLAKE2b-256 | 9b690e39c68fe7839354c044b0b9aad4121cecb64869b373500e4a7230d44e2e |