A tool for exporting Pandas dataframes to Redshift tables
Reason this release was yanked:
Do not use this version; it does not work.
Project description
Pandas2Redshift
This is a utility library for uploading a Pandas DataFrame to an Amazon Redshift table, using AWS S3 for temporary storage.
Features
- Upload a Pandas DataFrame to a Redshift table
- Uses the COPY command, with S3 as an intermediate staging area, for fast inserts into Redshift
- Can create the table for you, either from a Dict containing the data types or automatically from the pandas dtypes of the DataFrame
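To make the S3-then-COPY flow more concrete, here is a minimal sketch of the kind of COPY statement such a workflow might issue once the DataFrame has been staged in S3 as a CSV file. The helper name `build_copy_statement` and the object key are hypothetical and are not part of the pandas2redshift API; the statement uses standard Redshift COPY options, but the library's actual SQL may differ.

```python
def build_copy_statement(schema, table, bucket, key, access_key, secret_key):
    """Return a Redshift COPY statement that loads a CSV file staged in S3.

    Hypothetical helper for illustration only; not the library's own code.
    """
    return (
        f'COPY "{schema}"."{table}" '
        f"FROM 's3://{bucket}/{key}' "
        f"ACCESS_KEY_ID '{access_key}' "
        f"SECRET_ACCESS_KEY '{secret_key}' "
        # The staged file is assumed to be a CSV with one header row.
        "FORMAT AS CSV IGNOREHEADER 1;"
    )


statement = build_copy_statement(
    schema="public",
    table="my_table",
    bucket="YOUR_S3_BUCKET_NAME",
    key="tmp/my_table.csv",  # assumed staging key
    access_key="YOUR_AWS_ACCESS_KEY",
    secret_key="YOUR_AWS_SECRET_KEY",
)
print(statement)
```

Loading through COPY lets Redshift read the staged file in bulk, which is generally much faster than issuing row-by-row INSERT statements.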
Installation
Install the package using pip:
pip install pandas2redshift
Usage
Insert Data into Redshift
Insert data from a DataFrame into a Redshift table:
import pandas as pd
from sqlalchemy import create_engine
import pandas2redshift as p2r

data = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})
engine = create_engine('redshift+psycopg2://user:password@host:port/dbname')

with engine.connect() as conn:
    p2r.insert(
        data=data,
        table_name='my_table',
        schema='public',
        conn=conn,
        aws_access_key='YOUR_AWS_ACCESS_KEY',
        aws_secret_key='YOUR_AWS_SECRET_KEY',
        aws_bucket_name='YOUR_S3_BUCKET_NAME',
    )
You can adjust the behavior of the insert function with several optional arguments:
- ensure_exists (bool, optional): Checks whether the schema and table you are inserting data into exist in the database. If they do not exist, it creates them. Defaults to False.
- truncate_table (bool, optional): When set to True, truncates the target table before inserting the data. Defaults to False.
- table_data_types (Dict[str, str], optional): A dictionary specifying column names and their data types for table creation. If not provided, the data types are inferred from the pandas dtypes using the mapping defined in the pandas_to_redshift_datatypes function. Defaults to None.
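As a rough illustration of how data types might be inferred when table_data_types is not given, the sketch below maps pandas dtype names to Redshift column types and builds a CREATE TABLE statement from the result. The mapping, the VARCHAR(256) fallback, and the helper names `infer_table_data_types` and `create_table_sql` are assumptions made for this example; the library's actual pandas_to_redshift_datatypes mapping may differ.

```python
from typing import Dict

# Assumed mapping for illustration; not the library's actual table.
DTYPE_TO_REDSHIFT = {
    "int64": "BIGINT",
    "int32": "INTEGER",
    "float64": "DOUBLE PRECISION",
    "bool": "BOOLEAN",
    "datetime64[ns]": "TIMESTAMP",
    "object": "VARCHAR(256)",
}


def infer_table_data_types(dtypes: Dict[str, str]) -> Dict[str, str]:
    """Map {column: pandas dtype name} to {column: Redshift type}."""
    # Unknown dtypes fall back to VARCHAR(256) in this sketch.
    return {
        column: DTYPE_TO_REDSHIFT.get(dtype_name, "VARCHAR(256)")
        for column, dtype_name in dtypes.items()
    }


def create_table_sql(schema: str, table: str, column_types: Dict[str, str]) -> str:
    """Build a CREATE TABLE statement from a column-to-type dictionary."""
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in column_types.items())
    return f'CREATE TABLE IF NOT EXISTS "{schema}"."{table}" ({columns});'


column_types = infer_table_data_types({"col1": "int64", "col2": "object"})
print(create_table_sql("public", "my_table", column_types))
```

For a real DataFrame `df`, the input dictionary can be obtained with `{column: str(dtype) for column, dtype in df.dtypes.items()}`. A dictionary in the same column-to-type shape is what the table_data_types argument expects when you want to override the inferred types.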
Download files
Source Distribution
pandas2redshift-0.0.3.tar.gz (4.2 kB)
Built Distribution
File details
Details for the file pandas2redshift-0.0.3.tar.gz.
File metadata
- Download URL: pandas2redshift-0.0.3.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0d55f1bfcb527771c639afa7a58dcbae845da26f4046c9a5bbc59f0365ffe1e9
MD5 | 82e478b6e355309ba475ef3a6a228490
BLAKE2b-256 | 57e084c5f9cf1676491490f6604194c33658c9c53132abc32cd94867c44a943e
File details
Details for the file pandas2redshift-0.0.3-py3-none-any.whl.
File metadata
- Download URL: pandas2redshift-0.0.3-py3-none-any.whl
- Upload date:
- Size: 4.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | d91b5aa5765d71747411f04da0098f334d2c4d5b03b08c19b2e849fb4976e56f
MD5 | a0b7c26ab0a9505ea4dc70ce46f926e3
BLAKE2b-256 | 0fd525642182c3b70698329705c5006f57f0a2b2a25a6fd08d5f7047dc641e85