locopy·PyPI

Loading/Unloading to Amazon Redshift using Python

locopy: Data Load and Copy using Python

A Python library to assist with ETL processing for:

Amazon Redshift (COPY, UNLOAD)
Snowflake (COPY INTO <table>, COPY INTO <location>)

In addition:

The library supports Python 3.9 to 3.11
DB Driver (Adapter) agnostic. Use your favourite driver that complies with DB-API 2.0
It provides functionality to download and upload data to S3 buckets, and internal stages (Snowflake)

Quick Installation

pip install locopy

or install from conda-forge

conda config --add channels conda-forge
conda install locopy

Installation instructions

A virtual or conda environment is highly recommended

$ virtualenv locopy
$ source locopy/bin/activate
$ pip install --upgrade setuptools pip
$ pip install locopy
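
After installing, a quick preflight check can confirm that the interpreter falls in the supported range and that the package is visible to it. The following is a minimal sketch using only the standard library; the helper names python_supported and installed_version are hypothetical and not part of locopy, and the version bounds are copied from the "Python 3.9 to 3.11" statement above.

```python
import sys
from importlib import metadata

# Range taken from the README ("Python 3.9 to 3.11"); adjust if the project changes it.
MIN_PYTHON = (3, 9)
MAX_PYTHON = (3, 11)


def python_supported(version_info=sys.version_info):
    """Return True if the major.minor version falls in the documented range."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("locopy version:", installed_version("locopy"))
```

Running this inside the activated environment should report a version string for locopy; None suggests the install went to a different environment.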

Python Database API Specification 2.0

Rather than depending on a specific Python DB driver / adapter for Postgres (which should support Amazon Redshift or Snowflake), locopy prefers to be agnostic. As an end user, you can use any package that implements the Python Database API Specification 2.0.

The following packages have been tested:

psycopg2
pg8000
snowflake-connector-python

You can use whichever one you prefer by importing the package and passing it to the constructor as the dbapi argument.
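
If you are unsure whether a driver qualifies, the module-level attributes required by DB-API 2.0 (PEP 249) can be inspected before handing the module over. This is a rough sketch, not something locopy does for you: looks_like_dbapi2 is a hypothetical helper, and sqlite3 stands in for a real driver only because it ships with Python.

```python
import sqlite3


def looks_like_dbapi2(module):
    """Check the module-level attributes that PEP 249 requires."""
    return (
        callable(getattr(module, "connect", None))
        and getattr(module, "apilevel", None) == "2.0"
        and hasattr(module, "paramstyle")
        and hasattr(module, "threadsafety")
    )


# sqlite3 is used only because it is in the standard library;
# locopy targets Redshift and Snowflake, not SQLite.
print(looks_like_dbapi2(sqlite3))
```

The same call can be made with psycopg2, pg8000, or snowflake.connector once they are installed. Passing this check only shows that the module advertises DB-API 2.0; it does not guarantee compatibility with a given database.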

Usage

You need to store your connection parameters in a YAML file (or pass them in directly). The YAML would consist of the following items:

# required to connect to redshift
host: my.redshift.cluster.com
port: 5439
database: db
user: userid
password: password
## optional extras for the dbapi connector
sslmode: require
another_option: 123
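
When passing parameters in directly instead of through YAML, it can help to check them up front. The sketch below assumes the five keys marked as required in the example YAML above and treats everything else as an optional extra for the driver; split_connection_params is a hypothetical helper, not a locopy function.

```python
# Key names come from the example YAML above; the helper itself is hypothetical.
REQUIRED_KEYS = ("host", "port", "database", "user", "password")


def split_connection_params(params):
    """Separate required connection settings from optional driver extras."""
    missing = [key for key in REQUIRED_KEYS if key not in params]
    if missing:
        raise ValueError("missing connection parameters: " + ", ".join(missing))
    required = {key: params[key] for key in REQUIRED_KEYS}
    extras = {key: value for key, value in params.items() if key not in REQUIRED_KEYS}
    return required, extras


config = {
    "host": "my.redshift.cluster.com",
    "port": 5439,
    "database": "db",
    "user": "userid",
    "password": "password",
    "sslmode": "require",
}
required, extras = split_connection_params(config)
print(extras)
```

Failing early with a list of missing keys tends to be easier to debug than a connection error raised deep inside the driver.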

If you aren’t loading data, you don’t need to have AWS tokens set up. The Redshift connection class (locopy.Redshift) can be used like this:

import pg8000
import locopy

with locopy.Redshift(dbapi=pg8000, config_yaml="config.yml") as redshift:
    redshift.execute("SELECT * FROM schema.table")
    df = redshift.to_dataframe()
print(df)

To load data to Redshift via S3, use the same Redshift class, which inherits from S3 and so can upload the file before running COPY:

import pg8000
import locopy

with locopy.Redshift(dbapi=pg8000, config_yaml="config.yml") as redshift:
    redshift.execute("SET query_group TO quick")
    redshift.execute("CREATE TABLE schema.table (variable VARCHAR(20)) DISTKEY(variable)")
    redshift.load_and_copy(
        local_file="example/example_data.csv",
        s3_bucket="my_s3_bucket",
        table_name="schema.table",
        delim=",")
    redshift.execute("SELECT * FROM schema.table")
    res = redshift.cursor.fetchall()

print(res)

If you want to download data from Redshift to a CSV, or read it into Python:

import pg8000
import locopy

my_profile = "some_profile_with_valid_tokens"
with locopy.Redshift(dbapi=pg8000, config_yaml="config.yml", profile=my_profile) as redshift:
    # Optionally provide export_path if you also want the unloaded data copied to a local flat file
    redshift.unload_and_copy(
        query="SELECT * FROM schema.table",
        s3_bucket="my_s3_bucket",
        export_path="my_output_destination.csv")

Note on tokens

To load data to S3, you will need to be able to generate AWS tokens, or assume the IAM role on an EC2 instance. There are a few options for doing this, depending on where you’re running your script and how you want to handle tokens. Once you have your tokens, they need to be accessible to the AWS command line interface. See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#config-settings-and-precedence for more information, but you can:

Populate environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, etc.
Leverage the AWS credentials file. If you have multiple profiles configured you can either call locopy.Redshift(profile="my-profile"), or set up an environment variable AWS_DEFAULT_PROFILE.
If you are on an EC2 instance you can assume the credentials associated with the IAM role attached.
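
To see which of the options above a script is likely to pick up, the relevant environment variables can be inspected before connecting. This is a simplified sketch that looks only at the variables named in this section; it does not reproduce the full AWS credential resolution order, and describe_credential_source is a hypothetical helper.

```python
import os


def describe_credential_source(env=None):
    """Guess which credential option from this section is configured."""
    env = os.environ if env is None else env
    # Explicit keys in the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment variables"
    # A named profile from the AWS credentials file.
    if env.get("AWS_DEFAULT_PROFILE"):
        return "named profile: " + env["AWS_DEFAULT_PROFILE"]
    # Nothing set: the default profile or an attached IAM role would be used, if any.
    return "default credentials file or attached IAM role"


print(describe_credential_source())
```

A profile passed explicitly, as in locopy.Redshift(profile="my-profile"), is not visible to this check, since it never touches the environment.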

Advanced Usage

See the docs for more detailed usage instructions and examples including Snowflake.

Contributors

We welcome and appreciate your contributions! Before we can accept any contributions, we ask that you please be sure to sign the Contributor License Agreement (CLA).

This project adheres to the Open Source Code of Conduct. By participating, you are expected to honor this code.

Roadmap

Roadmap details can be found here

Release history

Version	Release date
0.6.5 (this version)	May 30, 2025
0.6.4	Mar 25, 2025
0.6.3	Dec 17, 2024
0.6.2	Oct 23, 2024
0.6.1	Sep 9, 2024
0.6.0	Aug 12, 2024
0.5.9	Jun 20, 2024
0.5.8	Apr 15, 2024
0.5.7	Jan 23, 2024
0.5.6	Dec 8, 2023
0.5.5	Oct 30, 2023
0.5.4	Sep 7, 2023
0.5.3	Jul 19, 2023
0.5.2	Jun 28, 2023
0.5.1	Apr 13, 2023
0.5.0	Aug 16, 2022
0.4.1	Apr 19, 2022
0.4.0	Mar 1, 2022
0.3.8	Jan 5, 2021
0.3.7	Jun 30, 2020
0.3.6	Jan 23, 2020
0.3.5	Dec 19, 2019
0.3.4	Nov 21, 2019
0.3.3	Nov 5, 2019
0.3.2	Nov 1, 2019
0.3.1	Apr 3, 2019
0.3.0	Feb 12, 2019
0.2.0	Nov 21, 2018
0.1.1	Jul 31, 2018

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

locopy-0.6.5.tar.gz (47.6 kB)

Uploaded May 30, 2025 Source

Built Distribution

locopy-0.6.5-py3-none-any.whl (35.0 kB)

Uploaded May 30, 2025 Python 3

File details

Details for the file locopy-0.6.5.tar.gz.

File metadata

Download URL: locopy-0.6.5.tar.gz
Upload date: May 30, 2025
Size: 47.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for locopy-0.6.5.tar.gz
Algorithm	Hash digest
SHA256	`e44e19ae9b0bb8c4689138f875e116f3ea06142a5d58b7f8479faee4e0505fba`
MD5	`6eda83ef8995775429f01788094bfa1f`
BLAKE2b-256	`3c8b3e70c4d85c2cb59daaeadeb71b7d344f72371707228f7362ca5eb25b91d8`

See more details on using hashes here.

File details

Details for the file locopy-0.6.5-py3-none-any.whl.

File metadata

Download URL: locopy-0.6.5-py3-none-any.whl
Upload date: May 30, 2025
Size: 35.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for locopy-0.6.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`26e0d52e4c92c38531ef343259c55c424a59a02899c808500be94c8e92308b85`
MD5	`135268e133778d079aa4f52db60060d5`
BLAKE2b-256	`d05f384a464280ec4ee76846147b956f8c0f45009cdca45a8fff939e96aae28a`

See more details on using hashes here.
