Python DB API 2.0 (PEP 249) compliant wrapper for Amazon Athena JDBC driver
Project description
PyAthenaJDBC
PyAthenaJDBC is a python DB API 2.0 (PEP 249) compliant wrapper for Amazon Athena JDBC driver.
Requirements
Python
CPython 2.6, 2,7, 3,4, 3.5
Java
Java 7+
Installation
$ pip install PyAthenaJDBC
Usage
Basic usage:
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM one_row
""")
print(cursor.description)
print(cursor.fetchall())
finally:
conn.close()
Cursor iteration:
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM many_rows LIMIT 10
""")
for row in cursor:
print(row)
finally:
conn.close()
Query with parameter:
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT col_int FROM one_row_complex where col_int = {0}
""", 2147483647)
print(cursor.fetchall())
cursor.execute("""
SELECT col_string FROM one_row_complex where col_string = {param}
""", param='a string')
print(cursor.fetchall())
finally:
conn.close()
Minimal example for Pandas DataFrame:
from pyathenajdbc import connect
import pandas as pd
conn = connect(access_key=<access key>,
secret_key=<secret key>,
s3_staging_dir=<staging dir>,
region_name=<region name>,
jvm_path=<jvm path>) # optional, as used by jpype
df = pd.read_sql("SELECT * FROM <table name> LIMIT 10", conn)
As Pandas DataFrame:
import contextlib
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
with contextlib.closing(
connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/')) as conn:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM many_rows
""")
df = as_pandas(cursor)
print(df.describe())
Credential
Support AWS CLI credentials configuration.
Credential Files
~/.aws/credentials
[default]
aws_access_key_id=YOUR_ACCESS_KEY_ID
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
~/.aws/config
[default]
region=us-west-2
output=json
Environment variables
$ export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID
$ export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY
$ export AWS_DEFAULT_REGION=us-west-2
Additional environment variable:
$ export AWS_ATHENA_S3_STAGING_DIR=s3://YOUR_S3_BUCKET/path/to/
Testing
Depends on the following environment variables:
$ export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID
$ export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY
$ export AWS_DEFAULT_REGION=us-west-2
$ export AWS_ATHENA_S3_STAGING_DIR=s3://YOUR_S3_BUCKET/path/to/
Run test:
$ pip install pytest awscli
$ scripts/upload_test_data.sh
$ py.test
$ scripts/delete_test_data.sh
Run test multiple Python versions:
$ pip install tox awscli
$ scripts/upload_test_data.sh
$ pyenv local 2.6.9 2.7.12 3.4.5 3.5.2
$ tox
$ scripts/delete_test_data.sh
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
PyAthenaJDBC-1.0.6.tar.gz
(10.7 kB
view hashes)
Built Distribution
Close
Hashes for PyAthenaJDBC-1.0.6-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 978b0592120d527819e2212cdfa5ef4b5c08b7ef698f1359a4f3defdc4308eea |
|
MD5 | e198e53edfc2269ce0eabbb391f134d3 |
|
BLAKE2b-256 | ae2d88027338dd70de6519ba956c146bdafe831f1df6b5a3097b60a33dcf70e2 |