Python DB API 2.0 (PEP 249) compliant wrapper for Amazon Athena JDBC driver
Project description
PyAthenaJDBC
PyAthenaJDBC is a Python DB API 2.0 (PEP 249) compliant wrapper for Amazon Athena JDBC driver.
Requirements
Python
CPython 2.6, 2,7, 3,4, 3.5
Java
Java >= 7
Installation
$ pip install PyAthenaJDBC
Usage
Basic usage
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM one_row
""")
print(cursor.description)
print(cursor.fetchall())
finally:
conn.close()
Cursor iteration
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM many_rows LIMIT 10
""")
for row in cursor:
print(row)
finally:
conn.close()
Query with parameter
from pyathenajdbc import connect
conn = connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2')
try:
with conn.cursor() as cursor:
cursor.execute("""
SELECT col_int FROM one_row_complex where col_int = {0}
""", 2147483647)
print(cursor.fetchall())
cursor.execute("""
SELECT col_string FROM one_row_complex where col_string = {param}
""", param='a string')
print(cursor.fetchall())
finally:
conn.close()
Minimal example for Pandas DataFrame
from pyathenajdbc import connect
import pandas as pd
conn = connect(access_key='YOUR_ACCESS_KEY_ID',
secret_key='YOUR_SECRET_ACCESS_KEY',
s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2',
jvm_path='/path/to/jvm') # optional, as used by JPype
df = pd.read_sql("SELECT * FROM many_rows LIMIT 10", conn)
As Pandas DataFrame
import contextlib
from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
with contextlib.closing(
connect(s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/'
region_name='us-west-2'))) as conn:
with conn.cursor() as cursor:
cursor.execute("""
SELECT * FROM many_rows
""")
df = as_pandas(cursor)
print(df.describe())
Examples
Redash query runner example
Credential
Support AWS CLI credentials configuration.
Credential Files
~/.aws/credentials
[default]
aws_access_key_id=YOUR_ACCESS_KEY_ID
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
~/.aws/config
[default]
region=us-west-2
output=json
Environment variables
$ export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID
$ export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY
$ export AWS_DEFAULT_REGION=us-west-2
Additional environment variable:
$ export AWS_ATHENA_S3_STAGING_DIR=s3://YOUR_S3_BUCKET/path/to/
Testing
Depends on the AWS CLI credentials and the following environment variables:
~/.aws/credentials
[default]
aws_access_key_id=YOUR_ACCESS_KEY_ID
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
Environment variables
$ export AWS_DEFAULT_REGION=us-west-2
$ export AWS_ATHENA_S3_STAGING_DIR=s3://YOUR_S3_BUCKET/path/to/
Run test
$ pip install pytest awscli
$ scripts/upload_test_data.sh
$ py.test
$ scripts/delete_test_data.sh
Run test multiple Python versions
$ pip install tox awscli
$ scripts/upload_test_data.sh
$ pyenv local 2.6.9 2.7.12 3.4.5 3.5.2
$ tox
$ scripts/delete_test_data.sh
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
PyAthenaJDBC-1.0.8.tar.gz
(11.1 kB
view hashes)
Built Distribution
Close
Hashes for PyAthenaJDBC-1.0.8-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8a051f8d533f3410bab2a5564d3f894fb92e4271e0aced123a8ad1198f583077 |
|
MD5 | f680717c8158372c773b0ddd1c6149ba |
|
BLAKE2b-256 | da0175fdd69eef5858e91fad07c778444d56c25fc7b89f0d76a9b4c9b0415ae7 |