ZAPR AWS Athena Client

ZAPR AWS Athena Client is a Python library for running Presto queries on AWS Athena.

At Zapr we have the largest repository of offline media consumption data, and we use it to answer some of the hardest questions that brands, broadcasters and marketers ask. To make this happen we have to churn through terabytes of data in a somewhat interactive manner, and AWS Athena helps the team achieve this. We use this client as middleware to submit queries to Athena, because Athena has a few shortcomings that we have tried to address through this client:

1. Multiple queries cannot be submitted at a time.
2. Insert overwrite is not supported.
3. Dropping a table does not delete the data; only the schema is dropped.

Another benefit of this client is that we can easily integrate Athena with our existing data pipelines built on Oozie and Airflow.

Supported Features

  • Submit multiple queries from a single file.
  • Insert overwrite.
  • Drop table (drops the table and deletes the data as well).
  • Submit queries through an AWS Athena workgroup, so that the cost of each query can be tracked.
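
As an illustration of the first and last features above, here is a minimal sketch of how a query file could be split into statements and each one submitted under a workgroup. It is not the client's actual implementation. The function names are hypothetical, the split on `;` is deliberately naive, and `athena` is assumed to be a boto3 Athena client (for example `boto3.client("athena")`) whose workgroup already has a query result location configured.

```python
import time


def split_queries(sql_text):
    """Split a SQL script into individual statements on ';'.

    Naive on purpose: semicolons inside string literals or comments
    are not handled.
    """
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def run_query(athena, query, workgroup, poll_seconds=2):
    """Submit one query under a workgroup and block until it finishes."""
    response = athena.start_query_execution(QueryString=query, WorkGroup=workgroup)
    query_id = response["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(poll_seconds)


def run_query_file(athena, sql_text, workgroup):
    """Run every statement in order, stopping at the first one that does not succeed."""
    states = []
    for query in split_queries(sql_text):
        state = run_query(athena, query, workgroup)
        states.append(state)
        if state != "SUCCEEDED":
            break
    return states
```

Running the statements one after another keeps later queries from executing against tables that an earlier, failed statement was supposed to create.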

Quick Start

Prerequisites

  • boto3
  • configparser

Usage

Syntax

python athena_client.py config_file_location workgroup_name query_file_location input_macro1 input_macro2 ...
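
To make the argument layout concrete, here is a minimal sketch of how such a command line could be parsed: three positional arguments followed by any number of `name=value` macros. The function name `parse_args` and the error messages are hypothetical, and the real `athena_client.py` may parse its arguments differently.

```python
def parse_args(argv):
    """Split the three positional arguments from the trailing name=value macros."""
    if len(argv) < 3:
        raise ValueError("expected: config_file workgroup query_file [macro=value ...]")
    config_file, workgroup, query_file = argv[:3]
    macros = {}
    for item in argv[3:]:
        # partition keeps any further '=' characters inside the value
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError("macro must look like name=value: %r" % item)
        macros[name] = value
    return config_file, workgroup, query_file, macros
```

In a script this would typically be called as `parse_args(sys.argv[1:])`.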

Install dependencies

pip install -r requirements.txt

Example - 1

python athena_client.py config.ini workgroup_testing_team sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25

Example - 2

python athena_client.py s3://sampe-bucket/sample-prefix/project-1/config.ini workgroup_testing_team s3://sampe-bucket/sample-prefix/project-1/sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25
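
Example 2 passes `s3://` locations where Example 1 passes local paths. The following sketch shows one way a program could tell the two apart and read either kind of file. It assumes Python 3, the helper names are hypothetical, and `s3_client` is assumed to be a boto3 S3 client such as `boto3.client("s3")`. How the client itself resolves these locations is not described here.

```python
from urllib.parse import urlparse


def split_s3_uri(location):
    """Return (bucket, key) for an s3:// URI, or None for a local path."""
    parsed = urlparse(location)
    if parsed.scheme != "s3":
        return None
    return parsed.netloc, parsed.path.lstrip("/")


def read_text(location, s3_client=None):
    """Read a config or query file from S3 or from the local filesystem."""
    target = split_s3_uri(location)
    if target is None:
        with open(location) as handle:
            return handle.read()
    bucket, key = target
    # get_object returns a streaming body that must be read and decoded
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"]
    return body.read().decode("utf-8")
```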

Via PIP

pip install zapr-athena-client
zapr-athena-client config.ini workgroup_testing_team sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25

Sample Query

create table sample_db.${table_prefix}_username
WITH (external_location = 's3://sample_db/${table_prefix}_username/',format = 'ORC') as
    select username
    from raw_db.users
    where date between '${start_date}' and '${end_date}';
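
The `${name}` placeholders in the sample query use the same syntax as Python's `string.Template`, so the substitution can be sketched as below. This only illustrates the idea; whether the client uses `string.Template` internally is an assumption, and `render_query` is a hypothetical name. The sketch also does not cover where `table_prefix` comes from, since the examples above pass only `start_date` and `end_date`.

```python
from string import Template


def render_query(query_text, macros):
    """Replace ${name} placeholders; raises KeyError if a macro is missing."""
    return Template(query_text).substitute(macros)


query = (
    "select username from raw_db.users "
    "where date between '${start_date}' and '${end_date}'"
)
rendered = render_query(query, {"start_date": "2020-09-25", "end_date": "2020-09-25"})
print(rendered)
# select username from raw_db.users where date between '2020-09-25' and '2020-09-25'
```

Using `substitute` rather than `safe_substitute` makes a missing macro fail loudly instead of sending a query with an unresolved placeholder to Athena.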

Disable Insert Overwrite and drop data

By default, this client supports insert overwrite and deletes the underlying data when a drop table query is executed. To disable these features, add the following configurations:

ENABLE_INSERT_OVERWRITE = False
ENABLE_EXTERNAL_TABLE_DROP = False
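
For illustration, flags like these can be read with `configparser` (listed under the prerequisites) as sketched below, assuming Python 3. The section name `[athena]` and the function `load_flags` are hypothetical; the section the client actually expects is not stated here. The fallback of `True` mirrors the statement that both features are enabled by default.

```python
import configparser

# Hypothetical config contents; the [athena] section name is an assumption.
SAMPLE_CONFIG = """
[athena]
ENABLE_INSERT_OVERWRITE = False
ENABLE_EXTERNAL_TABLE_DROP = False
"""


def load_flags(config_text, section="athena"):
    """Return both feature flags, defaulting to True when a key is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        "insert_overwrite": parser.getboolean(
            section, "ENABLE_INSERT_OVERWRITE", fallback=True
        ),
        "external_table_drop": parser.getboolean(
            section, "ENABLE_EXTERNAL_TABLE_DROP", fallback=True
        ),
    }
```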

Contact

For feature requests or bug reports, please raise an issue in the issues section.

For anything else, get in touch with us at opensource@zapr.in.

This version

0.1

Download files

Source Distribution

zapr-athena-client-0.1.tar.gz (10.5 kB)

Uploaded Source

Built Distribution

zapr_athena_client-0.1-py2-none-any.whl (18.6 kB)

Uploaded Python 2

File details

Details for the file zapr-athena-client-0.1.tar.gz.

File metadata

  • Download URL: zapr-athena-client-0.1.tar.gz
  • Upload date:
  • Size: 10.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/2.7.16

File hashes

Hashes for zapr-athena-client-0.1.tar.gz
Algorithm Hash digest
SHA256 491520b3d7ca1d2ffa00b994aa5d4c0d44fe78ea9877345b5e28ec6202d774c8
MD5 b16378e98dbc48f9c5e84bcf13bc5205
BLAKE2b-256 ddff3557b95c3fc23bf46e220afad3722636c63c3481480bb9ff59493e3c3a23

File details

Details for the file zapr_athena_client-0.1-py2-none-any.whl.

File metadata

  • Download URL: zapr_athena_client-0.1-py2-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 2
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/44.1.1 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/2.7.16

File hashes

Hashes for zapr_athena_client-0.1-py2-none-any.whl
Algorithm Hash digest
SHA256 2fbde61d833949323e89510665394fcf22ad6f0a42d5f249468f6cd1c5277064
MD5 032e0d662800e0b1e64146184647dfe4
BLAKE2b-256 60eeca8bc62ce79c31a6227af95804ddcef860c11a247922b82c38cc7b9997aa
