An implementation of tf.data.Dataset for AWS Athena
Tensorflow Data for AWS Athena
An AWS Athena library for tensorflow.data.Dataset. If you don't know tf.data, take a look at the documentation and this example.
How to use
Usage is almost as simple as any other tf.data.Dataset implementation. You just need to create a dataset using the function create_athena_dataset; authentication follows the AWS authentication chain in boto3.
# imports
from tf_data_athena import create_athena_dataset
# connector parameters
s3_output_location = "s3://my-bucket/my-folder/athena-outputs" # Athena output bucket folder
waiting_interval = 0.1 # Time (in seconds) to wait before asking for query state
# query
query = "select * from my_namespace.my_table"
# create dataset
dataset = create_athena_dataset(query, s3_output_location, waiting_interval=waiting_interval)
Now, dataset is an instance of tf.data.Dataset containing the query results.
Parameters
The factory function create_athena_dataset has the following parameters:
- query: The query to run in Athena.
- s3_output_location: An S3 path, writable by the current account, where the query results file will be saved.
- waiting_interval: A float giving the number of seconds to wait between checks of the query status in Athena.
- num_parallel_calls: Argument for tf.data.Dataset.map (see documentation), used while parsing result rows.
- other named arguments: Any other named argument is passed to the tf.data.TextLineDataset constructor; please see its documentation.
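To illustrate what waiting_interval controls, here is a minimal sketch of a polling loop. It is not the library's actual code: wait_for_query, get_state and the injectable sleep argument are hypothetical names introduced for this example. The state names are the ones Athena reports through boto3's get_query_execution call.

```python
import time
from typing import Callable

# Terminal states reported by Athena for a query execution.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(
    get_state: Callable[[], str],
    waiting_interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns a terminal state, then return it.

    get_state is a hypothetical callable (an assumption of this sketch).
    With boto3 it could be written as:
        client = boto3.client("athena")
        get_state = lambda: client.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]["Status"]["State"]
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        # Wait before asking for the query state again.
        sleep(waiting_interval)
```

A smaller waiting_interval notices a finished query sooner but issues more status requests; a larger one does the opposite.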
AWS Authorization
This library uses boto3 behind the scenes, so it follows the same authentication/authorization chain.
The authorized user or service needs permission to create and execute Athena queries, and to create and read S3 objects in the folder defined by s3_output_location.
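As a rough illustration of those permissions, the sketch below builds an IAM policy document from an s3_output_location value. The helper name build_policy and the exact list of actions are assumptions made for this example, not something the library documents. Real setups often need more, for instance s3:GetBucketLocation, s3:ListBucket, Glue catalog access, and read access to the data the query scans.

```python
import json
from urllib.parse import urlparse


def build_policy(s3_output_location: str) -> dict:
    """Build a minimal IAM policy document (hypothetical helper, assumed actions)."""
    parsed = urlparse(s3_output_location)
    bucket = parsed.netloc
    prefix = parsed.path.strip("/")
    # Objects under the output folder, or the whole bucket if no folder is given.
    if prefix:
        object_arn = f"arn:aws:s3:::{bucket}/{prefix}/*"
    else:
        object_arn = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Create and execute Athena queries, and check their state.
                "Effect": "Allow",
                "Action": [
                    "athena:StartQueryExecution",
                    "athena:GetQueryExecution",
                ],
                "Resource": "*",
            },
            {
                # Create and read result objects in the output folder.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": object_arn,
            },
        ],
    }


if __name__ == "__main__":
    policy = build_policy("s3://my-bucket/my-folder/athena-outputs")
    print(json.dumps(policy, indent=2))
```

Treat the result as a starting point to review against your own account setup, not as a complete policy.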
Hashes for tf_data_athena-1.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b3908cdb8b9a8e39c30f726e1fc618a9b14db85ec700b20e1ee43f5f543be417
MD5 | 120ef3ccab52a0916f9633ed57dab1c2
BLAKE2b-256 | 272b29bfe9e64943591fe55c53e44ca46a48217ff88d859c1edcb2150b29fd10