An implementation of tf.data.Dataset for AWS Athena
TensorFlow Data for AWS Athena
An AWS Athena library for tensorflow.data.Dataset. If you don't know tf.data, take a look at the documentation and this example.
How to use
Usage is almost as simple as any other tf.data.Dataset implementation. You just need to create a dataset using the function create_athena_dataset. No credentials are passed explicitly (it follows the AWS authentication chain in boto3).
# imports
from tf_data_athena import create_athena_dataset
# connector parameters
s3_output_location = "s3://my-bucket/my-folder/athena-outputs" # Athena output bucket folder
waiting_interval = 0.1 # Time (in seconds) to wait before asking for query state
# query
query = "select * from my_namespace.my_table"
# create dataset
dataset = create_athena_dataset(query, s3_output_location, waiting_interval=waiting_interval)
Now, dataset is an instance of tf.data.Dataset containing the query results.
Parameters
The factory function create_athena_dataset has the following parameters:
- query: The query to run in Athena
- s3_output_location: An S3 path with write access for the current account where the query results file will be saved
- waiting_interval: A float representing the number of seconds to wait before asking for the query status on Athena
- num_parallel_calls: Argument for tf.data.Dataset.map (see documentation) while parsing result rows
- other named arguments: Any other named argument is passed to the tf.data.TextLineDataset constructor; please see its documentation.
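To illustrate what waiting_interval controls, here is a minimal sketch of a status-polling loop. It is not taken from this library's source: wait_for_query and its get_state argument are hypothetical names used only for illustration. With boto3, get_state could wrap the Athena client's get_query_execution call and read response["QueryExecution"]["Status"]["State"].

```python
import time
from typing import Callable

# Terminal states Athena reports for a query execution.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(get_state: Callable[[], str],
                   waiting_interval: float = 0.1,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_state() until it returns a terminal state, then return it.

    get_state is a hypothetical callable supplied by the caller; it is
    an assumption of this sketch, not part of tf_data_athena.
    """
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        # The query is still QUEUED or RUNNING: wait before asking again.
        sleep(waiting_interval)


# Stand-in for a real Athena client: yields a fixed sequence of states.
fake_states = iter(["QUEUED", "RUNNING", "SUCCEEDED"])
final_state = wait_for_query(lambda: next(fake_states), waiting_interval=0.01)
print(final_state)
```

A smaller waiting_interval notices a finished query sooner but sends more status requests to Athena; a larger one does the opposite.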
AWS Authorization
This library uses boto3 behind the scenes, so it follows the same authentication/authorization chain.
The authorized user or service needs permission to create and execute Athena queries and to create and read S3 objects in the folder defined by s3_output_location.
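As a rough illustration of the permissions described above, the sketch below builds an IAM policy document for a given output location. The helper name build_policy and the exact action list are assumptions, not something this project documents. A real setup will typically also need read access to the queried data and to the data catalog, so treat this as a starting point only.

```python
import json
from urllib.parse import urlparse


def build_policy(s3_output_location: str) -> dict:
    """Build an illustrative IAM policy for an Athena output location.

    Hypothetical helper: the action list is an assumption and may need
    to be extended for a real account.
    """
    parsed = urlparse(s3_output_location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// location: {s3_output_location!r}")
    bucket = parsed.netloc
    prefix = parsed.path.strip("/")
    # Objects under the output folder (or the whole bucket if no folder).
    objects_arn = (f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix
                   else f"arn:aws:s3:::{bucket}/*")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Start queries and check their status.
                "Effect": "Allow",
                "Action": ["athena:StartQueryExecution",
                           "athena:GetQueryExecution"],
                "Resource": "*",
            },
            {   # Write and read result files in the output folder.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": objects_arn,
            },
            {   # Bucket-level lookups on the output bucket.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


print(json.dumps(build_policy("s3://my-bucket/my-folder/athena-outputs"), indent=2))
```

Scoping the S3 statements to the output folder keeps the grant narrow; the Athena statement uses "*" here only for brevity and could be restricted to a specific workgroup.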
File details
Details for the file tf-data-athena-1.0.1.tar.gz.
File metadata
- Download URL: tf-data-athena-1.0.1.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 500e09993a437fcd81486253380745e80953a14c7228431820dfc558c4a088e9
MD5 | de3f881446156b10700ce690e5d8d771
BLAKE2b-256 | c30075f619656927b6c0e9b8008f2a43b73b7e6e6b198e1052320f6d31381bf5
File details
Details for the file tf_data_athena-1.0.1-py3-none-any.whl.
File metadata
- Download URL: tf_data_athena-1.0.1-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.6.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | b3908cdb8b9a8e39c30f726e1fc618a9b14db85ec700b20e1ee43f5f543be417
MD5 | 120ef3ccab52a0916f9633ed57dab1c2
BLAKE2b-256 | 272b29bfe9e64943591fe55c53e44ca46a48217ff88d859c1edcb2150b29fd10