Codenize your data sources
Project description
akagi
Free software: MIT license
Features
akagi provides a common interface for iterating over and saving data from various data sources, such as Amazon Redshift and Amazon S3 (more are planned).
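To make the "iterate and save" idea concrete, here is a minimal sketch of an object with that shape: it can be used in a with block, iterated over, and asked to save its rows. ListDataSource is a hypothetical stand-in written for illustration, not part of akagi, and writing the rows as CSV in save is an assumption rather than akagi's documented behaviour.

```python
import csv
import os
import tempfile


class ListDataSource:
    """Hypothetical stand-in (NOT an akagi class): wraps rows held in memory."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Nothing to release here; a real data source might clean up temp files.
        return False

    def __iter__(self):
        # Iterating the data source yields one row at a time.
        return iter(self._rows)

    def save(self, path):
        """Write all rows to a CSV file at `path` (assumed behaviour)."""
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(self._rows)
        return path


with ListDataSource([("u1", "/home"), ("u2", "/about")]) as ds:
    out = ds.save(os.path.join(tempfile.mkdtemp(), "rows.csv"))
    rows = [d for d in ds]
```

The data source classes in the examples further down are used the same way: obtain a data source, then iterate over it or call save on it.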
Installation
Install via pip:
pip install akagi
or from source:
$ git clone https://github.com/ayemos/akagi akagi
$ cd akagi
$ python setup.py install
Setup
When using RedshiftDataSource, set the environment variable AKAGI_UNLOAD_BUCKET to the name of the Amazon S3 bucket you want to use as intermediate storage for the Redshift UNLOAD command.
$ export AKAGI_UNLOAD_BUCKET=xyz-unload-bucket.ap-northeast-1
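Because a missing variable is easy to overlook, it can help to check for it before running a Redshift query. The following is a minimal sketch using only the standard library; require_unload_bucket is a hypothetical helper introduced here for illustration and is not part of akagi.

```python
import os


def require_unload_bucket(environ=os.environ):
    """Return the bucket name, or raise a readable error if it is unset.

    Hypothetical helper, not an akagi function.
    """
    bucket = environ.get("AKAGI_UNLOAD_BUCKET", "").strip()
    if not bucket:
        raise RuntimeError(
            "AKAGI_UNLOAD_BUCKET is not set; "
            "export it before using RedshiftDataSource"
        )
    return bucket


# Example with an explicit mapping instead of the real environment.
print(require_unload_bucket(
    {"AKAGI_UNLOAD_BUCKET": "xyz-unload-bucket.ap-northeast-1"}))
```

Calling require_unload_bucket() with no argument reads the real process environment.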
Example
MySQLDataSource
import os

from akagi.data_sources import MySQLDataSource

with MySQLDataSource.for_query(
        'select distinct a.user_id from articles a',  # Your query here
        # DB config (optional)
        {
            'host': '127.0.0.1',
            'user': 'analytics_readonly',
            'password': os.environ['DB_PASSWORD'],
            'db': 'main',
        }) as ds:
    for d in ds:
        print(d)  # iterate over the result
RedshiftDataSource
from akagi.data_sources import RedshiftDataSource
ds = RedshiftDataSource.for_query(
    'select * from (select user_id, path from logs.imp limit 10000)',  # Your query here
)
for d in ds:
    print(d)  # iterate over the result
S3DataSource
from akagi.data_sources import S3DataSource
ds = S3DataSource.for_prefix(
    'image-data.ap-northeast-1',
    'data/image_net/zebra',
    'binary')
...
LocalDataSource
from akagi.data_sources import LocalDataSource
with LocalDataSource.for_path(
        './PATH/TO/YOUR/DATA/DIR',
        'csv') as ds:
    ds.save('./akagi_test')  # save the results to a local path
    for d in ds:
        print(d)  # iterate over the result
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.