
Codenize your data sources

Project description


akagi

  • Free software: MIT license

Features

akagi lets you access various data sources, such as Amazon Redshift, Amazon S3 and Google Spreadsheet (with more planned), from Python.

Installation

Install via pip:

pip install akagi

or from source:

$ git clone https://github.com/ayemos/akagi akagi
$ cd akagi
$ python setup.py install

Setup

To use RedshiftDataSource, set the environment variable AKAGI_UNLOAD_BUCKET to the name of the Amazon S3 bucket you want to use as intermediate storage for the Redshift UNLOAD command.

$ export AKAGI_UNLOAD_BUCKET=xyz-unload-bucket.ap-northeast-1

To use SpreadsheetDataSource, set the environment variable GOOGLE_APPLICATION_CREDENTIAL to the path of your service account credentials file. You can get the credential from here.

The associated service account must have read access to the sheets.

$ export GOOGLE_APPLICATION_CREDENTIAL=$HOME/.credentials/service-1a2b.json
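
Before building a data source, it can help to confirm that the variables described above are actually set. The following is a minimal sketch that uses only the standard library; the helper name missing_env_vars and the REQUIRED_ENV_VARS mapping are hypothetical and not part of akagi, while the variable names are copied from this section.

```python
import os

# Variable names are copied from the Setup section; the mapping itself is
# an illustrative assumption, not something akagi exposes.
REQUIRED_ENV_VARS = {
    "RedshiftDataSource": "AKAGI_UNLOAD_BUCKET",
    "SpreadsheetDataSource": "GOOGLE_APPLICATION_CREDENTIAL",
}


def missing_env_vars(environ=None):
    """Return {data source name: variable name} for each unset or empty variable."""
    environ = os.environ if environ is None else environ
    return {
        source: var
        for source, var in REQUIRED_ENV_VARS.items()
        if not environ.get(var)
    }


if __name__ == "__main__":
    # Report what is missing in the current shell environment.
    for source, var in sorted(missing_env_vars().items()):
        print("{} needs {} to be set".format(source, var))
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.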

Example

RedshiftDataSource

from akagi.data_sources import RedshiftDataSource

ds = RedshiftDataSource('select user_id, path from logs.imp limit 10000')

for d in ds:
    print(d) # iterate on result

S3DataSource

from akagi.data_sources import S3DataSource

ds = S3DataSource.for_prefix(
        'image-data.ap-northeast-1',
        'data/image_net/zebra',
        file_format='binary')

for d in ds:
    print(d) # iterate on result

SpreadsheetDataSource

from akagi.data_sources import SpreadsheetDataSource

ds = SpreadsheetDataSource(
      '1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms',  # sample sheet provided by Google
      sheet_range='Class Data!A2:F31')

for d in ds:
    print(d) # iterate on result

LocalDataSource

from akagi.data_sources import LocalDataSource

ds = LocalDataSource(
      './PATH/TO/YOUR/DATA/DIR',
      file_format='csv')

for d in ds:
    print(d) # iterate on result
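
Every example above consumes the data source with a plain for loop, so standard iterator tools should apply as well. The sketch below previews the first few records with itertools.islice. It assumes only that a data source is an ordinary Python iterable, as the loops above suggest; the preview helper and the stand-in generator are hypothetical and not part of akagi. Whether akagi fetches data lazily is not stated here, so this only stops iteration early and makes no claim about how much data is transferred.

```python
from itertools import islice


def preview(data_source, n=5):
    """Return at most the first n records of any iterable as a list."""
    return list(islice(data_source, n))


# Stand-in for a real data source so the sketch runs without akagi installed.
fake_ds = ({"row": i} for i in range(1000))

for record in preview(fake_ds, n=3):
    print(record)
```

With a real data source, the same call would be preview(ds, n=3).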

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

