Azure Blob Storage Backend for Dask

Dask Azure Blob FileSystem

Features

  • Lets Dask read data files stored in Azure Blob Storage.
    • Import DaskAzureBlobFileSystem.
    • Use abfs:// as the protocol prefix and you are good to go.
  • For authentication, see Usage.
  • Provides key-value storage backed by Azure Storage: create an instance of AzureBlobMap.

Usage

Make the right imports:

from azureblobfs.dask import DaskAzureBlobFileSystem
import dask.dataframe as dd

Then put your data files in an Azure storage container, say clippy, and read them:

data = dd.read_csv("abfs://noaa/clippy/weather*.csv")
max_by_state = data.groupby("states").max().compute()

Set your Azure account name in the environment variable AZURE_BLOB_ACCOUNT_NAME (noaa in the example above) and the account key in AZURE_BLOB_ACCOUNT_KEY.
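To make the URL layout concrete, here is a minimal standard-library sketch that splits the example URL into account, container, and blob pattern. The helper split_abfs_url and the blob names are hypothetical and only illustrate the structure described above; the package's own parsing and glob semantics may differ.

```python
import fnmatch
from urllib.parse import urlsplit


def split_abfs_url(url):
    """Split an abfs:// URL into (account, container, blob_pattern).

    Hypothetical helper for illustration; not part of dask-azureblobfs.
    """
    parts = urlsplit(url)
    if parts.scheme != "abfs":
        raise ValueError(f"expected an abfs:// URL, got {url!r}")
    # The first path segment is the container; the rest is the blob name or glob.
    container, _, blob_pattern = parts.path.lstrip("/").partition("/")
    return parts.netloc, container, blob_pattern


account, container, pattern = split_abfs_url("abfs://noaa/clippy/weather*.csv")
print(account, container, pattern)

# Made-up blob names, used only to show what the glob would select.
blobs = ["weather2017.csv", "weather2018.csv", "readme.txt"]
matching = [name for name in blobs if fnmatch.fnmatch(name, pattern)]
print(matching)
```

In this reading, the account part of the URL should agree with the value of AZURE_BLOB_ACCOUNT_NAME.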

If you would rather use a shared access signature (SAS) than the account key, set the token in the environment variable AZURE_BLOB_SAS_TOKEN and the connection string in AZURE_BLOB_CONNECTION_STRING.
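A pre-flight check can catch a missing variable before Dask tries to open a file. The sketch below uses only the variable names mentioned above; the grouping into two modes and the helper names are assumptions of this example, and it does not say which mode the package prefers when both are configured.

```python
import os

# Variable names come from the Usage text; the mode labels are made up here.
AUTH_MODES = {
    "account_key": ("AZURE_BLOB_ACCOUNT_NAME", "AZURE_BLOB_ACCOUNT_KEY"),
    "sas": ("AZURE_BLOB_SAS_TOKEN", "AZURE_BLOB_CONNECTION_STRING"),
}


def configured_auth_modes(env=None):
    """Return the modes whose variables are all set and non-empty."""
    env = os.environ if env is None else env
    return [
        mode
        for mode, names in AUTH_MODES.items()
        if all(env.get(name) for name in names)
    ]


def missing_variables(mode, env=None):
    """Return the variables still needed for the given mode."""
    env = os.environ if env is None else env
    return [name for name in AUTH_MODES[mode] if not env.get(name)]


if __name__ == "__main__":
    modes = configured_auth_modes()
    if modes:
        print("Configured authentication:", ", ".join(modes))
    else:
        for mode in AUTH_MODES:
            print(f"{mode}: missing {missing_variables(mode)}")
```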

Installation

Just:

pip install dask-azureblobfs

or get the development version if you love to live dangerously:

pip install git+https://github.com/manish/dask-azureblobfs@master#egg=dask-azureblobfs
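After either install, a quick way to confirm the package is visible to your interpreter is to look up its top-level module. This sketch assumes the module is named azureblobfs, as in the import shown under Usage; is_installed is a hypothetical helper.

```python
import importlib.util


def is_installed(module_name="azureblobfs"):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


print("azureblobfs importable:", is_installed())
```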

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.1.0 (2018-11-18)

  • First release on PyPI.
