Download files in batches from Azure Blob Storage Containers
Azure Batch Load
A high-level Python wrapper around the Azure CLI for downloading or uploading files in batches from or to Azure Blob Storage containers. This project aims to fill a gap in the Azure Storage Python SDK, which offers no way to download or upload batches of files from or to containers. The only option in the SDK is to transfer files one by one, which takes a lot of time.
Besides batch loads, since version 0.5.0 it is possible to set method to single, which uses the Azure Python SDK to process files one by one.
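To make the idea of "wrapping the Azure CLI" concrete, here is a minimal sketch of how a batch download could be turned into an `az storage blob download-batch` call. The helper name `build_download_batch_command` is hypothetical and is not part of azurebatchload; the sketch only assembles the argument list and does not run it, and the package's real internals may differ.

```python
from typing import List, Optional


def build_download_batch_command(
    source: str,
    destination: str,
    pattern: Optional[str] = None,
    connection_string: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not execute) an Azure CLI batch download command.

    Hypothetical helper for illustration; not the package's own code.
    """
    command = [
        "az", "storage", "blob", "download-batch",
        "--source", source,            # container name
        "--destination", destination,  # local directory
    ]
    if pattern is not None:
        command += ["--pattern", pattern]
    if connection_string is not None:
        command += ["--connection-string", connection_string]
    return command


command = build_download_batch_command(
    source="blobcontainername",
    destination="../pdfs",
    pattern="*.pdf",
)
print(" ".join(command))
```

The resulting list could be passed to `subprocess.run`, which avoids shell quoting issues for values such as connection strings.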
Installation
pip install azurebatchload
See PyPI for the package index.
Note: the Azure CLI has to be installed and configured. To check whether the Azure CLI is installed, run the following in a terminal:
az --version
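The same check can be done from Python before starting a batch job. This is a small sketch using only the standard library; the function name `azure_cli_path` is made up for this example and is not provided by azurebatchload.

```python
import shutil
from typing import Optional


def azure_cli_path() -> Optional[str]:
    """Return the full path of the `az` executable, or None if it is not on PATH."""
    return shutil.which("az")


path = azure_cli_path()
if path is None:
    print("Azure CLI not found on PATH; install it before using batch mode.")
else:
    print(f"Azure CLI found at {path}")
```

Note that this only confirms the executable is present; it does not verify that the CLI is configured with valid credentials.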
Usage example download:
1. Using the standard environment variables
Azure-batch-load automatically checks for the environment variables AZURE_STORAGE_CONNECTION_STRING, AZURE_STORAGE_KEY and AZURE_STORAGE_ACCOUNT.
So if the connection string, or the storage key and storage account, are set as environment variables, we can leave the arguments connection_string, account_key and account_name empty:
from azurebatchload import DownloadBatch

DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    extension='.pdf'
).download()
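The fallback described above (explicit argument first, standard environment variable otherwise) can be sketched as follows. The function `resolve_credentials` is a hypothetical illustration of that lookup order, not the package's actual implementation; only the three environment variable names come from the text above.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_credentials(
    connection_string: Optional[str] = None,
    account_key: Optional[str] = None,
    account_name: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, Optional[str]]:
    """Prefer explicit arguments; fall back to the standard environment variables.

    Hypothetical helper: assumes an empty or missing argument means "use the environment".
    """
    return {
        "connection_string": connection_string
        or environ.get("AZURE_STORAGE_CONNECTION_STRING"),
        "account_key": account_key or environ.get("AZURE_STORAGE_KEY"),
        "account_name": account_name or environ.get("AZURE_STORAGE_ACCOUNT"),
    }


# Example with a fake environment so no real secrets are needed.
fake_env = {"AZURE_STORAGE_KEY": "key-from-env", "AZURE_STORAGE_ACCOUNT": "acct-from-env"}
credentials = resolve_credentials(account_name="explicit-acct", environ=fake_env)
print(credentials)
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.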
2. Using method="single"
We can skip the Azure CLI and use only the Python SDK by setting method="single":
from azurebatchload import DownloadBatch

DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    extension='.pdf',
    method='single'
).download()
3. Download a specific folder from a container
We can download a folder by setting the folder argument. This works for both single and batch.
from azurebatchload import DownloadBatch

DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    folder='uploads/invoices/',
    extension='.pdf',
    method='single'
).download()
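Blob Storage containers are flat, and "folders" are simply name prefixes separated by `/`. The sketch below shows how a folder and extension filter over blob names could work under that assumption. The helper `select_blobs` and the sample names are hypothetical and only illustrate the selection logic, not how azurebatchload implements it.

```python
from typing import Iterable, List


def select_blobs(blob_names: Iterable[str], folder: str = "", extension: str = "") -> List[str]:
    """Keep blob names that sit under `folder` and end with `extension`.

    Hypothetical helper; an empty folder or extension matches everything.
    """
    # Normalise the folder to end with "/" so "uploads/inv" does not match "uploads/invoices/".
    prefix = folder if not folder or folder.endswith("/") else folder + "/"
    return [
        name for name in blob_names
        if name.startswith(prefix) and name.endswith(extension)
    ]


blob_names = [
    "uploads/invoices/a.pdf",
    "uploads/invoices/b.txt",
    "uploads/other/c.pdf",
    "d.pdf",
]
selected = select_blobs(blob_names, folder="uploads/invoices/", extension=".pdf")
print(selected)
```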
4. Using own environment variables
If we use other names for the environment variables, we can pass the arguments connection_string, account_key and account_name explicitly in our call:
import os
from azurebatchload import DownloadBatch

DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    connection_string=os.environ.get("connection_string"),
    extension='.pdf'
).download()
Or with key and name:
import os
from azurebatchload import DownloadBatch

DownloadBatch(
    destination='../pdfs',
    source='blobcontainername',
    account_key=os.environ.get("account_key"),
    account_name=os.environ.get("account_name"),
    extension='.pdf'
).download()
Usage example upload:
1. Using the standard environment variables
from azurebatchload import UploadBatch

UploadBatch(
    destination='blobcontainername',
    source='../pdf',
    pattern='*.pdf'
).upload()
For more information about file pattern matching in the pattern argument, see the Python documentation.
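As a quick illustration, assuming the pattern argument follows the Unix shell-style wildcards of Python's `fnmatch` module (`*`, `?`, `[seq]`, `[!seq]`), a pattern can be tried out locally before uploading. The file names below are invented for the example.

```python
from fnmatch import fnmatchcase

# Invented file names, used only to show which ones a pattern would select.
file_names = ["invoice_01.pdf", "invoice_02.PDF", "notes.txt", "report.pdf"]

# fnmatchcase is case-sensitive on every platform, which keeps the result predictable.
all_pdfs = [name for name in file_names if fnmatchcase(name, "*.pdf")]
numbered_invoices = [name for name in file_names if fnmatchcase(name, "invoice_??.pdf")]

print(all_pdfs)
print(numbered_invoices)
```

Whether matching is case-sensitive in an actual upload depends on the tool doing the matching, so it is worth testing with a small directory first.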