Microsoft Azure Blob Storage Client Library for Python
Project description
Azure Storage Blobs client library for Python
Azure Blob storage is Microsoft's object storage solution for the cloud. Blob storage is optimized for storing massive amounts of unstructured data, such as text or binary data.
Blob storage is ideal for:
- Serving images or documents directly to a browser
- Storing files for distributed access
- Streaming video and audio
- Storing data for backup and restore, disaster recovery, and archiving
- Storing data for analysis by an on-premises or Azure-hosted service
Source code | Package (PyPI) | Package (Conda) | API reference documentation | Product documentation | Samples
Getting started
Prerequisites
- Python 3.8 or later is required to use this package. For more details, please read our page on Azure SDK for Python version support policy.
- You must have an Azure subscription and an Azure storage account to use this package.
Install the package
Install the Azure Storage Blobs client library for Python with pip:
pip install azure-storage-blob
Create a storage account
If you wish to create a new storage account, you can use the Azure Portal, Azure PowerShell, or Azure CLI:
# Create a new resource group to hold the storage account -
# if using an existing resource group, skip this step
az group create --name my-resource-group --location westus2
# Create the storage account
az storage account create -n my-storage-account-name -g my-resource-group
Create the client
The Azure Storage Blobs client library for Python allows you to interact with three types of resources: the storage account itself, blob storage containers, and blobs. Interaction with these resources starts with an instance of a client. To create a client object, you will need the storage account's blob service account URL and a credential that allows you to access the storage account:
from azure.storage.blob import BlobServiceClient
service = BlobServiceClient(account_url="https://<my-storage-account-name>.blob.core.windows.net/", credential=credential)
Looking up the account URL
You can find the storage account's blob service URL using the Azure Portal, Azure PowerShell, or Azure CLI:
# Get the blob service account url for the storage account
az storage account show -n my-storage-account-name -g my-resource-group --query "primaryEndpoints.blob"
Types of credentials
The credential
parameter may be provided in a number of different forms, depending on the type of
authorization you wish to use:
-
To use an Azure Active Directory (AAD) token credential, provide an instance of the desired credential type obtained from the azure-identity library. For example, DefaultAzureCredential can be used to authenticate the client.
This requires some initial setup:
- Install azure-identity
- Register a new AAD application and give permissions to access Azure Storage
- Grant access to Azure Blob data with RBAC in the Azure Portal
- Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET
Use the returned token credential to authenticate the client:
from azure.identity import DefaultAzureCredential from azure.storage.blob import BlobServiceClient token_credential = DefaultAzureCredential() blob_service_client = BlobServiceClient( account_url="https://<my_account_name>.blob.core.windows.net", credential=token_credential )
-
To use a shared access signature (SAS) token, provide the token as a string. If your account URL includes the SAS token, omit the credential parameter. You can generate a SAS token from the Azure Portal under "Shared access signature" or use one of the
generate_sas()
functions to create a sas token for the storage account, container, or blob:from datetime import datetime, timedelta from azure.storage.blob import BlobServiceClient, generate_account_sas, ResourceTypes, AccountSasPermissions sas_token = generate_account_sas( account_name="<storage-account-name>", account_key="<account-access-key>", resource_types=ResourceTypes(service=True), permission=AccountSasPermissions(read=True), expiry=datetime.utcnow() + timedelta(hours=1) ) blob_service_client = BlobServiceClient(account_url="https://<my_account_name>.blob.core.windows.net", credential=sas_token)
-
To use a storage account shared key (aka account key or access key), provide the key as a string. This can be found in the Azure Portal under the "Access Keys" section or by running the following Azure CLI command:
az storage account keys list -g MyResourceGroup -n MyStorageAccount
Use the key as the credential parameter to authenticate the client:
from azure.storage.blob import BlobServiceClient service = BlobServiceClient(account_url="https://<my_account_name>.blob.core.windows.net", credential="<account_access_key>")
If you are using customized url (which means the url is not in this format
<my_account_name>.blob.core.windows.net
), please instantiate the client using the credential below:from azure.storage.blob import BlobServiceClient service = BlobServiceClient(account_url="https://<my_account_name>.blob.core.windows.net", credential={"account_name": "<your_account_name>", "account_key":"<account_access_key>"})
-
To use anonymous public read access, simply omit the credential parameter.
Creating the client from a connection string
Depending on your use case and authorization method, you may prefer to initialize a client instance with a storage
connection string instead of providing the account URL and credential separately. To do this, pass the storage
connection string to the client's from_connection_string
class method:
from azure.storage.blob import BlobServiceClient
connection_string = "DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=xxxx;EndpointSuffix=core.windows.net"
service = BlobServiceClient.from_connection_string(conn_str=connection_string)
The connection string to your storage account can be found in the Azure Portal under the "Access Keys" section or by running the following CLI command:
az storage account show-connection-string -g MyResourceGroup -n MyStorageAccount
Key concepts
The following components make up the Azure Blob Service:
- The storage account itself
- A container within the storage account
- A blob within a container
The Azure Storage Blobs client library for Python allows you to interact with each of these components through the use of a dedicated client object.
Clients
Four different clients are provided to interact with the various components of the Blob Service:
- BlobServiceClient -
this client represents interaction with the Azure storage account itself, and allows you to acquire preconfigured
client instances to access the containers and blobs within. It provides operations to retrieve and configure the
account properties as well as list, create, and delete containers within the account. To perform operations on a
specific container or blob, retrieve a client using the
get_container_client
orget_blob_client
methods. - ContainerClient -
this client represents interaction with a specific container (which need not exist yet), and allows you to acquire
preconfigured client instances to access the blobs within. It provides operations to create, delete, or configure a
container and includes operations to list, upload, and delete the blobs within it. To perform operations on a
specific blob within the container, retrieve a client using the
get_blob_client
method. - BlobClient - this client represents interaction with a specific blob (which need not exist yet). It provides operations to upload, download, delete, and create snapshots of a blob, as well as specific operations per blob type.
- BlobLeaseClient -
this client represents lease interactions with a
ContainerClient
orBlobClient
. It provides operations to acquire, renew, release, change, and break a lease on a specified resource.
Async Clients
This library includes a complete async API supported on Python 3.5+. To use it, you must first install an async transport, such as aiohttp. See azure-core documentation for more information.
Async clients and credentials should be closed when they're no longer needed. These
objects are async context managers and define async close
methods.
Blob Types
Once you've initialized a Client, you can choose from the different types of blobs:
- Block blobs store text and binary data, up to approximately 4.75 TiB. Block blobs are made up of blocks of data that can be managed individually
- Append blobs are made up of blocks like block blobs, but are optimized for append operations. Append blobs are ideal for scenarios such as logging data from virtual machines
- Page blobs store random access files up to 8 TiB in size. Page blobs store virtual hard drive (VHD) files and serve as disks for Azure virtual machines
Examples
The following sections provide several code snippets covering some of the most common Storage Blob tasks, including:
Note that a container must be created before to upload or download a blob.
Create a container
Create a container from where you can upload or download blobs.
from azure.storage.blob import ContainerClient
container_client = ContainerClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer")
container_client.create_container()
Use the async client to create a container
from azure.storage.blob.aio import ContainerClient
container_client = ContainerClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer")
await container_client.create_container()
Uploading a blob
Upload a blob to your container
from azure.storage.blob import BlobClient
blob = BlobClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer", blob_name="my_blob")
with open("./SampleSource.txt", "rb") as data:
blob.upload_blob(data)
Use the async client to upload a blob
from azure.storage.blob.aio import BlobClient
blob = BlobClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer", blob_name="my_blob")
with open("./SampleSource.txt", "rb") as data:
await blob.upload_blob(data)
Downloading a blob
Download a blob from your container
from azure.storage.blob import BlobClient
blob = BlobClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer", blob_name="my_blob")
with open("./BlockDestination.txt", "wb") as my_blob:
blob_data = blob.download_blob()
blob_data.readinto(my_blob)
Download a blob asynchronously
from azure.storage.blob.aio import BlobClient
blob = BlobClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer", blob_name="my_blob")
with open("./BlockDestination.txt", "wb") as my_blob:
stream = await blob.download_blob()
data = await stream.readall()
my_blob.write(data)
Enumerating blobs
List the blobs in your container
from azure.storage.blob import ContainerClient
container = ContainerClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer")
blob_list = container.list_blobs()
for blob in blob_list:
print(blob.name + '\n')
List the blobs asynchronously
from azure.storage.blob.aio import ContainerClient
container = ContainerClient.from_connection_string(conn_str="<connection_string>", container_name="mycontainer")
blob_list = []
async for blob in container.list_blobs():
blob_list.append(blob)
print(blob_list)
Optional Configuration
Optional keyword arguments that can be passed in at the client and per-operation level.
Retry Policy configuration
Use the following keyword arguments when instantiating a client to configure the retry policy:
- retry_total (int): Total number of retries to allow. Takes precedence over other counts.
Pass in
retry_total=0
if you do not want to retry on requests. Defaults to 10. - retry_connect (int): How many connection-related errors to retry on. Defaults to 3.
- retry_read (int): How many times to retry on read errors. Defaults to 3.
- retry_status (int): How many times to retry on bad status codes. Defaults to 3.
- retry_to_secondary (bool): Whether the request should be retried to secondary, if able.
This should only be enabled of RA-GRS accounts are used and potentially stale data can be handled.
Defaults to
False
.
Encryption configuration
Use the following keyword arguments when instantiating a client to configure encryption:
- require_encryption (bool): If set to True, will enforce that objects are encrypted and decrypt them.
- encryption_version (str): Specifies the version of encryption to use. Current options are
'2.0'
or'1.0'
and the default value is'1.0'
. Version 1.0 is deprecated, and it is highly recommended to use version 2.0. - key_encryption_key (object): The user-provided key-encryption-key. The instance must implement the following methods:
wrap_key(key)
--wraps the specified key using an algorithm of the user's choice.get_key_wrap_algorithm()
--returns the algorithm used to wrap the specified symmetric key.get_kid()
--returns a string key id for this key-encryption-key.
- key_resolver_function (callable): The user-provided key resolver. Uses the kid string to return a key-encryption-key implementing the interface defined above.
Other client / per-operation configuration
Other optional configuration keyword arguments that can be specified on the client or per-operation.
Client keyword arguments:
- connection_timeout (int): The number of seconds the client will wait to establish a connection to the server. Defaults to 20 seconds.
- read_timeout (int): The number of seconds the client will wait, between consecutive read operations, for a response from the server. This is a socket level timeout and is not affected by overall data size. Client-side read timeouts will be automatically retried. Defaults to 60 seconds.
- transport (Any): User-provided transport to send the HTTP request.
Per-operation keyword arguments:
- raw_response_hook (callable): The given callback uses the response returned from the service.
- raw_request_hook (callable): The given callback uses the request before being sent to service.
- client_request_id (str): Optional user specified identification of the request.
- user_agent (str): Appends the custom value to the user-agent header to be sent with the request.
- logging_enable (bool): Enables logging at the DEBUG level. Defaults to False. Can also be passed in at the client level to enable it for all requests.
- logging_body (bool): Enables logging the request and response body. Defaults to False. Can also be passed in at the client level to enable it for all requests.
- headers (dict): Pass in custom headers as key, value pairs. E.g.
headers={'CustomValue': value}
Troubleshooting
General
Storage Blob clients raise exceptions defined in Azure Core.
This list can be used for reference to catch thrown exceptions. To get the specific error code of the exception, use the error_code
attribute, i.e, exception.error_code
.
Logging
This library uses the standard logging library for logging. Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO level.
Detailed DEBUG level logging, including request/response bodies and unredacted
headers, can be enabled on a client with the logging_enable
argument:
import sys
import logging
from azure.storage.blob import BlobServiceClient
# Create a logger for the 'azure.storage.blob' SDK
logger = logging.getLogger('azure.storage.blob')
logger.setLevel(logging.DEBUG)
# Configure a console output
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)
# This client will log detailed information about its HTTP sessions, at DEBUG level
service_client = BlobServiceClient.from_connection_string("your_connection_string", logging_enable=True)
Similarly, logging_enable
can enable detailed logging for a single operation,
even when it isn't enabled for the client:
service_client.get_service_stats(logging_enable=True)
Next steps
More sample code
Get started with our Blob samples.
Several Storage Blobs Python SDK samples are available to you in the SDK's GitHub repository. These samples provide example code for additional scenarios commonly encountered while working with Storage Blobs:
-
blob_samples_container_access_policy.py (async version) - Examples to set Access policies:
- Set up Access Policy for container
-
blob_samples_hello_world.py (async version) - Examples for common Storage Blob tasks:
- Set up a container
- Create a block, page, or append blob
- Upload blobs
- Download blobs
- Delete blobs
-
blob_samples_authentication.py (async version) - Examples for authenticating and creating the client:
- From a connection string
- From a shared access key
- From a shared access signature token
- From active directory
-
blob_samples_service.py (async version) - Examples for interacting with the blob service:
- Get account information
- Get and set service properties
- Get service statistics
- Create, list, and delete containers
- Get the Blob or Container client
-
blob_samples_containers.py (async version) - Examples for interacting with containers:
- Create a container and delete containers
- Set metadata on containers
- Get container properties
- Acquire a lease on container
- Set an access policy on a container
- Upload, list, delete blobs in container
- Get the blob client to interact with a specific blob
-
blob_samples_common.py (async version) - Examples common to all types of blobs:
- Create a snapshot
- Delete a blob snapshot
- Soft delete a blob
- Undelete a blob
- Acquire a lease on a blob
- Copy a blob from a URL
-
blob_samples_directory_interface.py - Examples for interfacing with Blob storage as if it were a directory on a filesystem:
- Copy (upload or download) a single file or directory
- List files or directories at a single level or recursively
- Delete a single file or recursively delete a directory
Additional documentation
For more extensive documentation on Azure Blob storage, see the Azure Blob storage documentation on docs.microsoft.com.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for azure-storage-blob-12.20.0.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | eeb91256e41d4b5b9bad6a87fd0a8ade07dd58aa52344e2c8d2746e27a017d3b |
|
MD5 | 554747ef02783a23f70a67ec588fcdfe |
|
BLAKE2b-256 | 1b0f86cdaec4be486d12fd5bd2c56e835492a58d3bcd4915d24473e889b70f2c |
Hashes for azure_storage_blob-12.20.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | de6b3bf3a90e9341a6bcb96a2ebe981dffff993e9045818f6549afea827a52a9 |
|
MD5 | 7d7772b80e3c59385fa81e2f59292abd |
|
BLAKE2b-256 | 15192be26569e708cb618feecd7316ee0c5475273f7cdb4f9030a862870c74c8 |