Azure Storage Connector for PyTorch (`azstoragetorch`) (Preview)

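The Azure Storage Connector for PyTorch (azstoragetorch) is a library that provides seamless, performance-optimized integrations between Azure Storage and PyTorch. Use this library to easily access and store data in Azure Storage while using PyTorch. The library currently offers:

A file-like object (azstoragetorch.io.BlobIO) for saving and loading PyTorch models (i.e., checkpointing) with Azure Blob Storage
Map-style and iterable-style datasets (BlobDataset and IterableBlobDataset) for loading data samples from Azure Blob Storage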

Documentation

For detailed documentation on azstoragetorch, we recommend visiting its official documentation. It includes both a user guide and API references for the project. Content in this README is scoped to a high-level overview of the project and its GitHub repository policies.

Backwards compatibility

While the project is major version 0 (i.e., version is 0.x.y), public interfaces are not stable.

Backwards incompatible changes may be introduced between minor version bumps (e.g., upgrading from 0.1.0 to 0.2.0). If backwards compatibility is needed while using the library, we recommend pinning to a minor version of the library (e.g., azstoragetorch~=0.1.0).
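
To see what a pin such as azstoragetorch~=0.1.0 allows, the sketch below reimplements the compatible-release rule in plain Python for simple dotted versions. The helper names `parse_version` and `is_compatible_release` are hypothetical, and the sketch ignores pre-releases and other special version forms; pip performs the real resolution.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a simple dotted version such as "0.1.2" into integers."""
    return tuple(int(part) for part in version.split("."))


def is_compatible_release(candidate: str, base: str) -> bool:
    """Approximate `~=`: candidate >= base, and all but the last part of base must match."""
    candidate_parts = parse_version(candidate)
    base_parts = parse_version(base)
    prefix = base_parts[:-1]
    return candidate_parts >= base_parts and candidate_parts[: len(prefix)] == prefix


# Patch releases satisfy the pin; the next minor version does not.
for version in ["0.1.0", "0.1.2", "0.2.0"]:
    print(version, is_compatible_release(version, "0.1.0"))
```

Under this rule, a pin on 0.1.0 picks up later 0.1.x patch releases while excluding 0.2.0, where backwards incompatible changes may appear.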

Getting started

Prerequisites

Python 3.9 or later
An Azure subscription and an Azure Storage account

Installation

Install the library with pip:

pip install azstoragetorch

Configuration

azstoragetorch should work without any explicit credential configuration.

azstoragetorch interfaces default to DefaultAzureCredential for credentials, which automatically retrieves Microsoft Entra ID tokens based on your current environment. For more information on using credentials with azstoragetorch, see the user guide.

Features

This section highlights core features of azstoragetorch. For more details, see the user guide.

Saving and loading PyTorch models (Checkpointing)

PyTorch supports saving and loading trained models (i.e., checkpointing). The core PyTorch interfaces for saving and loading models are torch.save() and torch.load() respectively. Both of these functions accept a file-like object to be written to or read from.

azstoragetorch offers the azstoragetorch.io.BlobIO file-like object class to save and load models directly to and from Azure Blob Storage when using torch.save() and torch.load():

import torch
import torchvision.models  # Install separately: ``pip install torchvision``
from azstoragetorch.io import BlobIO

# Update URL with your own Azure Storage account and container name
CONTAINER_URL = "https://<my-storage-account-name>.blob.core.windows.net/<my-container-name>"

# Model to save. Replace with your own model.
model = torchvision.models.resnet18(weights="DEFAULT")

# Save trained model to Azure Blob Storage. This saves the model weights
# to a blob named "model_weights.pth" in the container specified by CONTAINER_URL.
with BlobIO(f"{CONTAINER_URL}/model_weights.pth", "wb") as f:
    torch.save(model.state_dict(), f)

# Load trained model from Azure Blob Storage.  This loads the model weights
# from the blob named "model_weights.pth" in the container specified by CONTAINER_URL.
with BlobIO(f"{CONTAINER_URL}/model_weights.pth", "rb") as f:
    model.load_state_dict(torch.load(f))
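
When checkpointing periodically during training, each checkpoint needs its own blob URL. The helper below is a hypothetical sketch: the name `checkpoint_blob_url`, the `checkpoints` prefix, and the file naming scheme are illustrative assumptions and not part of azstoragetorch. It also assumes the container URL carries no query string (for example, no SAS token).

```python
def checkpoint_blob_url(container_url: str, epoch: int, prefix: str = "checkpoints") -> str:
    """Build a blob URL such as <container>/checkpoints/epoch_0003.pth."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    # Strip a trailing slash so the joined URL has no "//" after the container name.
    base = container_url.rstrip("/")
    # Zero-pad the epoch so blob names sort in training order when listed.
    return f"{base}/{prefix}/epoch_{epoch:04d}.pth"


# Placeholder account and container names; replace with your own.
EXAMPLE_CONTAINER_URL = "https://myaccount.blob.core.windows.net/mycontainer"
print(checkpoint_blob_url(EXAMPLE_CONTAINER_URL, 3))
```

The resulting URL can be passed to BlobIO in "wb" mode in the same way as the "model_weights.pth" URL above.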

PyTorch Datasets

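PyTorch offers the Dataset and DataLoader primitives for loading data samples. azstoragetorch provides implementations for both types of PyTorch datasets, map-style and iterable-style datasets, to load data samples from Azure Blob Storage:

azstoragetorch.datasets.BlobDataset - Map-style dataset
azstoragetorch.datasets.IterableBlobDataset - Iterable-style dataset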

Data samples returned from both datasets map directly one-to-one to blobs in Azure Blob Storage. When instantiating these dataset classes, use one of their class methods:

from_container_url() - Instantiate dataset by listing blobs from an Azure Storage container.
from_blob_urls() - Instantiate dataset from provided blob URLs

from azstoragetorch.datasets import BlobDataset, IterableBlobDataset

# Update URL with your own Azure Storage account and container name
CONTAINER_URL = "https://<my-storage-account-name>.blob.core.windows.net/<my-container-name>"

# Create an iterable-style dataset by listing blobs in the container specified by CONTAINER_URL.
dataset = IterableBlobDataset.from_container_url(CONTAINER_URL)

# Print the first blob in the dataset. Default output is a dictionary with
# the blob URL and the blob data. Use `transform` keyword argument when
# creating dataset to customize output format.
print(next(iter(dataset)))

# List of blob URLs to create dataset from. Update with your own blob names.
blob_urls = [
    f"{CONTAINER_URL}/<blob-name-1>",
    f"{CONTAINER_URL}/<blob-name-2>",
    f"{CONTAINER_URL}/<blob-name-3>",
]

# Create a map-style dataset from the list of blob URLs
blob_list_dataset = BlobDataset.from_blob_urls(blob_urls)

print(blob_list_dataset[0])  # Print the first blob in the dataset
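
The difference between the two dataset styles comes down to which Python protocol each one implements. The sketch below uses an in-memory dictionary as a stand-in for a container so that it runs without PyTorch or Azure. The class names `InMemoryMapDataset` and `InMemoryIterableDataset` are hypothetical and are not azstoragetorch classes, and the `data` key name in each sample is an assumption.

```python
from typing import Iterator

# Stand-in for blobs in a container: blob URL -> blob content.
FAKE_BLOBS = {
    "https://example.invalid/container/a.txt": b"alpha",
    "https://example.invalid/container/b.txt": b"beta",
}


class InMemoryMapDataset:
    """Map-style: supports len() and random access by integer index."""

    def __init__(self, blobs: dict[str, bytes]) -> None:
        self._urls = list(blobs)
        self._blobs = blobs

    def __len__(self) -> int:
        return len(self._urls)

    def __getitem__(self, index: int) -> dict:
        url = self._urls[index]
        return {"url": url, "data": self._blobs[url]}


class InMemoryIterableDataset:
    """Iterable-style: samples are only reachable by iterating in order."""

    def __init__(self, blobs: dict[str, bytes]) -> None:
        self._blobs = blobs

    def __iter__(self) -> Iterator[dict]:
        for url, data in self._blobs.items():
            yield {"url": url, "data": data}


map_dataset = InMemoryMapDataset(FAKE_BLOBS)
iterable_dataset = InMemoryIterableDataset(FAKE_BLOBS)

print(len(map_dataset), map_dataset[1]["data"])
print(next(iter(iterable_dataset))["url"])
```

As a rule of thumb, a map-style dataset fits cases where the set of blobs is known up front and samples are picked by index (for example, when shuffling), while an iterable-style dataset fits streaming through a listing in order.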

Once instantiated, azstoragetorch datasets can be provided directly to a PyTorch DataLoader for loading samples:

from torch.utils.data import DataLoader

# Create a DataLoader to load data samples from the dataset in batches of 32
dataloader = DataLoader(dataset, batch_size=32)

for batch in dataloader:
    print(batch["url"])  # Prints the blob URLs for each batch of 32 samples
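
The `transform` keyword argument mentioned in the dataset example lets you control what each sample looks like. The sketch below assumes the callable receives a blob object exposing a `url` attribute and a `reader()` method that returns a readable file-like object; that interface is an assumption here and should be checked against the API reference. `FakeBlob` and `to_text_sample` are hypothetical names, and `FakeBlob` exists only so the function can be exercised without Azure.

```python
import io
from dataclasses import dataclass


@dataclass
class FakeBlob:
    """Local stand-in for the blob object assumed to be passed to a transform."""

    url: str
    content: bytes

    def reader(self) -> io.BytesIO:
        return io.BytesIO(self.content)


def to_text_sample(blob) -> dict:
    """Decode the blob content as UTF-8 text and record its length."""
    with blob.reader() as f:
        text = f.read().decode("utf-8")
    return {"url": blob.url, "text": text, "num_chars": len(text)}


sample = to_text_sample(FakeBlob("https://example.invalid/container/a.txt", b"hello"))
print(sample)
```

If the assumed interface holds, the function could be supplied when creating a dataset, for example as the `transform` argument to `from_container_url()`.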

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
