Skip to main content

A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

Project description


CI code style: black PyPI PyPI - Downloads

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends including the objects stores from most major cloud providers. The APIs are very focussed and taylored towards modern cloud native applications by hiding away many features (and complexities) encountered in full fledges file systems.

Among the included backend are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store


The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python


The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in python programs.

ObjectStore api

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

range = store.get_range("data", start=0, length=4)
assert range == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data


As much as possible we aim to make access to various storage backends dependent only on runtime configuration. The kind of service is always derived from the url used to specifiy the storage location. Some basic configuration can also be derived from the url string, dependent on the chosen url format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")


The recommended url format is az://<container>/<path> and Azure always requieres azure_storage_account_name to be configured.

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using user assigned identity one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file


The recommended url format is s3://<bucket>/<path> S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.


The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 10), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)



Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment, and install a dev version of the native package just run:

just init

This will also configure pre-commit hooks in the repository.

To run the rust as well as python tests:

just test

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_store_python-0.1.10.tar.gz (36.8 kB view hashes)

Uploaded source

Built Distributions

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page