
A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.


object-store-python


Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores from most major cloud providers. The APIs are focused and tailored towards modern cloud-native applications, hiding many of the features (and complexities) found in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# read only the first four bytes (avoid shadowing the built-in `range`)
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As much as possible, we aim to make access to the various storage backends depend only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Some basic configuration can also be derived from the URL string, depending on the chosen URL format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)
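To make the role of the URL concrete, here is a minimal sketch of how a storage URL splits into a backend kind, a container or bucket, and a path prefix. It uses only the standard library; `describe_location`, `Location` and the `BACKENDS` table are hypothetical names for illustration and are not part of object-store-python, whose own URL handling may differ.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Scheme-to-backend table built from the URL formats named in this README.
# Illustrative only; this is not the library's internal lookup.
BACKENDS = {
    "az": "azure",
    "s3": "s3",
    "gs": "gcs",
    "memory": "in-memory",
}


class Location(NamedTuple):
    backend: str
    container: str  # bucket or container name; empty for memory://
    prefix: str     # path below the bucket or container


def describe_location(url: str) -> Location:
    """Split a storage URL into backend kind, container and prefix."""
    parts = urlsplit(url)
    try:
        backend = BACKENDS[parts.scheme]
    except KeyError:
        raise ValueError(f"unrecognized scheme in {url!r}") from None
    return Location(backend, parts.netloc, parts.path.lstrip("/"))


print(describe_location("az://my-container/raw/2024"))
```

Everything that cannot be read off the URL, such as account names and credentials, still has to come from the storage options or the environment.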

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
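The two examples above suggest that each option key is the lowercase form of the matching environment variable. Assuming that naming holds, a small helper can collect the options explicitly, which is handy for logging or testing a configuration before a store is created. `options_from_env` is a hypothetical helper, not a function provided by the package.

```python
import os
from typing import Dict, Mapping, Optional


def options_from_env(
    prefix: str = "AZURE_",
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Collect variables starting with `prefix` as lowercase option keys.

    Assumes option keys are the lowercase environment variable names,
    as in the two configuration examples in this README.
    """
    env = os.environ if environ is None else environ
    return {
        name.lower(): value
        for name, value in env.items()
        if name.upper().startswith(prefix)
    }


# A plain dict stands in for os.environ so the sketch has no side effects.
fake_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>",
    "AZURE_CLIENT_ID": "<my-client-id>",
    "HOME": "/home/user",
}
print(options_from_env(environ=fake_env))
```

The resulting dictionary has the same shape as the storage_options argument shown earlier, so it could be passed as the second argument to ObjectStore.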

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. Credentials can be supplied in one of the following ways:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using a user-assigned identity, one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
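The list above can be turned into a quick sanity check that reports which credential type a set of options would satisfy. This is a sketch under stated assumptions: `guess_azure_credential` and `CREDENTIAL_KEYS` are hypothetical names, and the order in which the types are tried simply follows the list above, which may not match the precedence the library applies internally.

```python
from typing import Mapping

# Keys required for each credential type, taken from the list above.
# Managed identity is handled separately as the fallback.
CREDENTIAL_KEYS = {
    "shared key": ("azure_storage_account_key",),
    "service principal": (
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ),
    "shared access signature": ("azure_storage_sas_key",),
    "bearer token": ("azure_storage_token",),
    "workload identity": (
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ),
}


def guess_azure_credential(options: Mapping[str, str]) -> str:
    """Return the first credential type whose keys are all present.

    The checking order is an assumption made for this sketch.
    """
    if not options.get("azure_storage_account_name"):
        raise ValueError("azure_storage_account_name is required")
    for kind, keys in CREDENTIAL_KEYS.items():
        if all(options.get(key) for key in keys):
            return kind
    # No explicit credential was found: managed identity would be tried.
    return "managed identity"


print(guess_azure_credential({
    "azure_storage_account_name": "<my-account-name>",
    "azure_storage_sas_key": "<my-sas-key>",
}))
```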

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
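As an illustration of how these three settings combine, the sketch below builds an options dictionary for S3. `s3_options` is a hypothetical helper, and the localstack endpoint `http://localhost:4566` is an assumption about a local setup; only the option keys come from the text above. Credentials are left out on purpose.

```python
from typing import Dict, Optional


def s3_options(
    region: str,
    endpoint: Optional[str] = None,
    virtual_hosted: bool = False,
) -> Dict[str, str]:
    """Build S3 storage options from the settings described above."""
    if not region:
        raise ValueError("S3 requires a region (aws_region or aws_default_region)")
    options = {"aws_region": region}
    if virtual_hosted:
        # The option value is the string "true", not a Python bool.
        options["aws_virtual_hosted_style_request"] = "true"
    if endpoint is not None:
        # Needed for S3-compatible services and mocks such as localstack.
        options["aws_endpoint"] = endpoint
    return options


# Assumed local setup: localstack listening on its default edge port.
localstack = s3_options("us-east-1", endpoint="http://localhost:4566")
print(localstack)
```

A dictionary like this could then be passed as the storage options when creating a store for an s3:// URL.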

GCS

The recommended URL format is gs://<bucket>/<path>.

  • service account
    • google_service_account

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
# slice takes (offset, length), so the second half is offset 5, length 5
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test


