Skip to main content

A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

Project description

object-store-python

CI code style: black PyPI PyPI - Downloads

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends including the objects stores from most major cloud providers. The APIs are very focussed and taylored towards modern cloud native applications by hiding away many features (and complexities) encountered in full fledges file systems.

Among the included backend are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in python programs.

ObjectStore api

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

range = store.get_range("data", start=0, length=4)
assert range == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As much as possible we aim to make access to various storage backends dependent only on runtime configuration. The kind of service is always derived from the url used to specifiy the storage location. Some basic configuration can also be derived from the url string, dependent on the chosen url format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")

Azure

The recommended url format is az://<container>/<path> and Azure always requieres azure_storage_account_name to be configured.

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using user assigned identity one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file

S3

The recommended url format is s3://<bucket>/<path> S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 10), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment, and install a dev version of the native package just run:

just init

This will also configure pre-commit hooks in the repository.

To run the rust as well as python tests:

just test

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_store_python-0.1.9.tar.gz (34.3 kB view details)

Uploaded Source

Built Distributions

object_store_python-0.1.9-cp38-abi3-win_amd64.whl (4.4 MB view details)

Uploaded CPython 3.8+ Windows x86-64

object_store_python-0.1.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.8 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ x86-64

object_store_python-0.1.9-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (5.3 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ ARM64

object_store_python-0.1.9-cp38-abi3-macosx_11_0_arm64.whl (4.2 MB view details)

Uploaded CPython 3.8+ macOS 11.0+ ARM64

object_store_python-0.1.9-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (8.7 MB view details)

Uploaded CPython 3.8+ macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

object_store_python-0.1.9-cp38-abi3-macosx_10_7_x86_64.whl (4.5 MB view details)

Uploaded CPython 3.8+ macOS 10.7+ x86-64

File details

Details for the file object_store_python-0.1.9.tar.gz.

File metadata

File hashes

Hashes for object_store_python-0.1.9.tar.gz
Algorithm Hash digest
SHA256 de7bdfc59c4da1608351cf8121f6bc8a26db11a635cef223aed40c31310131f9
MD5 6a8c6c233c41fa120d2ca66d7ec22722
BLAKE2b-256 80fa1340abbe034449419a451ef8b3834d796eef95e0f3d61aa6585c7b33a0b3

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 406be24c0d9f6d5a46c16f6f2e58f0fff64802834d934cbfd207377d7d8d02b8
MD5 6a7bd48765a53f6e31d18fef5f9b1e4a
BLAKE2b-256 0f87aaee562844466b88075d30cc96bef86f7865a03bb972f3b84d302f8556b5

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2d89c3bfdd8a3ea1f514a1d4a11de38efc18bef55940ec6c43be726868b616be
MD5 5938f11f1c5364257aacc43c5c4d3c5e
BLAKE2b-256 3063e66afe76799626b2d60a9bed7613bfae2602b70e82eeab9bdbab7e9e8d9b

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 d89a91c473fc041074049ceae6097cd16b4bca46948015724ece37a594ca2bc1
MD5 caa60ce5fad67e85b61a909366d46ac8
BLAKE2b-256 f7a516c3441f81c2a36444ea601015e3fb598bc12bfcb0ab098657be0d41c00a

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5b78c8b563e41afa6cabfe8caa0a1bddd769f8b49a23fd8afd919466c389adfd
MD5 652b0c9df2384f3cfbaa398ddab153e4
BLAKE2b-256 40e70023af621ef33cf13a00e196006ba09877e0d1df6fb4c039dd0295f4e7c3

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 47c706999bcfdcc703db57ce5aa677445d5569ba3d9fcc5f8804e536bcb5a8d3
MD5 eaeabaf8f7de0d5c6c60035200e13439
BLAKE2b-256 36886ef0d0d1dc9b13f313280b06b4535e273659f62b01830e517c50d89b3224

See more details on using hashes here.

File details

Details for the file object_store_python-0.1.9-cp38-abi3-macosx_10_7_x86_64.whl.

File metadata

File hashes

Hashes for object_store_python-0.1.9-cp38-abi3-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 866062c3a4d078754146689d0ec00731b33aac4f4e1319377731ca275f28a96e
MD5 f5eab42b25f2668ad4206316e6f56bd9
BLAKE2b-256 ba27d4ef5121743e1b5972fc1d6f0735bfa7ecbd21ed422419b18f297521bc83

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page