
A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.


object-store-python


Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are deliberately focused and tailored to modern cloud-native applications, hiding many features (and complexities) encountered in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# read only the first four bytes; named "chunk" to avoid shadowing the builtin range
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As much as possible, we aim to make access to the various storage backends depend only on runtime configuration. The kind of service is always derived from the url used to specify the storage location. Some basic configuration can also be derived from the url string, depending on the chosen url format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)
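
To illustrate how the url alone selects the backend, the following sketch splits a storage url into its scheme, bucket or container, and path using only the standard library. This is not how the library parses urls internally; the BACKENDS table and the describe_url helper are hypothetical names, and the table only covers the schemes mentioned in this README.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table, limited to the url schemes shown in this README.
BACKENDS = {
    "az": "Azure",
    "s3": "S3",
    "gs": "GCS",
    "memory": "in-memory",
}


def describe_url(url: str) -> dict:
    """Split a storage url into backend, container/bucket and path."""
    parts = urlsplit(url)
    if parts.scheme not in BACKENDS:
        raise ValueError(f"unrecognized scheme: {parts.scheme!r}")
    return {
        "backend": BACKENDS[parts.scheme],
        "container": parts.netloc,
        # drop the leading slash so the path is relative to the container
        "path": parts.path.lstrip("/"),
    }


print(describe_url("az://my-container/some/path"))
```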

The same Azure configuration can also be provided via environment variables.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
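
In the two Azure snippets above, each environment variable is the upper-cased form of the corresponding option key. Assuming that convention also holds for other keys (this README only demonstrates it for the four Azure keys), a small helper can export an options dictionary to the environment. The export_options name is hypothetical and not part of the package.

```python
import os


def export_options(options: dict) -> dict:
    """Set one upper-cased environment variable per option key.

    Returns the mapping that was written, e.g. for logging
    (mask secrets before printing it).
    """
    exported = {key.upper(): str(value) for key, value in options.items()}
    os.environ.update(exported)
    return exported


exported = export_options(
    {
        "azure_storage_account_name": "<my-account-name>",
        "azure_client_id": "<my-client-id>",
    }
)
print(sorted(exported))
```

After calling the helper, the store would be created from the url alone, as in the snippet above.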

Azure

The recommended url format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. The supported credential types and their options are:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using user assigned identity one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
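
As a rough aid for assembling storage_options, the sketch below encodes the list above as a table and reports which keys are still missing for a chosen credential type. The AZURE_CREDENTIAL_KEYS table and the missing_azure_options helper are hypothetical and not part of the package; managed identity is left out because its keys are optional.

```python
# Option keys per credential type, copied from the list above.
AZURE_CREDENTIAL_KEYS = {
    "shared key": ["azure_storage_account_key"],
    "service principal": [
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ],
    "shared access signature": ["azure_storage_sas_key"],
    "bearer token": ["azure_storage_token"],
    "workload identity": [
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ],
}


def missing_azure_options(options: dict, credential: str) -> list:
    """Return the option keys still missing for the given credential type."""
    # the account name is required regardless of the credential type
    required = ["azure_storage_account_name"] + AZURE_CREDENTIAL_KEYS[credential]
    return [key for key in required if key not in options]


options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_storage_account_key": "<my-account-key>",
}
print(missing_azure_options(options, "shared key"))
print(missing_azure_options(options, "service principal"))
```

Once nothing is missing, the dictionary can be passed as the second argument to ObjectStore, as in the Configuration example.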

S3

The recommended url format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
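
The three settings above can be combined into a storage_options dictionary. The following is a minimal sketch: the s3_options helper is hypothetical, and the region and endpoint values are assumptions (http://localhost:4566 is a commonly used localstack address, so adjust it to your setup). Credentials are not covered in this section and would have to be supplied separately.

```python
from typing import Optional


def s3_options(
    region: str,
    endpoint: Optional[str] = None,
    virtual_hosted: bool = False,
) -> dict:
    """Build S3 storage options from the keys described above."""
    options = {"aws_region": region}
    if endpoint is not None:
        # only needed for alternative implementations or mocked services
        options["aws_endpoint"] = endpoint
    if virtual_hosted:
        # the option value is the string "true", not a Python bool
        options["aws_virtual_hosted_style_request"] = "true"
    return options


# Assumed values for a local mock service.
localstack = s3_options("us-east-1", endpoint="http://localhost:4566")
print(localstack)
```

The result would be passed to ObjectStore together with an s3://<bucket> url, in the same way as the Azure options in the Configuration section.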

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account
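
A minimal sketch for building the GCS options, assuming that google_service_account takes the path to a service account key file (this README does not state the expected value, so treat that as an assumption). The gcs_options helper is hypothetical; it fails early if the file does not exist.

```python
import tempfile
from pathlib import Path


def gcs_options(service_account_file: str) -> dict:
    """Return GCS storage options after checking that the key file exists."""
    path = Path(service_account_file).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"service account file not found: {path}")
    return {"google_service_account": str(path)}


# Demonstration with a throwaway file standing in for a real key file.
with tempfile.NamedTemporaryFile(suffix=".json") as handle:
    options = gcs_options(handle.name)
    print(options)
```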

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length): write the first and last five rows to separate files
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
