
A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.


object-store-python


Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored to modern cloud-native applications, hiding many of the features (and complexities) found in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# read the first four bytes; avoid naming the result `range`, which shadows the builtin
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data
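
Larger objects can be read piecewise with get_range. The following is a minimal sketch, not part of the library: read_in_chunks is a hypothetical helper, and DictStore is a stand-in that only imitates the get_range(path, start=..., length=...) call shown above, so the snippet runs without any backend. The object size is passed in explicitly because this README does not document the fields of ObjectMeta.

```python
from typing import Iterator


class DictStore:
    """Hypothetical stand-in exposing only the calls used in this sketch."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, path: str, data: bytes) -> None:
        self._blobs[path] = data

    def get_range(self, path: str, start: int, length: int) -> bytes:
        return self._blobs[path][start : start + length]


def read_in_chunks(store, path: str, size: int, chunk_size: int) -> Iterator[bytes]:
    """Yield the object at `path` in pieces of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, size, chunk_size):
        # Never request bytes past the end of the object.
        length = min(chunk_size, size - start)
        yield store.get_range(path, start=start, length=length)


store = DictStore()
store.put("data", b"some data")
chunks = list(read_in_chunks(store, "data", size=9, chunk_size=4))
```

Any object with a compatible get_range method should work in place of DictStore, since the helper relies on duck typing only.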

Configuration

As far as possible, access to the various storage backends depends only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Depending on the chosen URL format, some basic configuration can also be derived from the URL string.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
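
The two examples suggest that each option key has a matching environment variable with the same name in upper case. Assuming that naming rule holds for the keys used here, a small helper can collect those variables into an explicit storage_options dict, which may be useful for logging or for handing configuration to another process. azure_options_from_env and AZURE_KEYS are hypothetical names, not part of object_store.

```python
import os
from typing import Dict, Mapping, Optional

# Option keys taken from the examples above; extend as needed.
AZURE_KEYS = (
    "azure_storage_account_name",
    "azure_client_id",
    "azure_client_secret",
    "azure_tenant_id",
)


def azure_options_from_env(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build a storage_options dict from upper-case environment variables."""
    env = os.environ if environ is None else environ
    # Assumption: the variable name is the option key in upper case.
    return {key: env[key.upper()] for key in AZURE_KEYS if key.upper() in env}


# Unrelated variables such as HOME are ignored.
options = azure_options_from_env(
    {"AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>", "HOME": "/home/me"}
)
# The result could then be passed as ObjectStore("az://<container-name>", options).
```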

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. Credentials can be supplied in one of the following ways:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using a user-assigned identity, set one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
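
To make the list above concrete, the sketch below inspects a storage_options dict and reports which credential type its keys correspond to. It is an illustration only: azure_credential_kind is a hypothetical helper, and the order in which the cases are checked is an assumption that may differ from the order the library actually uses. The fallback to managed identity follows the statement in the list.

```python
from typing import Mapping


def azure_credential_kind(options: Mapping[str, str]) -> str:
    """Name the credential type that the given option keys correspond to."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name must always be configured")

    def has(*keys: str) -> bool:
        return all(key in options for key in keys)

    # Assumption: this checking order is illustrative, not the library's.
    if has("azure_storage_account_key"):
        return "shared key"
    if has("azure_client_id", "azure_client_secret", "azure_tenant_id"):
        return "service principal"
    if has("azure_storage_sas_key"):
        return "shared access signature"
    if has("azure_storage_token"):
        return "bearer token"
    if has("azure_client_id", "azure_tenant_id", "azure_federated_token_file"):
        return "workload identity"
    # No other credential can be built, so managed identity would be tried.
    return "managed identity"


kind = azure_credential_kind(
    {
        "azure_storage_account_name": "<my-account-name>",
        "azure_client_id": "<my-client-id>",
        "azure_client_secret": "<my-client-secret>",
        "azure_tenant_id": "<my-tenant-id>",
    }
)
```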

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region, specified via either aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
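
The three settings described above can be assembled into a storage_options dict as sketched below. s3_options is a hypothetical helper, not part of object_store; it only uses the option keys named in this section. The endpoint http://localhost:4566 and the region us-east-1 are assumptions for a default localstack setup, and credentials are left out because this README does not list the keys for them.

```python
from typing import Dict, Optional


def s3_options(
    region: str,
    endpoint: Optional[str] = None,
    virtual_hosted_style: bool = False,
) -> Dict[str, str]:
    """Assemble S3 storage options from the settings described above."""
    if not region:
        raise ValueError("S3 always requires a region")
    options = {"aws_region": region}
    if virtual_hosted_style:
        # The option value is the string "true", not a boolean.
        options["aws_virtual_hosted_style_request"] = "true"
    if endpoint is not None:
        # Needed for S3-compatible services and mocks such as localstack.
        options["aws_endpoint"] = endpoint
    return options


# Assumed local endpoint for a mocked service.
local = s3_options("us-east-1", endpoint="http://localhost:4566")
# The result could then be passed as ObjectStore("s3://<bucket>", local).
```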

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account
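
The recommended URL formats for Azure, S3 and GCS share one shape: a scheme that selects the service, a container or bucket name, and a path. The sketch below splits such a URL with the standard library to show that structure. parse_store_url and StoreLocation are hypothetical names, the scheme table covers only the formats recommended in this README, and this is not how the library itself parses URLs.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

# Only the recommended schemes from this README; the library may accept more.
SCHEMES = {"az": "Azure", "s3": "S3", "gs": "GCS"}


class StoreLocation(NamedTuple):
    service: str
    container: str
    path: str


def parse_store_url(url: str) -> StoreLocation:
    """Split <scheme>://<container>/<path> into its three parts."""
    parts = urlsplit(url)
    if parts.scheme not in SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.netloc:
        raise ValueError("missing container or bucket name")
    # The path is optional; strip the leading slash that separates it.
    return StoreLocation(SCHEMES[parts.scheme], parts.netloc, parts.path.lstrip("/"))


location = parse_store_url("s3://my-bucket/raw/2024/data.parquet")
```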

With pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length), so each call selects five rows
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a development version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust and Python tests:

just test


