A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

object-store-python

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored towards modern cloud-native applications, hiding many of the features (and complexities) encountered in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# read only the first four bytes; named "chunk" to avoid shadowing the built-in range
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As far as possible, access to the various storage backends depends only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Depending on the chosen URL format, some basic configuration can also be derived from the URL string.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
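
To illustrate how the kind of service follows from the URL, the sketch below splits a storage URL with the standard library. The SCHEME_TO_BACKEND table and the backend_for and container_and_path helpers are hypothetical names introduced for this example, and they cover only the URL formats shown in this README. The library performs its own parsing internally and may accept more formats.

```python
from typing import Tuple
from urllib.parse import urlparse

# Hypothetical lookup table covering only the URL schemes used in this README.
SCHEME_TO_BACKEND = {
    "memory": "in-memory store",
    "az": "Azure Storage",
    "s3": "Amazon S3",
    "gs": "Google Cloud Storage",
}


def backend_for(url: str) -> str:
    """Return a label for the backend implied by the URL scheme."""
    scheme = urlparse(url).scheme
    if scheme not in SCHEME_TO_BACKEND:
        raise ValueError(f"unrecognized scheme: {scheme!r}")
    return SCHEME_TO_BACKEND[scheme]


def container_and_path(url: str) -> Tuple[str, str]:
    """Split a URL of the form <scheme>://<container>/<path> into its parts."""
    parsed = urlparse(url)
    return parsed.netloc, parsed.path.lstrip("/")


print(backend_for("az://my-container/some/path"))
print(container_and_path("az://my-container/some/path"))
```

Credentials and account names are not part of these URLs; they still come from the storage options or the environment.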

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. In addition, one of the following credential types can be configured through the listed options:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using user assigned identity one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
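
As a rough aid for checking a configuration before it is passed to ObjectStore, the sketch below reports which of the credential types listed above a given options dictionary fully specifies. The option names come from the list; the AZURE_CREDENTIAL_OPTIONS table and the matching_credentials helper are hypothetical, and this is not how the library itself selects a credential.

```python
from typing import Dict, List

# Option names are taken from the list above; the grouping is illustrative only.
AZURE_CREDENTIAL_OPTIONS = {
    "shared key": ["azure_storage_account_key"],
    "service principal": [
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ],
    "shared access signature": ["azure_storage_sas_key"],
    "bearer token": ["azure_storage_token"],
    "workload identity": [
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ],
}


def matching_credentials(options: Dict[str, str]) -> List[str]:
    """Return the credential types whose options are all present."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name is always required")
    return [
        name
        for name, keys in AZURE_CREDENTIAL_OPTIONS.items()
        if all(key in options for key in keys)
    ]


storage_options = {
    "azure_storage_account_name": "my-account-name",
    "azure_storage_account_key": "my-account-key",
}
print(matching_credentials(storage_options))
```

An empty result corresponds to the case in which no explicit credential is configured, where managed identity would be tried.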

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via either aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
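
The options above can be assembled into a storage options dictionary as in the following sketch. The s3_storage_options helper is a hypothetical convenience function, and the region, the bucket name, and the endpoint http://localhost:4566 (a common local localstack address) are assumptions chosen for illustration. Credentials are omitted here and would need to be supplied separately.

```python
from typing import Dict, Optional


def s3_storage_options(
    region: str,
    endpoint: Optional[str] = None,
    virtual_hosted: bool = False,
) -> Dict[str, str]:
    """Build a storage options dictionary from the S3 settings described above."""
    options = {"aws_region": region}
    if virtual_hosted:
        # option values are passed as strings
        options["aws_virtual_hosted_style_request"] = "true"
    if endpoint is not None:
        # needed for S3-compatible or mocked services
        options["aws_endpoint"] = endpoint
    return options


options = s3_storage_options("us-east-1", endpoint="http://localhost:4566")
print(options)

# With the package installed and the service running, the options would be
# passed along with the URL (bucket name is a placeholder):
# store = ObjectStore("s3://my-bucket", options)
```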

GCS

The recommended URL format is gs://<bucket>/<path>.

  • service account
    • google_service_account

With pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
# slice takes (offset, length); omitting the length selects the remaining rows
pq.write_table(table.slice(5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a development version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
