A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

object-store-python

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored to modern cloud-native applications, hiding many of the features (and complexities) found in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# avoid shadowing the built-in `range` with the result
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As far as possible, we aim to make access to the various storage backends depend only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Some basic configuration can also be derived from the URL string, depending on the chosen URL format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
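
In the two examples above, the option keys and the environment variable names differ only in case. As an illustration only, the following sketch collects the variables with a given prefix from a mapping and lowercases them into a storage_options-style dictionary. The helper name options_from_env is hypothetical and not part of object-store-python; the library reads the environment on its own, so a helper like this is mainly useful for inspecting which settings are present.

```python
import os
from typing import Dict, Mapping, Optional


def options_from_env(
    prefix: str = "AZURE_", environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect variables starting with `prefix` as lowercase option keys.

    Hypothetical helper for illustration; not part of object-store-python.
    """
    source = os.environ if environ is None else environ
    return {
        name.lower(): value
        for name, value in source.items()
        if name.startswith(prefix)
    }


# Use an explicit mapping here instead of the real process environment.
example_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>",
    "AZURE_CLIENT_ID": "<my-client-id>",
    "PATH": "/usr/bin",
}
storage_options = options_from_env(environ=example_env)
print(storage_options)
```

Variables that do not start with the prefix, such as PATH in the example mapping, are ignored.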

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. In addition, one of the following credential types can be configured:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using a user-assigned identity, one of azure_client_id, azure_object_id, or azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
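
The credential list above can be read as a table of required keys. The sketch below, which is an illustration and not how the library itself selects a credential, checks a storage_options dictionary against that table and reports which credential types are fully specified. The names AZURE_CREDENTIAL_KEYS and complete_credentials are hypothetical. Managed identity is left out because, per the list above, it is tried when no other credential can be created.

```python
from typing import Dict, List

# Credential types and the option keys listed for each in the section above.
AZURE_CREDENTIAL_KEYS: Dict[str, List[str]] = {
    "shared key": ["azure_storage_account_key"],
    "service principal": [
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ],
    "shared access signature": ["azure_storage_sas_key"],
    "bearer token": ["azure_storage_token"],
    "workload identity": [
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ],
}


def complete_credentials(options: Dict[str, str]) -> List[str]:
    """Return the credential types whose listed keys are all present."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name must always be configured")
    return [
        kind
        for kind, keys in AZURE_CREDENTIAL_KEYS.items()
        if all(key in options for key in keys)
    ]


options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
}
print(complete_credentials(options))  # ['service principal']
```

An empty result means that none of the explicit credential types is complete, which is the case in which managed identity would be tried.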

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via either aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
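
A minimal sketch of how the three S3 settings above could be assembled into a storage_options dictionary. The helper s3_options is hypothetical, and the region name and the localstack-style endpoint on port 4566 are assumed example values. Credentials are not covered here, since the section above does not list them.

```python
from typing import Dict, Optional


def s3_options(
    region: str,
    endpoint: Optional[str] = None,
    virtual_hosted_style: bool = False,
) -> Dict[str, str]:
    """Build a storage_options dict from the S3 keys described above."""
    if not region:
        raise ValueError("S3 requires a region (aws_region or aws_default_region)")
    options = {"aws_region": region}
    if endpoint is not None:
        # Needed for alternative implementations or mocked services.
        options["aws_endpoint"] = endpoint
    if virtual_hosted_style:
        # The option value is the string "true", not a Python bool.
        options["aws_virtual_hosted_style_request"] = "true"
    return options


# Assumed values: a region name and a locally running mock service.
local_options = s3_options("us-east-1", endpoint="http://localhost:4566")
print(local_options)
```

The resulting dictionary would be passed as the second argument to ObjectStore, in the same way as the Azure options in the Configuration section.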

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account
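
The recommended URL formats for Azure, S3 and GCS share the shape <scheme>://<container-or-bucket>/<path>. The following sketch uses only the Python standard library to split such a URL into its parts; it illustrates the format and is not the parsing code used by the library. StoreLocation and split_store_url are hypothetical names, and the bucket and container names are made up.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class StoreLocation(NamedTuple):
    scheme: str     # e.g. "az", "s3", "gs" or "memory"
    container: str  # container or bucket name, empty if absent
    path: str       # path inside the container, without a leading slash


def split_store_url(url: str) -> StoreLocation:
    """Split a storage URL into scheme, container/bucket and path."""
    parsed = urlparse(url)
    return StoreLocation(parsed.scheme, parsed.netloc, parsed.path.lstrip("/"))


print(split_store_url("s3://my-bucket/raw/2024"))
print(split_store_url("az://my-container"))
print(split_store_url("memory://"))
```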

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length): write rows 0-4 and rows 5-9 to separate files
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a development build of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
