A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

object-store-python

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored to modern cloud-native applications, hiding many of the features (and complexities) found in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# named "chunk" to avoid shadowing the built-in range()
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As far as possible, access to the various storage backends depends only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Depending on the chosen URL format, some basic configuration can also be derived from the URL string.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
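
The two snippets above use the same option names, lower-case as dictionary keys and upper-case as environment variables. As a minimal sketch of that correspondence, the helper below collects options from an environment mapping. The name options_from_env is hypothetical and not part of object_store; the keys are the ones from the service principal example.

```python
import os

# Option keys taken from the Azure service principal example above.
AZURE_KEYS = (
    "azure_storage_account_name",
    "azure_client_id",
    "azure_client_secret",
    "azure_tenant_id",
)


def options_from_env(keys=AZURE_KEYS, environ=None):
    """Collect storage options from upper-cased environment variables.

    Hypothetical helper for illustration; not part of object_store.
    """
    environ = os.environ if environ is None else environ
    # Keep only the keys whose upper-case form is actually set.
    return {key: environ[key.upper()] for key in keys if key.upper() in environ}


# A stand-in environment, so the example does not depend on the real one.
env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "acct",
    "AZURE_TENANT_ID": "tenant",
    "HOME": "/root",
}
options = options_from_env(environ=env)
```

The resulting dictionary has the same shape as storage_options in the first snippet and could be passed as the second argument to ObjectStore.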

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. In addition, one of the following sets of credential options is needed:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • for a user-assigned identity, one of azure_client_id, azure_object_id, azure_msi_resource_id
    • use_managed_identity
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
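
The list above can be read as a table of required keys per credential type. The sketch below checks an options dictionary against that table before it is handed to the store. The helper complete_credentials is hypothetical, and it says nothing about which credential the library prefers when several sets are present.

```python
# Required option keys per credential type, copied from the list above.
AZURE_CREDENTIALS = {
    "shared key": {"azure_storage_account_key"},
    "service principal": {"azure_client_id", "azure_client_secret", "azure_tenant_id"},
    "shared access signature": {"azure_storage_sas_key"},
    "bearer token": {"azure_storage_token"},
    "managed identity": {"use_managed_identity"},
    "workload identity": {"azure_client_id", "azure_tenant_id", "azure_federated_token_file"},
}


def complete_credentials(options):
    """Return the credential types whose keys are all present in options.

    Hypothetical helper for illustration; not part of object_store.
    """
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name is always required")
    present = set(options)
    # A credential type is complete when its required keys are a subset.
    return [name for name, keys in AZURE_CREDENTIALS.items() if keys <= present]


service_principal = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
}
matches = complete_credentials(service_principal)
```

An empty result signals that the dictionary names an account but no complete credential set.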

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
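
The three S3 options mentioned above can be assembled into a storage options dictionary as sketched below. The helper s3_options is hypothetical, and the endpoint http://localhost:4566 assumes a localstack instance on its default port. Credentials are not covered in this section and are left out.

```python
def s3_options(region, endpoint=None, virtual_hosted=False):
    """Build S3 storage options from the settings described above.

    Hypothetical helper for illustration; not part of object_store.
    """
    # A region is always required.
    options = {"aws_region": region}
    if endpoint is not None:
        # Only needed for alternative implementations or mocked services.
        options["aws_endpoint"] = endpoint
    if virtual_hosted:
        # Option values are strings, hence "true" rather than True.
        options["aws_virtual_hosted_style_request"] = "true"
    return options


# Assumed local endpoint for a mocked service such as localstack.
local = s3_options("us-east-1", endpoint="http://localhost:4566")
hosted = s3_options("eu-west-1", virtual_hosted=True)
```

Either dictionary would be passed as the second argument, for example ObjectStore("s3://<bucket>/<path>", local).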

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account
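
The recommended URL formats for Azure, S3 and GCS share the shape <scheme>://<container-or-bucket>/<path>. To illustrate how a backend can be read off such a URL, the sketch below splits one with the standard library. It is not how object_store parses locations internally; the function describe_location and the BACKENDS table are hypothetical.

```python
from urllib.parse import urlsplit

# Schemes used in this README, mapped to a human-readable backend name.
BACKENDS = {"az": "Azure", "s3": "S3", "gs": "GCS", "memory": "in-memory"}


def describe_location(url):
    """Split a storage URL into (backend, container or bucket, path).

    Hypothetical helper for illustration; not part of object_store.
    """
    parts = urlsplit(url)
    backend = BACKENDS.get(parts.scheme, "unknown")
    # netloc holds the container or bucket; the path loses its leading slash.
    return backend, parts.netloc, parts.path.lstrip("/")


s3_location = describe_location("s3://my-bucket/some/prefix")
memory_location = describe_location("memory://")
```

Here s3_location is the tuple ("S3", "my-bucket", "some/prefix"), and the in-memory URL yields an empty bucket and path.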

With pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length), so each file receives five rows
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have the just command runner installed and do not wish to install it, see the justfile for the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
