
A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.


object-store-python


Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are deliberately focused and tailored towards modern cloud-native applications, hiding many of the features (and complexities) encountered in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# named `chunk` to avoid shadowing the built-in `range`
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As far as possible, access to the various storage backends depends only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Depending on the chosen URL format, some basic configuration can also be derived from the URL string itself.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)
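To illustrate how a service kind can be derived from a URL, here is a minimal sketch using only the standard library. The `describe_location` helper and the `SCHEME_TO_SERVICE` mapping are hypothetical names introduced for this example; they are not part of object-store-python, and the library's actual parsing may differ.

```python
from urllib.parse import urlparse

# Hypothetical mapping for illustration; these are the schemes used in
# this README, not an exhaustive list of what the library accepts.
SCHEME_TO_SERVICE = {
    "memory": "in-memory store",
    "az": "Azure",
    "s3": "S3",
    "gs": "GCS",
}


def describe_location(url: str) -> dict:
    """Split a storage URL into service kind, container/bucket, and path."""
    parsed = urlparse(url)
    if parsed.scheme not in SCHEME_TO_SERVICE:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    return {
        "service": SCHEME_TO_SERVICE[parsed.scheme],
        "container": parsed.netloc,
        "path": parsed.path.lstrip("/"),
    }


location = describe_location("az://my-container/some/prefix")
```

Here the scheme selects the backend, while the container name and path prefix come from the rest of the URL; anything else, such as credentials, still has to be supplied through configuration.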

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
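The relationship between the two configuration styles can be sketched as follows. This is an illustration only: `resolve_options` is a hypothetical helper, and the precedence rule it applies (explicit options win over environment variables) is an assumption rather than documented library behaviour. The only thing taken from the examples above is that each option name corresponds to its upper-cased environment variable.

```python
import os

# Option names taken from the examples above; the lookup rule (explicit
# options win over the environment) is an assumption for illustration.
AZURE_KEYS = (
    "azure_storage_account_name",
    "azure_client_id",
    "azure_client_secret",
    "azure_tenant_id",
)


def resolve_options(explicit=None, environ=None):
    """Merge explicit storage options with upper-cased environment variables."""
    explicit = explicit or {}
    environ = os.environ if environ is None else environ
    resolved = {}
    for key in AZURE_KEYS:
        if key in explicit:
            resolved[key] = explicit[key]
        elif key.upper() in environ:
            resolved[key] = environ[key.upper()]
    return resolved


# A plain dict stands in for os.environ so the example has no side effects.
fake_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "env-account",
    "AZURE_TENANT_ID": "env-tenant",
}
options = resolve_options({"azure_tenant_id": "explicit-tenant"}, environ=fake_env)
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to test without touching the real process environment.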

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. Credentials can then be supplied via one of the following sets of options:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using a user-assigned identity, one of azure_client_id, azure_object_id, azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
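The option sets listed above can be checked before constructing a store. The following is a minimal sketch, assuming each credential kind needs all of its listed options; `satisfied_credentials` and `CREDENTIAL_REQUIREMENTS` are hypothetical names, and the order in which the library itself tries credentials is not described here.

```python
# Required option names per credential kind, copied from the list above.
# Managed identity is omitted because its options are all optional.
CREDENTIAL_REQUIREMENTS = {
    "shared key": {"azure_storage_account_key"},
    "service principal": {
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    },
    "shared access signature": {"azure_storage_sas_key"},
    "bearer token": {"azure_storage_token"},
    "workload identity": {
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    },
}


def satisfied_credentials(options: dict) -> list:
    """Return every credential kind whose required options are all present."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name is always required")
    present = set(options)
    return [
        kind
        for kind, needed in CREDENTIAL_REQUIREMENTS.items()
        if needed <= present
    ]


kinds = satisfied_credentials({
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
})
```

An empty result corresponds to the case described above in which no other credential can be created and managed identity is tried.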

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region, specified via either aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
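The S3 settings described above can be collected into a storage options dictionary. This is a minimal sketch: `s3_options` is a hypothetical helper, and the endpoint URL is an assumed local address (port 4566 is localstack's default edge port) that should be replaced with whatever your service actually exposes.

```python
def s3_options(region, endpoint=None, virtual_hosted=False):
    """Build a storage options dict from the S3 settings described above."""
    if not region:
        raise ValueError("S3 requires aws_region or aws_default_region")
    options = {"aws_region": region}
    if endpoint is not None:
        options["aws_endpoint"] = endpoint
    if virtual_hosted:
        # The option is given as the string "true", not a Python bool.
        options["aws_virtual_hosted_style_request"] = "true"
    return options


# Assumed local endpoint for a mocked service such as localstack.
local = s3_options("us-east-1", endpoint="http://localhost:4566")
hosted = s3_options("eu-central-1", virtual_hosted=True)
```

Following the pattern of the Azure example, such a dictionary would be passed as the second argument, as in ObjectStore("s3://<bucket>", local).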

GCS

The recommended url format is gs://<bucket>/<path>.

  • service account
    • google_service_account

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length), so each call selects five rows
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
