
A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.


object-store-python


Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored to modern cloud-native applications, hiding away many features (and complexities) encountered in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# named "chunk" to avoid shadowing the built-in range()
chunk = store.get_range("data", start=0, length=4)
assert chunk == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As much as possible, we aim to make access to the various storage backends depend only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Some basic configuration can also be derived from the URL string, depending on the chosen URL format.
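As a rough illustration of how a URL can carry both the service kind and some basic configuration, the sketch below splits a storage URL with the standard library. It does not call object_store: `SCHEME_TO_SERVICE` and `describe_location` are hypothetical names, and the mapping lists only the schemes that appear in this README.

```python
from urllib.parse import urlsplit

# Hypothetical mapping for illustration; only schemes shown in this README.
SCHEME_TO_SERVICE = {
    "az": "azure",
    "s3": "s3",
    "gs": "gcs",
    "memory": "in-memory",
}


def describe_location(url: str) -> dict:
    """Split a storage URL into service kind, container/bucket, and path."""
    parts = urlsplit(url)
    if parts.scheme not in SCHEME_TO_SERVICE:
        raise ValueError(f"unrecognized scheme: {parts.scheme!r}")
    return {
        "service": SCHEME_TO_SERVICE[parts.scheme],
        "container": parts.netloc,
        "path": parts.path.lstrip("/"),
    }


print(describe_location("az://my-container/some/path"))
```

The library performs its own parsing internally; this only shows which parts of the URL are available to derive configuration from.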

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
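The two examples above suggest that each option key is the lowercase form of the matching environment variable. Assuming that convention holds, a small helper can collect the variables into an explicit options dictionary, for instance to log which settings are in effect. `options_from_env` is a hypothetical helper, not part of the package.

```python
import os


def options_from_env(prefix: str = "AZURE_", environ=None) -> dict:
    """Collect environment variables with the given prefix as option keys.

    Assumption: option keys are the lowercase form of the variable names,
    as in the two configuration examples above.
    """
    env = os.environ if environ is None else environ
    return {
        name.lower(): value
        for name, value in env.items()
        if name.startswith(prefix)
    }


sample_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>",
    "AZURE_TENANT_ID": "<my-tenant-id>",
    "HOME": "/home/user",
}
print(options_from_env(environ=sample_env))
```

The resulting dictionary has the same shape as the storage_options argument shown earlier.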

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured. Credentials can be supplied in one of the following ways:

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • if using a user-assigned identity, set one of azure_client_id, azure_object_id, or azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
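To check a configuration before creating a store, the list above can be written down as data. The sketch below reports which credential types a given options dictionary fully satisfies. `AZURE_CREDENTIAL_KEYS` and `matching_credentials` are hypothetical names, and the check does not reflect the order in which the library itself tries credentials.

```python
# Option keys per credential type, copied from the list above.
AZURE_CREDENTIAL_KEYS = {
    "shared key": {"azure_storage_account_key"},
    "service principal": {
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    },
    "shared access signature": {"azure_storage_sas_key"},
    "bearer token": {"azure_storage_token"},
    "workload identity": {
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    },
}


def matching_credentials(options: dict) -> list:
    """Return the credential types whose keys are all present in options."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name is always required")
    provided = set(options)
    matches = [
        name for name, keys in AZURE_CREDENTIAL_KEYS.items() if keys <= provided
    ]
    # With no explicit credential, managed identity is the documented fallback.
    return matches or ["managed identity"]


print(matching_credentials({
    "azure_storage_account_name": "<my-account-name>",
    "azure_storage_account_key": "<my-account-key>",
}))
```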

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via one of aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
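The S3 settings described above can be combined into one options dictionary. The following is a minimal sketch: `s3_options` is a hypothetical helper, the endpoint is only an example value (4566 is the default localstack port), and credentials are not covered here.

```python
def s3_options(region: str, endpoint=None, virtual_hosted: bool = False) -> dict:
    """Build an S3 options dictionary from the keys named in this section."""
    options = {"aws_region": region}
    if virtual_hosted:
        # The option is passed as the string "true", as described above.
        options["aws_virtual_hosted_style_request"] = "true"
    if endpoint is not None:
        # Needed for S3-compatible services or mocks such as localstack.
        options["aws_endpoint"] = endpoint
    return options


local_options = s3_options("us-east-1", endpoint="http://localhost:4566")
print(local_options)
```

Following the pattern of the Azure example, such a dictionary would presumably be passed as the second argument, as in ObjectStore("s3://<bucket>", local_options).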

GCS

The recommended URL format is gs://<bucket>/<path>.

  • service account
    • google_service_account
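The sketch below builds GCS options after a quick sanity check of the key file. It assumes that google_service_account takes the path to a service account JSON key file, which this README does not state explicitly. `gcs_options` is a hypothetical helper.

```python
import json
from pathlib import Path


def gcs_options(service_account_file: str) -> dict:
    """Return GCS options after checking the file looks like a service account key.

    Assumption: google_service_account expects a path to the JSON key file.
    """
    path = Path(service_account_file)
    info = json.loads(path.read_text())
    # Service account key files carry a "type" field set to "service_account".
    if info.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key file")
    return {"google_service_account": str(path)}
```

Failing early on a wrong file tends to give a clearer error than a rejected request at first access.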

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length), so each call selects five rows
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust and Python tests:

just test
