A generic object store interface for uniformly interacting with AWS S3, Google Cloud Storage, Azure Storage and local files.

object-store-python

Python bindings and integrations for the excellent object_store crate. The main idea is to provide a common interface to various storage backends, including the object stores of most major cloud providers. The APIs are focused and tailored towards modern cloud-native applications, hiding many of the features (and complexities) encountered in full-fledged file systems.

The included backends are:

  • Amazon S3 and S3 compliant APIs
  • Google Cloud Storage Buckets
  • Azure Blob Gen1 and Gen2 accounts (including ADLS Gen2)
  • local storage
  • in-memory store

Installation

The object-store-python package is available on PyPI and can be installed via

poetry add object-store-python

or using pip

pip install object-store-python

Usage

The main ObjectStore API mirrors the native object_store implementation, with some slight adjustments for ease of use in Python programs.

ObjectStore API

from object_store import ObjectStore, ObjectMeta

# we use an in-memory store for demonstration purposes.
# data will not be persisted and is not shared across store instances
store = ObjectStore("memory://")

store.put("data", b"some data")

data = store.get("data")
assert data == b"some data"

blobs = store.list()

meta: ObjectMeta = store.head("data")

# read only the first four bytes; the name avoids shadowing the built-in `range`
first_bytes = store.get_range("data", start=0, length=4)
assert first_bytes == b"some"

store.copy("data", "copied")
copied = store.get("copied")
assert copied == data

Configuration

As much as possible, we aim to make access to the various storage backends depend only on runtime configuration. The kind of service is always derived from the URL used to specify the storage location. Some basic configuration can also be derived from the URL string, depending on the chosen URL format.

from object_store import ObjectStore

storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>"
}

store = ObjectStore("az://<container-name>", storage_options)

We can provide the same configuration via the environment.

import os
from object_store import ObjectStore

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "<my-account-name>"
os.environ["AZURE_CLIENT_ID"] = "<my-client-id>"
os.environ["AZURE_CLIENT_SECRET"] = "<my-client-secret>"
os.environ["AZURE_TENANT_ID"] = "<my-tenant-id>"

store = ObjectStore("az://<container-name>")
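
The two examples above suggest a naming correspondence: each lower-case option key has an upper-case environment variable counterpart. The sketch below only illustrates that mapping. `options_from_env` is a hypothetical helper, not part of object-store-python, and the way the library itself reads the environment may differ in detail.

```python
import os
from typing import Dict, Mapping, Optional


def options_from_env(
    prefix: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect variables starting with `prefix` as lower-case option keys.

    Hypothetical helper for illustration only.
    """
    env = os.environ if environ is None else environ
    return {
        name.lower(): value
        for name, value in env.items()
        if name.startswith(prefix)
    }


# A stand-in environment, so the example does not depend on the real one.
example_env = {
    "AZURE_STORAGE_ACCOUNT_NAME": "<my-account-name>",
    "AZURE_CLIENT_ID": "<my-client-id>",
    "PATH": "/usr/bin",
}

# Only the AZURE_* variables are picked up; PATH is ignored.
storage_options = options_from_env("AZURE_", example_env)
```

The resulting dictionary has the same shape as the `storage_options` passed explicitly in the first example.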

Azure

The recommended URL format is az://<container>/<path>. Azure always requires azure_storage_account_name to be configured.

  • shared key
    • azure_storage_account_key
  • service principal
    • azure_client_id
    • azure_client_secret
    • azure_tenant_id
  • shared access signature
    • azure_storage_sas_key (as provided by StorageExplorer)
  • bearer token
    • azure_storage_token
  • managed identity
    • for a user-assigned identity, one of azure_client_id, azure_object_id, or azure_msi_resource_id
    • if no other credential can be created, managed identity will be tried
  • workload identity
    • azure_client_id
    • azure_tenant_id
    • azure_federated_token_file
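
As a sketch of how the credential options above group together, the following lists the keys for several credential types and checks which ones a given `storage_options` dictionary satisfies. The key names come from the list above; `CREDENTIAL_KEYS` and `matching_credentials` are hypothetical names used for illustration, and the library's actual selection logic may differ.

```python
from typing import Dict, List

# Option keys per credential type, taken from the list above.
# Managed identity is left out because it can work without explicit keys.
CREDENTIAL_KEYS: Dict[str, List[str]] = {
    "shared key": ["azure_storage_account_key"],
    "service principal": [
        "azure_client_id",
        "azure_client_secret",
        "azure_tenant_id",
    ],
    "shared access signature": ["azure_storage_sas_key"],
    "bearer token": ["azure_storage_token"],
    "workload identity": [
        "azure_client_id",
        "azure_tenant_id",
        "azure_federated_token_file",
    ],
}


def matching_credentials(options: Dict[str, str]) -> List[str]:
    """Return the credential types whose keys are all present in `options`."""
    if "azure_storage_account_name" not in options:
        raise ValueError("azure_storage_account_name is always required")
    return [
        kind
        for kind, keys in CREDENTIAL_KEYS.items()
        if all(key in options for key in keys)
    ]


storage_options = {
    "azure_storage_account_name": "<my-account-name>",
    "azure_client_id": "<my-client-id>",
    "azure_client_secret": "<my-client-secret>",
    "azure_tenant_id": "<my-tenant-id>",
}
kinds = matching_credentials(storage_options)
```

A check like this can catch an incomplete configuration before a store is created.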

S3

The recommended URL format is s3://<bucket>/<path>. S3 storage always requires a region to be specified via either aws_region or aws_default_region.

AWS supports virtual hosting of buckets, which can be configured by setting aws_virtual_hosted_style_request to "true".

When an alternative implementation or a mocked service like localstack is used, the service endpoint needs to be explicitly specified via aws_endpoint.
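
A minimal sketch of the options described above. The region value, endpoint URL, and bucket placeholder are assumptions chosen for illustration (port 4566 is assumed here as a common localstack default), credentials still have to be supplied separately, and a plain-HTTP endpoint may need further options not covered here. `open_store` is a hypothetical wrapper that imports `object_store` lazily, so the dictionaries can be inspected without the package installed.

```python
from typing import Dict

# Options for AWS itself, using virtual-hosted-style requests.
aws_options: Dict[str, str] = {
    "aws_region": "us-east-1",  # placeholder region
    "aws_virtual_hosted_style_request": "true",
}

# Options for an S3-compatible or mocked service with an explicit endpoint.
local_options: Dict[str, str] = {
    "aws_region": "us-east-1",  # placeholder region
    "aws_endpoint": "http://localhost:4566",  # assumed local endpoint
}


def open_store(url: str, options: Dict[str, str]):
    """Create a store for `url`; requires object-store-python to be installed."""
    from object_store import ObjectStore  # imported lazily on first use

    return ObjectStore(url, options)


# Example call, not executed here:
# store = open_store("s3://<bucket>", local_options)
```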

GCS

The recommended URL format is gs://<bucket>/<path>.

  • service account
    • google_service_account
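
The recommended URL formats for Azure, S3, and GCS share one shape: a scheme, a container or bucket, and an optional path. The sketch below splits such a URL with the standard library to show which part is which. `Location` and `split_location` are hypothetical names, the bucket and container names are placeholders, and nothing here describes how the library itself parses URLs.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Location(NamedTuple):
    scheme: str     # selects the kind of service, e.g. "gs", "s3", "az"
    container: str  # bucket or container name
    path: str       # path inside the bucket, without a leading slash


def split_location(url: str) -> Location:
    """Split `<scheme>://<container>/<path>` into its three parts."""
    parts = urlsplit(url)
    return Location(parts.scheme, parts.netloc, parts.path.lstrip("/"))


gcs = split_location("gs://my-bucket/data/file.parquet")
azure = split_location("az://my-container")
```

Here `gcs.container` is the bucket name and `gcs.path` the object path, while `azure.path` is empty because the URL names only a container.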

with pyarrow

from pathlib import Path

import numpy as np
import pyarrow as pa
import pyarrow.fs as fs
import pyarrow.dataset as ds
import pyarrow.parquet as pq

from object_store import ArrowFileSystemHandler

table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})

base = Path.cwd()
store = fs.PyFileSystem(ArrowFileSystemHandler(str(base.absolute())))

# Table.slice takes (offset, length): write rows 0-4 and rows 5-9 to separate files
pq.write_table(table.slice(0, 5), "data/data1.parquet", filesystem=store)
pq.write_table(table.slice(5, 5), "data/data2.parquet", filesystem=store)

dataset = ds.dataset("data", format="parquet", filesystem=store)

Development

Prerequisites

Running tests

If you do not have just installed and do not wish to install it, have a look at the justfile to see the raw commands.

To set up the development environment and install a dev version of the native package, run:

just init

This will also configure pre-commit hooks in the repository.

To run the Rust as well as the Python tests:

just test
