Object storage interface definitions for Python.

These details have not been verified by PyPI

Project links

Development Status
- 4 - Beta
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
- Python :: 3 :: Only

Project description

obspec

Object storage protocol definitions for Python.

Background

Python defines two types of subtyping: nominal and structural subtyping. In essence, nominal subtyping is subclassing. Class A is a nominal subtype of class B if A subclasses from B. Structural subtyping is duck typing. Class A is a structural subtype of class B if A "looks like" B, that is, it conforms to the same shape as B.

Using structural subtyping means that an ecosystem of libraries don't need to have any knowledge or dependency on each other, as long as they strictly and accurately implement the same duck-typed interface.

For example, an Iterable is a protocol. You don't need to subclass from a base Iterable class in order to make your type iterable. Instead, if you define an __iter__ dunder method on your class, it automatically becomes iterable because Python has a convention that if you see an __iter__ method, you can call it to iterate over a sequence.

As another example, the Buffer Protocol is a protocol to enable zero-copy exchange of binary data between Python libraries. Unlike Iterable, this is a protocol that is inaccessible in user Python code and only accessible at the C level, but it's still a protocol. Numpy can create arrays that view a buffer via the buffer protocol, even when Numpy has no prior knowledge of the library that produces the buffer.

Obspec defines core protocols to interface with data stored on file systems, remote object stores, etc.

Usage

You should use the minimal methods required for your use case, creating your own protocol with just what you need.

In particular, Python allows you to intersect protocols:

from typing import Protocol

from obspec import Delete, Get, List, Put


class MyCustomObspecProtocol(Delete, Get, List, Put, Protocol):
    """My custom protocol."""

Then use that protocol generically:

def do_something(backend: MyCustomObspecProtocol):
    backend.put("path.txt", b"hello world!")

    files = backend.list().collect()
    assert any(file["path"] == "path.txt" for file in files)

    assert backend.get("path.txt").bytes() == b"hello world!"

    backend.delete("path.txt")

    files = backend.list().collect()
    assert not any(file["path"] == "path.txt" for file in files)

In particular, by defining the most minimal interface you require, it widens the set of possible backends that can implement your interface. For example, making a range request is possible by any HTTP client, but a list call may have semantics not defined in the HTTP specification. So by only requiring, say, Get and GetRange you allow more implementations to be used with your program.

Example: Cloud-Optimized GeoTIFF reader

A Cloud-Optimized GeoTIFF (COG) reader might only require range requests

from typing import Protocol

from obspec import GetRange, GetRanges


class CloudOptimizedGeoTiffReader(GetRange, GetRanges, Protocol):
    """Protocol with necessary methods to read a Cloud-Optimized GeoTIFF file."""


def read_cog_header(backend: CloudOptimizedGeoTiffReader, path: str):
    # Make request for first 32KB of file
    header_bytes = backend.get_range(path, start=0, end=32 * 1024)

    # TODO: parse information from header
    raise NotImplementedError


def read_cog_image(backend: CloudOptimizedGeoTiffReader, path: str):
    header = read_cog_header(backend, path)

    # TODO: read image data from file.

An async Cloud-Optimized GeoTIFF reader might instead subclass from obspec's async methods:

from typing import Protocol

from obspec import GetRangeAsync, GetRangesAsync


class AsyncCloudOptimizedGeoTiffReader(GetRangeAsync, GetRangesAsync, Protocol):
    """Necessary methods to asynchronously read a Cloud-Optimized GeoTIFF file."""


async def read_cog_header(backend: AsyncCloudOptimizedGeoTiffReader, path: str):
    # Make request for first 32KB of file
    header_bytes = await backend.get_range_async(path, start=0, end=32 * 1024)

    # TODO: parse information from header

    raise NotImplementedError


async def read_cog_image(backend: AsyncCloudOptimizedGeoTiffReader, path: str):
    header = await read_cog_header(backend, path)

    # TODO: read image data from file.

Implementations

The primary implementation that implements obspec is obstore, and the obspec protocol was designed around the obstore API.

Utilities

There are planned to be utilities that build on top of obspec. Potentially:

globbing: an implementation of glob() similar to fsspec.glob that uses obspec primitives.
Caching: wrappers around Get/GetRange/GetRanges that store a cache of bytes.

By having these utilities operate on generic obspec protocols, it means that they can instantly be used with any future obspec backend.

Project details

These details have not been verified by PyPI

Project links

Development Status
- 4 - Beta
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
- Python :: 3 :: Only

Release history Release notifications | RSS feed

0.1.0

Jun 25, 2025

This version

0.1.0b6 pre-release

May 28, 2025

0.1.0b5 pre-release

May 21, 2025

0.1.0b4 pre-release

May 20, 2025

0.1.0b3 pre-release

Mar 13, 2025

0.1.0b2 pre-release

Mar 13, 2025

0.1.0b1 pre-release

Mar 11, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

obspec-0.1.0b6.tar.gz (111.6 kB view details)

Uploaded May 28, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

obspec-0.1.0b6-py3-none-any.whl (14.8 kB view details)

Uploaded May 28, 2025 Python 3

File details

Details for the file obspec-0.1.0b6.tar.gz.

File metadata

Download URL: obspec-0.1.0b6.tar.gz
Upload date: May 28, 2025
Size: 111.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.5

File hashes

Hashes for obspec-0.1.0b6.tar.gz
Algorithm	Hash digest
SHA256	`df1ffa5517a4b69430f934c6a32511ac566ee0d662dc8325a763540bc0ae652b`
MD5	`12e73776400590848a7509a4faa42893`
BLAKE2b-256	`b73ed5039e58371e8ebb2884f0ee1072c8b9119950bfe3504daaf812db2d1776`

See more details on using hashes here.

File details

Details for the file obspec-0.1.0b6-py3-none-any.whl.

File metadata

Download URL: obspec-0.1.0b6-py3-none-any.whl
Upload date: May 28, 2025
Size: 14.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.5

File hashes

Hashes for obspec-0.1.0b6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dfcfbf61b8512d938716935b24bec2350f9ae489d1fdfde660b7009c76e726b0`
MD5	`28cae4b971f539a741e0a12b25feac43`
BLAKE2b-256	`c1097d3cfeadeb44333051b63cd5acd5315c1848cc49329cf8af880f60e9fd75`

See more details on using hashes here.

obspec 0.1.0b6

Navigation

Verified details

Owner

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

obspec

Background

Usage

Example: Cloud-Optimized GeoTIFF reader

Implementations

Utilities

Project details

Verified details

Owner

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes