<object-storage-proxy ⚡> Yet Another Object Storage Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server for IBM Cloud Object Storage buckets, built on Cloudflare's pingora.

  • Takes a Python authorization callback, an api key fetch callback, and a cos bucket dictionary.
  • The authorization result is cached, with an optional ttl.
  • The api key is used to authenticate against IBM's IAM endpoint; the resulting token is cached and renewed on expiration.
  • If no api key is configured for a bucket, a Python function can be passed in to fetch the api key for that bucket.
  • HMAC support: if an access key ID and secret access key are passed in, they are used to sign the request.
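
The token caching described above can be pictured with a small sketch. This is not the proxy's implementation (that lives in the Rust core); `TokenCache`, its `fetch` callable, and the `skew` margin are hypothetical names used only for illustration. The fetcher is injected so the example runs without network access; a real one would exchange the api key for a bearer token at IBM's IAM endpoint.

```python
import time
from typing import Callable, Dict, Tuple


class TokenCache:
    """Hypothetical per-bucket bearer token cache (illustration only)."""

    def __init__(
        self,
        fetch: Callable[[str], Tuple[str, float]],
        skew: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # api key -> (token, lifetime in seconds)
        self._skew = skew    # renew this many seconds before expiry
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}  # bucket -> (token, expiry)

    def get(self, bucket: str, apikey: str) -> str:
        now = self._clock()
        cached = self._tokens.get(bucket)
        # Reuse the cached token while it is comfortably inside its lifetime.
        if cached is not None and now < cached[1] - self._skew:
            return cached[0]
        # Otherwise fetch a fresh token and remember when it expires.
        token, lifetime = self._fetch(apikey)
        self._tokens[bucket] = (token, now + lifetime)
        return token


if __name__ == "__main__":
    # Stand-in fetcher: assumed to return a token valid for one hour.
    cache = TokenCache(lambda apikey: ("example-token", 3600.0))
    print(cache.get("bucket1", "my-api-key"))
    print(cache.get("bucket1", "my-api-key"))  # served from the cache
```

Renewing slightly before the expiry time avoids sending a token upstream that expires while the request is in flight.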

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- ttl (optional, default 300): keep this reasonably short, but size it to your needs

cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 300
    }
}
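
Before handing such a dictionary to the proxy, it can help to check it up front. The helper below is a sketch, not part of `object_storage_proxy`: `normalize_cos_map` and `DEFAULT_TTL` are hypothetical names, and the only rules it encodes are the ones listed above (host and port required, api key optional, ttl defaulting to 300).

```python
from typing import Any, Dict

DEFAULT_TTL = 300  # default ttl stated in the field list above


def normalize_cos_map(cos_map: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return a copy of cos_map with required keys checked and ttl defaulted."""
    normalized: Dict[str, Dict[str, Any]] = {}
    for bucket, cfg in cos_map.items():
        missing = {"host", "port"} - cfg.keys()
        if missing:
            raise ValueError(f"{bucket}: missing required keys {sorted(missing)}")
        port = cfg["port"]
        if not isinstance(port, int) or not 0 < port < 65536:
            raise ValueError(f"{bucket}: invalid port {port!r}")
        entry = dict(cfg)                    # do not mutate the caller's dict
        entry.setdefault("ttl", DEFAULT_TTL)  # an explicit ttl (even 0) is kept
        normalized[bucket] = entry
    return normalized


if __name__ == "__main__":
    example = {
        "bucket2": {"host": "s3.eu-de.cloud-object-storage.appdomain.cloud", "port": 443},
    }
    print(normalize_cos_map(example))
```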

secrets

In IBM COS (Cloud Object Storage), buckets are grouped by instance. Access to a bucket is managed through either an api key or HMAC secrets, both configured on the COS instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3.<region>.cloud-object-storage.appdomain.cloud:<port>.

The port is usually the same across buckets, but it can differ depending on your setup.

Managing multiple buckets across instances quickly becomes cumbersome, even with aws profiles.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function that maps a bucket to its instance (credentials), and a function that maps a bucket to its port (endpoint).

request lifecycle

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.
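
For reference, an AWS Signature Version 4 Authorization header starts with the algorithm name followed by a Credential field whose first component is the access key ID. The sketch below shows one way to pull that ID out of the header. It is an assumption about how such parsing could look, not the proxy's actual code, and `access_key_from_authorization` is a hypothetical name.

```python
import re
from typing import Optional

# Matches e.g. "AWS4-HMAC-SHA256 Credential=MYLOCAL123/20250419/eu-west-3/s3/aws4_request, ..."
_CREDENTIAL = re.compile(r"^AWS4-HMAC-SHA256\s+Credential=([^/,\s]+)/")


def access_key_from_authorization(header: str) -> Optional[str]:
    """Return the access key ID from a SigV4 Authorization header, or None."""
    match = _CREDENTIAL.match(header.strip())
    return match.group(1) if match else None


if __name__ == "__main__":
    header = (
        "AWS4-HMAC-SHA256 "
        "Credential=MYLOCAL123/20250419/eu-west-3/s3/aws4_request, "
        "SignedHeaders=host;x-amz-content-sha256;x-amz-date, "
        "Signature=0000"
    )
    print(access_key_from_authorization(header))
```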

authorization

Pass in a Python callable, which is called from Rust. Its result is cached (with a ttl) for consecutive requests.
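
The effect of that cache can be illustrated in pure Python. The real cache sits on the Rust side; `cache_validator` below is a hypothetical wrapper, and treating a ttl of 0 as "do not cache" is an assumption of this sketch rather than documented behaviour.

```python
import time
from typing import Callable, Dict, Tuple

Validator = Callable[[str, str], bool]


def cache_validator(
    validator: Validator,
    ttl: float,
    clock: Callable[[], float] = time.monotonic,
) -> Validator:
    """Wrap validator so each (token, bucket) decision is reused for ttl seconds."""
    cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def cached(token: str, bucket: str) -> bool:
        key = (token, bucket)
        now = clock()
        hit = cache.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]  # decision is still fresh
        decision = bool(validator(token, bucket))
        if ttl > 0:  # assumption: ttl of 0 disables caching
            cache[key] = (decision, now + ttl)
        return decision

    return cached


if __name__ == "__main__":
    def do_validation(token: str, bucket: str) -> bool:
        print(f"validating {token} for {bucket}")
        return True

    check = cache_validator(do_validation, ttl=300)
    check("MYLOCAL123", "proxy-bucket01")  # calls do_validation
    check("MYLOCAL123", "proxy-bucket01")  # answered from the cache
```

Both allow and deny decisions are cached here, so a short ttl keeps a revoked permission from lingering.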

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123
aws_secret_access_key = nothingmeaningful

Set up a minimal server implementation:

import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def docreds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey

def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=docreds,
        validator=do_validation,
        http_port=6190,
        https_port=8443
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend
  • https frontend (supports HTTP/2)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • interface to pingora server/service configuration (ie. #threads)
