Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server, based on Cloudflare's pingora, to reverse proxy IBM Cloud Object Storage buckets.

  • Takes a Python authorization and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl.
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration.
  • If no apikey is provided, a Python function can be passed in to fetch the apikey for any given bucket (run once).
  • HMAC support: passing in access and secret id keys, will be used to sign the request

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 300
    }
}

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consequtive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = pingora-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123
aws_secret_access_key = nothingmeaningful

Set up a minimal server implementation:

import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def docreds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey

def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=docreds,
        validator=do_validation,
        http_port=6190,
        https_port=8443
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend
  • https frontend (supports HTTP/2)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.1.19.tar.gz (50.3 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.1.19-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.19-cp313-cp313-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.1.19-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.19-cp312-cp312-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.1.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.19-cp311-cp311-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.1.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.1.19.tar.gz.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19.tar.gz
Algorithm Hash digest
SHA256 db16da212e008d6608b2f6a903c2c83b8662d6c8de7c4699831fef79d19f63a6
MD5 e0f5110e47ff04b0803acdf98d27d6c0
BLAKE2b-256 825c9438fcd5c2bbd67982703298a3322853cd89c6490afc70bdec3e9ab3c0cf

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e6a63e23654e5a9f2e4deaf077acea4ef2e1f1a0e1b679234061a8f6dc7725c5
MD5 9cf0a32d9caddc0db321c2c8c5066a78
BLAKE2b-256 0fe05895ea2316b9346ed9d6f96f63f83c56916d1a77b9ec352d0af863606d3f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 79f613f42c7c480b93e0115ef6d87ed7f4876aff056c5252463574251c66e612
MD5 5e1b2574af7784bb74ded20fb9a43a2f
BLAKE2b-256 e0125d9299c96e1a3de06e19eaf3a83e42c68b6c1f7e56a1a13b61f229ba6c58

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2cd55b00db1ee8f0e4bb7196e32d59ad4e40477151dbbf2f33c2007d9085e3be
MD5 36476c23c957050b585ea7dd44ac4815
BLAKE2b-256 290a80fd209e445b5a6b9e41b389585b6d1dec8f96889e4dfffa01bc1bb2db5d

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 44a87e1f0ac1c420d218096af257595405507fcc934826c6eab37613619d16f7
MD5 a51c2a906a8fae82833d4b102f45c670
BLAKE2b-256 327348a106865a7b9b4ff5e7220b51d2b31f41770d0f8bb1009b5a5c9b79d289

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 dcedbaf19462df3a00c0e4108e6c09c17aba4e32f06f2d2c619099be240144ab
MD5 451bc9dcb065b760d94b9cbe5e39965f
BLAKE2b-256 d3583d784142e564cd84dfc0b0d8f54454be261319754f01af96f19e4f3b9b64

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 8ee80f2a0540857be9c608ea5500f4d666e18fe5a4dd121b4012dfade101589d
MD5 82600975124f8212f85488a04309dafe
BLAKE2b-256 c5ddeeea4cd5f42ea8c9f876c4a29895435663b212506bcf045f6dc9724f8071

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 90564dbfd757f37cc32b4df25f83a2a4f9337f302bd471598e082b4d3ecf0f48
MD5 077f41b13582185ccc8522169c39467a
BLAKE2b-256 435a0bedae281e48eaa2190314c4572d0306839e692878b80b135a06ddc1b7f3

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 0897cdae56c0d96f4d76c8fb89a6a31654821706a004e69392da31f32d167e23
MD5 c4e418e31b7011f318ed9cf18685077a
BLAKE2b-256 145d37d998c84bb8c3478e04e52b558a5a6c0a4c1ea5d9e18f4620632cb8fc80

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.19-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 533e7122edc8d3c5287aef852c48486644626934a9bc6c37878fcfbfbebdbb95
MD5 c046d218935343947e753d0400b90262
BLAKE2b-256 1452692fc9b6e1b5a4cae1161ba099b3274f651501635e6f98eceb280a0ad647

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page