Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server, based on Cloudflare's pingora, to reverse proxy IBM Cloud Object Storage buckets.

  • Takes a Python authorization and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl.
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration.
  • If no apikey is provided, a Python function can be passed in to fetch the apikey for any given bucket (run once).
  • HMAC support: passing in access and secret id keys, will be used to sign the request

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": apikey
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 300
    }
}

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consequtive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = pingora-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123
aws_secret_access_key = nothingmeaningful

Set up a minimal server implementation:

import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def docreds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey

def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=docreds,
        validator=do_validation,
        http_port=6190,
        https_port=8443
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend
  • https frontend (supports HTTP/2)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.1.20.tar.gz (50.3 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.1.20-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.20-cp313-cp313-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.1.20-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.20-cp312-cp312-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.1.20-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.1.20-cp311-cp311-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.1.20-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.1.20.tar.gz.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20.tar.gz
Algorithm Hash digest
SHA256 c1c62ab9f03c85360b22c085b8d32ae58b943dc13e050fef54393ce3d8499751
MD5 55cbddf366baeff8da3bcddc8b63e287
BLAKE2b-256 c61fcfeeaf05bed3b62aa5be79c4e4512f3df87c597cee115e6fb213ae7a0fb5

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e451f21c7c7b08f8a3cb5cd433015b8e60be181cec264a0e8650404f96729c79
MD5 c90644c191046025a6ee5665c2c23ba9
BLAKE2b-256 0ff9d67d80d5a8ec3a94c207d0091080c2552a0a88ceef6d5eabf7be9591e345

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8392b17522498cad1be727bdd1b47e831fb7efa258b00d48bcc5dec5472b9996
MD5 de5fd20ce6d400430bc576e26ea1833d
BLAKE2b-256 60e5b3311b52005db2dd61539158cb22c0c073b18f65165be1beaa3f38c9bb6b

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d9665336f14eeb85692549e8bb7eb356fcd343d1f317e09e0b3a4510058f8f7b
MD5 d2929a0214a051e5296344e4b2c342a6
BLAKE2b-256 2e1f14b9c1b0c807f93808eedf75a9433877aa37677665d9a9f45f95658d4650

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 996a1fdb1a0dc0f5534832a57070ec2dfd7a808c702c22530782c2bdb131a426
MD5 745af81eba0a8e34582b31bc1af35e43
BLAKE2b-256 496440aa98ea0479f19c38421c88ecec6955a63271ccdf4ce0237b76832ae27e

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bfacf50bcfed47f29bfe3d1c00b73d114eb3d1479148b16b933fec9991de893e
MD5 fa82a1efd509bf9e25869c2e419ad970
BLAKE2b-256 b32e1e5c079bd87cb864fa0d046304e3f953599d56b0fe1e0e4912b61fc33302

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 b8139cdb679f04b9c3a7f607de791ef545cf37799eea85421c0195199a52807b
MD5 f50915c5321a084687fc2105582da7de
BLAKE2b-256 02594ad666c72ba97d6729251f620f2d95a33c7059f9b600a0bf375a99c618df

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 45b2169b36d922fde44f6a2aa30ac2c38fae659c78fa3f686f1629fb306db7d7
MD5 21b38fc608d2fc275a7d4ae58abff12e
BLAKE2b-256 f4044a0a0f228fe881c8655834953559258390ff3620b845723c9e4179721367

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 e44c7f66b26462144466374488c02aaf898cf6f9ca3e4753872bec443956c00f
MD5 fe51b59ed059220183deb7643ea7cfa5
BLAKE2b-256 13e4b9b018e074fc4524d016e10c67d9ee8bde20addbb070b870289cd432e89c

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.1.20-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.1.20-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0b3f40ca62c46205724b6a9e33da50126d3eed947adeadea2774c08c85d53651
MD5 35a79a0336f3bedc56647ba1e90bd176
BLAKE2b-256 7b02a6279a1ece83e63cda7283f07588c3632ca9667578b3f048c00bf100b9cb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page