Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe in-process reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets and integrate your Authentication and Authorization services.

Decouples frontend from backend authentication and authorization.

  • Takes a Python authorization callable (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey or hmac keypair is provided, a Python function can be passed in to fetch the apikey or hmac keys for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the upstream request (AWS/IBM/..)
  • frontend request is signed with your own custom keypair
  • supports running in cloud environments / proxy headers for request signature validation

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_request_authorizer(token: str, bucket: str) -> bool

Proxy Configuration

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

request stages

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header and aws v4 request signing algorithm.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consecutive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(token: str, bucket: str) -> str:
    """Fetch credentials (ro, rw, access_denied) for the given bucket, depending on the token. """
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(token: str, bucket: str) -> str:
    """ Fetch HMAC credentials (ro, rw, access_denied) for the given bucket, depending on the token """
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
        
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })

def lookup_secret_key(access_key: str) -> str | None:
    # get all environment variables ending in ACCESS_KEY
    access_keys = [{key:value} for key, value in os.environ.items() if key.endswith("ACCESS_KEY") and value==access_key ]

    if len(access_keys) > 0:
        access_key_var = next((k for k, v in access_keys[0].items() if v == access_key), None)

        secret_key_var = access_key_var.replace("ACCESS_KEY", "SECRET_KEY")
        return os.getenv(secret_key_var, None)
    else:
        print(f"no access keys found for : {access_key}")


def do_validation(token: str, bucket: str) -> bool:
    """ Authorize the request based on token for the given bucket. 
        You can plug in your own authorization service here.
        The token is the authorization token passed in the request.
        The bucket is the bucket name.
        The function should return True if the request is authorized, False otherwise.
    """

    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")

    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        },
        "proxy-bucket05": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "access_key": os.getenv("PROXY_BUCKET05_ACCESS_KEY"),
            "secret_key": os.getenv("PROXY_BUCKET05_SECRET_KEY"),
            "port": 443,
            "ttl": 300
        },
        "proxy-aws-bucket01": {
            "host": "s3.eu-west-3.amazonaws.com",
            "region": "eu-west-3",
            "access_key": os.getenv("AWS_ACCESS_KEY"),
            "secret_key": os.getenv("AWS_SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    hmac_keys= [
        # {
        #     "access_key": os.getenv("LOCAL_ACCESS_KEY"),
        #     "secret_key": os.getenv("LOCAL_SECRET_KEY")
        # },
        {
            "access_key": os.getenv("LOCAL2_ACCESS_KEY"),
            "secret_key": os.getenv("LOCAL2_SECRET_KEY")
        },

    ]

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,
        validator=do_validation,
        http_port=6190,
        # https_port=8443,
        threads=1,
        # verify=False,
        hmac_keystore=hmac_keys,
        skip_signature_validation=False,
        hmac_fetcher=lookup_secret_key
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose pingora proxy server and services configuration to python
  • drop proxy headers (x-forwarded-proto, x-forwarded-host, ..) for signing

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.3.0.tar.gz (72.9 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.3.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.0-cp313-cp313-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.3.0-cp313-cp313-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.0-cp312-cp312-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.3.0-cp312-cp312-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.3.0-cp311-cp311-musllinux_1_2_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.11musllinux: musl 1.2+ x86-64

object_storage_proxy-0.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.0-cp311-cp311-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.3.0-cp311-cp311-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.3.0.tar.gz.

File metadata

  • Download URL: object_storage_proxy-0.3.0.tar.gz
  • Upload date:
  • Size: 72.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.3

File hashes

Hashes for object_storage_proxy-0.3.0.tar.gz
Algorithm Hash digest
SHA256 a23c64dbde3394ff4f4a951b37adc6a9e76dac57355065242be85c05223d4e18
MD5 bcdd6273d8fa938bdc6e83dcaad38c86
BLAKE2b-256 fdf3c9e3d2731424a062e0558fc2f1488dedac7f6b76d5408a3e52b0c1e6b08c

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 05320dfa238e5368136fa1289c596accf315787524c746270ac1a6c5a3e73e04
MD5 18243b1279227b9718ea459b4fc8561b
BLAKE2b-256 3fa55f4e13c541cdd9322cdf334c7c2e1c0380414a58c092afc971043f19318c

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3fcda26016e5b6b69624dce00389cb99bbc45987e3c3c02ccbf32465b23bc530
MD5 570c3e6faea26c0d4aaac1944ee743d2
BLAKE2b-256 4d7c0acc5d1ce54babb353099f3f23e616770bd738752f1d03ef698fff4b85b3

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7edd772b3dece196cffd457c61001b0c3a8bea8196fc8f18620052cc111c9f67
MD5 3e14454fcbc7fd5a91add31b2a4792d5
BLAKE2b-256 730d629ec89ac66963ac70d00bb40316f8bd4ed7b31ead9027e5ff33aca386da

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 97a605063e4d1a9e7414f78afb1eb3a21276c18434cbfd04acd5a4fd5c78094e
MD5 2cdd641728b35300e3ad44d8097619c4
BLAKE2b-256 b9f4879705d9abe2031afa6ccca242e5faa1e85388bf9a187c4189642b645c2f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 cf178ff318fc7e79c8151943cb6cb98abb616612805f0c2c9c95410badae10c8
MD5 abc19215947cfa38c882bc3833a23515
BLAKE2b-256 227ef26bfe75bf09ac8beee384579603eae4427aac4ccd4a24357a79ac8d718f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 81a17d9c4b67f32facba37b7f27dd28acfe66fb7954bed6b076ac2b4f90bebd7
MD5 02f610eeb51e43a3cd3a3619c5653599
BLAKE2b-256 29bc27f00a0fcc300dcba4c4deb4749f7abcdf4d25bcca65b2496dba06d70662

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 62562a4487a32b73d4c7226591bea11366fe47302598873cb2e637625e9f189b
MD5 7faa5f1e55e5e82f23f2911414cfe90f
BLAKE2b-256 3fcfc33104554dee7d3d2f0343dca6ae6175449678e7fb3dd572468a151310e9

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 bd45c0ef6e20eb151f5d1cc38143756c7cf9f13df8b6c7ff59d5b59573d55e1e
MD5 80bc7aceda2b655f22919d82cbc74d6a
BLAKE2b-256 4a253a62fd5793b67126fd7ca06ad869753baca7aa76aea8cc3355165dd3abb7

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp311-cp311-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp311-cp311-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 9e2a62066394909c26f810cfbece044ea1d8c307455d910599a06baa0f0989ce
MD5 4aeb8a1c77acf171b2b687ed10e742c5
BLAKE2b-256 f5904b514d6e2f3b95982375d5e32c4bdc21445c9c0e58a739244afd4492cd15

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a1bfbd92a1a38e5c05a3c0a1ff41b1e5442e334f27ef715028e180a4e70a9e1a
MD5 56eb4e3b14c86a6f482f875ea034e316
BLAKE2b-256 9d0ebd5b8cc44572defc43f036970262cbe3768bbfcc4d53e1ee03d0caf9ecd4

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c859df44bba5cd26092970fbce319349b2af986013ef16eaa1340ff024415990
MD5 e90b19cf442e81e18c123fd586e9bc6a
BLAKE2b-256 2c825dba47a43c7dcc2bd877b44748dea80aea1af6989c1a46535ab36d8814f4

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 063672ec4eb7377d26951a15055fa60ae4c0f2e3c7386a2ca8ff4c29b44d8b11
MD5 4e9891a8d8e3b01196489bfc3f563db0
BLAKE2b-256 ff28cead0ea7229d00fde0964a3e0061c7f4fd33cc0373930640e6e3827e1230

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5af9742d692e32153b48182f75e30bd6daaa2293aa4f14a4f95a2a6d11d9559b
MD5 0726f611f89d4dd45feff06f11d1a21c
BLAKE2b-256 e721fa330fcfc56db0d0c84f74a698172e46261fcf0f07ece3cb58d5f356002e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page