Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe in-process reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets and integrate your Authentication and Authorization services.

Decouples frontend from backend authentication and authorization.

  • Takes a Python authorization callable (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey or hmac keypair is provided, a Python function can be passed in to fetch the apikey or hmac keys for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the upstream request (AWS/IBM/..)
  • frontend request is signed with your own custom keypair
  • supports running in cloud environments / proxy headers for request signature validation

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_request_authorizer(token: str, bucket: str) -> bool

Proxy Configuration

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

request stages

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header and aws v4 request signing algorithm.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consecutive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(token: str, bucket: str) -> str:
    """Fetch credentials (ro, rw, access_denied) for the given bucket, depending on the token. """
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(token: str, bucket: str) -> str:
    """ Fetch HMAC credentials (ro, rw, access_denied) for the given bucket, depending on the token """
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
        
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })

def lookup_secret_key(access_key: str) -> str | None:
    # get all environment variables ending in ACCESS_KEY
    access_keys = [{key:value} for key, value in os.environ.items() if key.endswith("ACCESS_KEY") and value==access_key ]

    if len(access_keys) > 0:
        access_key_var = next((k for k, v in access_keys[0].items() if v == access_key), None)

        secret_key_var = access_key_var.replace("ACCESS_KEY", "SECRET_KEY")
        return os.getenv(secret_key_var, None)
    else:
        print(f"no access keys found for : {access_key}")


def do_validation(token: str, bucket: str) -> bool:
    """ Authorize the request based on token for the given bucket. 
        You can plug in your own authorization service here.
        The token is the authorization token passed in the request.
        The bucket is the bucket name.
        The function should return True if the request is authorized, False otherwise.
    """

    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")

    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        },
        "proxy-bucket05": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "access_key": os.getenv("PROXY_BUCKET05_ACCESS_KEY"),
            "secret_key": os.getenv("PROXY_BUCKET05_SECRET_KEY"),
            "port": 443,
            "ttl": 300
        },
        "proxy-aws-bucket01": {
            "host": "s3.eu-west-3.amazonaws.com",
            "region": "eu-west-3",
            "access_key": os.getenv("AWS_ACCESS_KEY"),
            "secret_key": os.getenv("AWS_SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    hmac_keys= [
        # {
        #     "access_key": os.getenv("LOCAL_ACCESS_KEY"),
        #     "secret_key": os.getenv("LOCAL_SECRET_KEY")
        # },
        {
            "access_key": os.getenv("LOCAL2_ACCESS_KEY"),
            "secret_key": os.getenv("LOCAL2_SECRET_KEY")
        },

    ]

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,
        validator=do_validation,
        http_port=6190,
        # https_port=8443,
        threads=1,
        # verify=False,
        hmac_keystore=hmac_keys,
        skip_signature_validation=False,
        hmac_fetcher=lookup_secret_key
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose pingora proxy server and services configuration to python
  • drop proxy headers (x-forwarded-proto, x-forwarded-host, ..) for signing

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.3.1.tar.gz (72.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.1-cp313-cp313-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.3.1-cp313-cp313-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.1-cp312-cp312-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.3.1-cp312-cp312-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.3.1-cp311-cp311-musllinux_1_2_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.11musllinux: musl 1.2+ x86-64

object_storage_proxy-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.3.1-cp311-cp311-macosx_11_0_arm64.whl (5.0 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.3.1-cp311-cp311-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.3.1.tar.gz.

File metadata

  • Download URL: object_storage_proxy-0.3.1.tar.gz
  • Upload date:
  • Size: 72.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.3

File hashes

Hashes for object_storage_proxy-0.3.1.tar.gz
Algorithm Hash digest
SHA256 18fbbfdac44af6f8b9a00bd37af54041914a8ae865f061adeb6e722f31ada5e4
MD5 5f29b2d61c057461d21438e92a1a208c
BLAKE2b-256 2fe49dd7128e4365eeea65d3e69b1ad2afde33e98bc1bdfc8af4394a8d22a7a8

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 2882e57d3819ee328b19317bea481fe9f84e136b7595635e8cacd800098335fa
MD5 40dc66fcd2296e1d7978b29c8629f281
BLAKE2b-256 e570d9d55d4406055d07d617483d1d5a8de127007c6685f25bb374fb72b7ef1e

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7f132f576c8b7b3b376ddc20e58f100fc5c4615e65c8027adc935e8095594831
MD5 8e73ba20831c429efa5f3acfd73b94ab
BLAKE2b-256 e62b2022103c8ebb3ba07a6b2b3a8fa1d495e1361687c0983b7a9afbf8094db7

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e4daced2cb4e3ab8f7d0c0437fd287d7e4537820b0ed748ce2049c8ff817bd96
MD5 aa1ca271b1d9c3b1f7c0cb7a70fdd5b4
BLAKE2b-256 7dd2039b0db30115947d6d251254b243f43dab77871aaee36bf18714256f0cf2

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 237282ff8c320d27650ce7f3d62a0c3070ee4e03ab19743c1d1d51c9e5d594a9
MD5 7bf44319ebe5721514c037655833fa36
BLAKE2b-256 1d77ace1502186474a8655d0d92dec6b065b3ac051d6d6c58dcf13ab0c3772ef

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 88f8826bd19400ffcb82bccfa227d8680e30c304befa2fb8a2e9fe7e6e657f23
MD5 31582f2138e368853b434c8a4a6b4f9e
BLAKE2b-256 919af493297803ea6223a51088f8a0a0ef78b73e7dbc46bea39d5ca49b724cbe

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bf66977ecbdd7d9180433181c0f981ab8b75c61d99fa94e177f9751bc3eab2bb
MD5 72cfb8fdb9bc75c4ded6e36029f5eb5b
BLAKE2b-256 37957232884659c1f46df29773734ff7c08718f9b2a8937969ca9bd96fc8e9d9

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 17ca2ae809b895d4886abbbcc194795997378edcdc69f421671afe427d36d1ad
MD5 e54c1c6c74bcf9f2181250851955d114
BLAKE2b-256 de828d8d19f731aa48d9d089cdccaebe53704f2951b49c3959fc64023a124fc3

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 ad040564dbebb9017b109490e3b148bfab5afab58754bd3bed51e7f94694aa1e
MD5 2762f0957072484665a419f5eb0bbd31
BLAKE2b-256 770485a27637175afc8754a35557d29fceee5fbe8b2bc3631cfbb805630a6a24

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp311-cp311-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp311-cp311-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 cec7d8746859bbcfcec158472ab8a45432d6711955a01499455b0ba26e378c32
MD5 7fb54bf349aedeef30bb2d22eb427b5f
BLAKE2b-256 6fa7fae8bdf899475f93398542a61d1637c304216d7cc67048d5b9f9c6e7d832

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5764f5234389985ce8b60d944cb20c48531f64942839b1cf7fdd05e4db7317be
MD5 1c7383a42d1e0ffba0db3d2366458805
BLAKE2b-256 e9d20cdaf493bf54d9e5cae4841db90f19e97801d3e22e5fd04bdcfd1dffc59a

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f0c233ba6c6c5a7ea0b60d9a7ed2982459a543c93e2aa4e047e5322052ebfeea
MD5 998641ba6880e8f5e7d9d975cd8e27f0
BLAKE2b-256 725d3b663b2e98e0a69e047d1db7d79abece7ea751c51b7551f11f4e6d434597

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a69b6c5d4f037d728d3f249d42e45bad0b449a5fcebdc6df3978e3b3fd1e7248
MD5 e73daf1f57bc0a3cd4ef38162dddbe7b
BLAKE2b-256 4c40df3dc4d47d71d1f30157e66235e57b39495a2322d9bd09bcd00d65f36e65

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3eec02269331111ad63d677732dfa78bde22354c5bb35e32fef8253c331f0caa
MD5 15835d445f5cb802722568de1f338015
BLAKE2b-256 c66337de31fdece05420ab8b23c46b802655a7ca2990336f651a805ae1790c42

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page