<object-storage-proxy ⚡> Yet Another Object Storage Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server, built on Cloudflare's pingora, that fronts AWS and IBM Cloud Object Storage buckets and lets you plug in your own authentication and authorization services.

  • Takes a Python authorization callable (so you can plug in your own authorization service), an API key fetch callback, and a COS bucket dictionary.
  • The authorization result is cached with an optional TTL (default 5 minutes; keep it short).
  • The API key is used to authenticate against IBM's IAM endpoint; the resulting token is cached and renewed on expiration (IBM only).
  • If no API key is provided, a Python function can be passed in to fetch the API key or HMAC keys for any given bucket (run once).
  • HMAC support: access and secret keys, passed in directly or returned as a JSON string from the Python credentials callable, are used to sign the request (AWS, IBM, ...).
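
The "cached and renewed on expiration" behaviour for IBM API keys can be pictured with the sketch below. It is not the proxy's Rust implementation: `build_iam_request`, `fetch_iam_token` and `BearerTokenCache` are hypothetical names, and the IAM URL and form fields are an assumption based on IBM's public IAM token API, so verify them against IBM's documentation before relying on them.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Optional, Tuple

# Assumption: IBM Cloud's public IAM token endpoint.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_iam_request(apikey: str) -> urllib.request.Request:
    """Build (but do not send) the request that exchanges an API key for a token."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": apikey,
    }).encode()
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded",
                 "Accept": "application/json"},
        method="POST",
    )


def fetch_iam_token(apikey: str) -> Tuple[str, float]:
    """Send the request and return (access_token, lifetime in seconds)."""
    with urllib.request.urlopen(build_iam_request(apikey)) as resp:
        payload = json.load(resp)
    return payload["access_token"], float(payload["expires_in"])


class BearerTokenCache:
    """Hypothetical helper: keep one token and fetch a new one once it has expired."""

    def __init__(self, apikey: str, fetch=fetch_iam_token, clock=time.monotonic):
        self._apikey = apikey
        self._fetch = fetch      # injectable so the cache can be tested offline
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Renew when there is no token yet or the stored one has expired.
        if self._token is None or self._clock() >= self._expires_at:
            token, lifetime = self._fetch(self._apikey)
            self._token = token
            self._expires_at = self._clock() + lifetime
        return self._token
```

Injecting `fetch` and `clock` keeps the renewal logic separate from the network call, which makes the expiry behaviour easy to exercise without contacting IAM.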

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl in seconds (optional, default 300): keep this reasonably short, but size it to your needs

import os

cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}
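
Because the bucket dictionary is plain Python data, it can be checked before it is handed to the proxy. The helper below is a hypothetical sketch, not part of object_storage_proxy; the rules it enforces (host and port required, HMAC keys set as a pair, non-negative ttl) are assumptions derived from the field list above.

```python
from typing import Any, Dict, List


def check_cos_map(cos_map: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means the map looks sane."""
    problems: List[str] = []
    for bucket, entry in cos_map.items():
        host = entry.get("host")
        if not isinstance(host, str) or not host:
            problems.append(f"{bucket}: 'host' must be a non-empty string")
        port = entry.get("port")
        if not isinstance(port, int) or not 1 <= port <= 65535:
            problems.append(f"{bucket}: 'port' must be an integer in 1..65535")
        # HMAC signing needs both halves of the key pair (assumed rule).
        if bool(entry.get("access_key")) != bool(entry.get("secret_key")):
            problems.append(f"{bucket}: 'access_key' and 'secret_key' must be set together")
        ttl = entry.get("ttl", 300)
        if not isinstance(ttl, int) or ttl < 0:
            problems.append(f"{bucket}: 'ttl' must be a non-negative integer")
    return problems
```

Calling `check_cos_map(cos_map)` at startup and refusing to start when the list is non-empty surfaces typos such as a missing `secret_key` before the first request arrives.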

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str: ...
    def your_request_authorizer(token: str, bucket: str) -> bool: ...

secrets

IBM Cloud Object Storage groups buckets by COS (Cloud Object Storage) instance. Access to a bucket is managed with either an API key or HMAC secrets, configured on the COS instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3.<region>.cloud-object-storage.appdomain.cloud:<port>.

The port is often the same across buckets, but it can differ depending on your setup.

Managing multiple buckets across instances can become quite cumbersome, even with aws profiles.

solution

There are two ways to address a bucket: virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization
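
The path-style to virtual-style translation in step 1 can be sketched in a few lines. This is an illustration only: `to_virtual_style` is a hypothetical name, and the proxy performs the equivalent rewrite in Rust.

```python
from typing import Tuple
from urllib.parse import urlsplit


def to_virtual_style(request_target: str, cos_host: str, port: int = 443) -> Tuple[str, str]:
    """Map '/bucket/key?query' to ('bucket.<cos_host>:<port>', '/key?query')."""
    parts = urlsplit(request_target)
    # The first path segment is the bucket; everything after it is the object key.
    bucket, _, key = parts.path.lstrip("/").partition("/")
    if not bucket:
        raise ValueError("path-style request has no bucket segment")
    path = "/" + key
    if parts.query:
        path += "?" + parts.query
    return f"{bucket}.{cos_host}:{port}", path
```

For example, `to_virtual_style("/bucket01/README.md", "s3.eu-de.cloud-object-storage.appdomain.cloud")` yields the upstream authority `bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443` and the path `/README.md`.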

Pass in a function that maps a bucket to its instance (credentials), and a function that maps a bucket to its port (endpoint).

request lifecycle

request stages

authentication & authorization

The advantage is that we can plug in one Python function for authentication and another for authorization, allowing for fine-grained control.

authentication

The proxy reads the standard AWS HMAC Authorization header sent by the client.
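
To make the token concrete: with AWS Signature Version 4, the client's access key id travels in the `Credential=` field of the Authorization header. The sketch below extracts it; `token_from_authorization` is a hypothetical helper, and the assumptions are that clients sign with Signature Version 4 and that the token contains no `/` (encode it first if it does). How the proxy itself parses the header is not shown here.

```python
import re
from typing import Optional

# Credential=<access key id>/<date>/<region>/<service>/aws4_request
_CREDENTIAL = re.compile(r"Credential=([^/,\s]+)/")


def token_from_authorization(header: str) -> Optional[str]:
    """Return the access key id from a SigV4 Authorization header, or None."""
    if not header.startswith("AWS4-HMAC-SHA256"):
        return None
    match = _CREDENTIAL.search(header)
    return match.group(1) if match else None
```

With the `osp` profile from the example configuration, the extracted value would be `MYLOCAL123`, which is the token the authorization callable receives.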

authorization

Pass in a Python callable, which is called from Rust. The result is cached (with a TTL) for consecutive requests.
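
The effect of the TTL cache can be reproduced in pure Python. The wrapper below is a sketch rather than the proxy's Rust cache: `CachedAuthorizer` is a hypothetical name, treating `ttl=0` as "never cache" is an assumption, and it caches denials as well as approvals, which may differ from what the proxy does.

```python
import time
from typing import Callable, Dict, Tuple


class CachedAuthorizer:
    """Wrap an authorizer so repeated (token, bucket) checks reuse a recent decision."""

    def __init__(self, authorize: Callable[[str, str], bool],
                 ttl: float = 300.0, clock: Callable[[], float] = time.monotonic):
        self._authorize = authorize
        self._ttl = ttl
        self._clock = clock
        # (token, bucket) -> (decision, expiry timestamp)
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def __call__(self, token: str, bucket: str) -> bool:
        key = (token, bucket)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now < hit[1]:
            return hit[0]          # still fresh: skip the callback
        decision = bool(self._authorize(token, bucket))
        if self._ttl > 0:          # assumption: ttl of 0 disables caching
            self._cache[key] = (decision, now + self._ttl)
        return decision
```

A short TTL bounds how long a revoked token keeps working, which is why the configuration notes recommend keeping it small.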

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(bucket) -> str:
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })


def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,  # or: do_api_creds
        validator=do_validation,
        http_port=6190,
        https_port=8443,
        threads=1,
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either an api key string or a json string with access_key and secret_key)
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split into a workspace with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose proxy server and services configuration to python
  • option to drop proxy headers (x-forwarded-proto, x-forwarded-host, ..)
