Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets and integrate your Authentication and Authorization services.

  • Takes a Python authorization callable function (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey is provided, a Python function can be passed in to fetch the apikey or hmac keys for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the request (AWS/IBM/..)

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_request_authorizer(token: str, bucket: str) -> bool

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

request stages

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consecutive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(bucket) -> str:
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })


def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,  # or: do_api_creds
        validator=do_validation,
        http_port=6190,
        https_port=8443,
        threads=1,
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose proxy server and services configuration to python
  • option to drop proxy headers (x-forwarded-proto, x-forwarded-host, ..)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.2.10.tar.gz (62.8 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.2.10-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.10-cp313-cp313-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.2.10-cp313-cp313-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.2.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.10-cp312-cp312-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.2.10-cp312-cp312-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.2.10-cp311-cp311-musllinux_1_2_x86_64.whl (7.5 MB view details)

Uploaded CPython 3.11musllinux: musl 1.2+ x86-64

object_storage_proxy-0.2.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.10-cp311-cp311-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.2.10-cp311-cp311-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.2.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.2.10.tar.gz.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10.tar.gz
Algorithm Hash digest
SHA256 e5152d21bb1d533f5c4ffe0e6b7f4b0f557bce92087822bf73765ae57f8b852c
MD5 0f1762de3fd87f5ea2c8955110896ee3
BLAKE2b-256 26e4873c440f34957a04851e806c3b580b1235dc30f58456fb49c3d9d4279934

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 beea880f7398331fd33886ca646ef3f118dee4b40aba99dedcef6ce2ef829c38
MD5 ff3cbe65602834705090328b06fe4c33
BLAKE2b-256 18637d930375a0ba7c030aff44539c95c1df2c5aadab4c43fbbdf5ae3904cfad

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fafa050e5d33ebf2ff0f37c64b1d8accf81d1368e742a1e78d72cd5171a2bcea
MD5 a98c13ade3b6e25d9b2c7b710baf403b
BLAKE2b-256 97d1e34b9f343b93516838c9266ecfb6e11e136edbc4c940c293786fc42e95c1

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 35d4c7f8d8b6f5f82413a9bca3b6e134f89363fefe2822473f26c8c61d89e5c9
MD5 335d524105658db093164db860bf5003
BLAKE2b-256 c3eaa2e9d7dc3176e903438647492818fd19b844aab3170a3bc1b5185b5c02d9

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 66bc1bd9b338991e8bffe43dbf19d611c4ab39db9ebf5a0d639cd2088426bf36
MD5 1a93e80e017a844988052c27f10e3f44
BLAKE2b-256 4833b33f6d08114a68414f2a7b9ba5023568fe2dd45db9f6dfd2360c57096395

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 ae1948a0bf52b9d1d7d0fa3a14d098014c0f2867126e31eb5544f1b51d2e3862
MD5 1eb44723faa703e533874d4bdcdba000
BLAKE2b-256 150be457975433076153ca9dd402dc0962382205965fcedbd3e9b849c0e8714f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f9e860c6b31c1bb9ba6fc189375e2cc2f53ebf750e89266dfcf8857cf326e721
MD5 95d1d8ed357723c38824d76ccf94f603
BLAKE2b-256 6602f7d228112cfb2101c2275580e83d98a8a3fd5a6606a1b22ff793cd820b54

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 007d5682a52466cc28db8743d93e29a2758aa9ee939b800d1f65066debd54ed6
MD5 30b6f6aa86ac6a943c54c41b400f3454
BLAKE2b-256 35cfa4d58e82bbc7b0c4092cb1fb4f09aa5c27e1ff6ed385124e23e2bea05f80

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 991d07d3ffe046aab774f6b3624cd1b40ad29664b86ba8fff8800aede47cf690
MD5 6efd79f90517d93d67f084584769dbe6
BLAKE2b-256 a34feb8460f75b167dba1785ca359ced14ad121bdca57277a3a940f044d8e889

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp311-cp311-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp311-cp311-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 1d2b6b066159a94db29f95d8626a79ca00b9201d6f0a956a5e2697dcbf4c72fd
MD5 029c8ccf69ae6a065f77b03cdfb5ddf6
BLAKE2b-256 3142f5a6e78e1808432dd8da546cd388117b010efcf19dfccf87f9bdbee468df

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 77a0b4ddec7a7f7af7ca9dd4472ddfa48d0f6854d3d2042e5db3b44dd054f1ab
MD5 923484ed87ee174aef3f8c449a568b84
BLAKE2b-256 9a143a43a6dceb277c5e9cf8dab5dc57899b0b1ff8ccfc48dbc6a464a0f26fbc

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d0eea8dffda841bb70f462dfbbf7271f7cd555f1fe6bf56bf4b21bdffcc68795
MD5 ec246559caf97bfc7551af34f29d0055
BLAKE2b-256 74bd752a41981d55b4c19d1a36425b2e2385893d992cc97bada79da2676b6aba

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 174c8261864b6f9769db689f242c149a5be34389fa247e2c0494bb7c0c7044c5
MD5 e09595c5c241dc77c799cf1747017c83
BLAKE2b-256 3dffec090c3d26062121175122ec2e7b65e9fc928ebd69c65f951b046edf0bca

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 52cd3ed5dac98fbe9819e7158217df3aa246c90df8d6e885a50a74d1769b39d0
MD5 10746cdee86097804ebece9a7ca56bad
BLAKE2b-256 a8482b3542946452e4699f40f35beb6302bfc71611ec802d446b0c3de8891605

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page