Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets.

  • Takes a Python authorization (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey is provided, a Python function can be passed in to fetch the apikey for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the request (AWS/IBM/..)

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    }
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_authorizer(token:str, bucket: str) -> bool

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consequtive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = pingora-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123
aws_secret_access_key = nothingmeaningful

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(bucket) -> str:
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })


def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,  # or: do_api_creds
        validator=do_validation,
        http_port=6190,
        https_port=8443,
        threads=1,
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend
  • https frontend (supports HTTP/2)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.2.3.tar.gz (56.6 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.2.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.3-cp313-cp313-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.2.3-cp313-cp313-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.3-cp312-cp312-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.2.3-cp312-cp312-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.3-cp311-cp311-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.2.3-cp311-cp311-macosx_10_12_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.2.3.tar.gz.

File metadata

  • Download URL: object_storage_proxy-0.2.3.tar.gz
  • Upload date:
  • Size: 56.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.3

File hashes

Hashes for object_storage_proxy-0.2.3.tar.gz
Algorithm Hash digest
SHA256 d955e204afddd0aa5b2e11c84056b821a2effb2599d3f27e67221234380e12c7
MD5 83c5955ccb92d4eded80127c90bacdf8
BLAKE2b-256 8cd0cc0501fa87ffc65ff7c40aacef27cbc38770656ebb142a25030f05149a16

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 fd181907631e79b6954299db724ae4c562da845198e9a1c0e0a71bca866f905e
MD5 380d5a92a0a5f281d72aca8e3586652d
BLAKE2b-256 8ca5f95c2e17b499c08d06556ffa80cd367297b5af6bce47a68fcca3b85b8022

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 620931298c14c349abc3471f99374810072e10fb925435c27af7b61f8f21d239
MD5 e6b687f030354fed04c3f7b077fcb9cf
BLAKE2b-256 9de39163be08023dba25db0b4f80dc81c5e571724129ab2c75b84935f580b578

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9169439f594e025bf2d93a35568bddde4e62b48ed44bb50984e5022900825e48
MD5 47d69736d19dd503c953f7af8d4d0595
BLAKE2b-256 a18628893c3349acda9df282a2a4fdb1940f237a67476d2c6336d5bfd9bdd677

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 035995f5daee16a6de32fb65eabab9a2287be09e8458ea091407b3e454d60b26
MD5 97a0c865b23d59c5cc6380b2d8fe387e
BLAKE2b-256 2aaf34949b8a08623ae23286576d6ce8fc9c777f6981486da519b1d2e7705ed9

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 fd945db5389effd4e5c8eb3f17bf2d0f1ad2c8b622a307a1bd818bf37912f58d
MD5 b7e8ecfd95da009f229d797fa51e0462
BLAKE2b-256 79e273e4c7a9c2af20070fd5078d463c3a7740fa76bc455263a687efcd09d276

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 891a7988954728694ab497871914f2cc97f3150269777026de539654d603e859
MD5 86cf8ac408c2db26a1378426e7fc350d
BLAKE2b-256 f0761082df0d11a28e6a5e44b708dff43c3d7fd5ce78ebc549edeb80613dea28

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 4859f8ea426780547d9255979cadb0417afdb9a30a1b47f4349541cf2be91cb0
MD5 8fb9639d7498aa7936afbc42a8870d9a
BLAKE2b-256 9ddfab996ba15985ace4e9396536e1a0ef245d94a381081c601e9472b1a0357f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 b0b1cedad341a37eb50df996ea49215646551dd469b43171628d4d98dfd6697f
MD5 191837b2ca468a6a5580e90c57255da4
BLAKE2b-256 59042571f7b97ebf809589e956dbc670fee2691c50fe696add5d20cc8aeca405

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 747e8640215e0330baabb722bb837ce398c80c8a11f1beda1a5b110f031f525c
MD5 5431d50c2f50e1eb98790f9f674a7578
BLAKE2b-256 2ee292b26c5103c577f7868076fc08eddc8c7c02a77acf4840ea0d1a4751caf0

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 4c6eb7f969dfbb78e1c92ece2aa907d2833e6bcb6738f5e434f26de88b233ee6
MD5 5507cbaafe66d98e8347ed1f9a6c61b1
BLAKE2b-256 cd3b1dd6cb4b1d26c106c7e64016d59fae863cc1e746a202420eaeb0c004c586

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 e767e250cae73a4179cc18e35b065914323e14f029248b6d630d7e0a885271a4
MD5 4b86a8451165cf189f3155aac3ea558b
BLAKE2b-256 c2226487ae212c36d4bd1dbb8eedc0ac40e2ea062303e55ed09deb4acbe8281b

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d488dd51745c2538648a46d4696abebb43059ee71cd3b0fc71a53afa0f27e865
MD5 a4ceb0b9a6dde16e519dde666497754c
BLAKE2b-256 0ae6346c242f52fd36f42473f10f089b5c2bdbf01333f45018dc23019bf197fe

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page