Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe in-process reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets and integrate your Authentication and Authorization services.

  • Takes a Python authorization callable function (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey is provided, a Python function can be passed in to fetch the apikey or hmac keys for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the request (AWS/IBM/..)

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_request_authorizer(token: str, bucket: str) -> bool

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

request stages

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consecutive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(bucket) -> str:
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })


def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,  # or: do_api_creds
        validator=do_validation,
        http_port=6190,
        https_port=8443,
        threads=1,
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose pingora proxy server and services configuration to python
  • drop proxy headers (x-forwarded-proto, x-forwarded-host, ..) for signing

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.2.12.tar.gz (63.0 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.2.12-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.12-cp313-cp313-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.2.12-cp313-cp313-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.2.12-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.12-cp312-cp312-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.2.12-cp312-cp312-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.2.12-cp311-cp311-musllinux_1_2_x86_64.whl (7.5 MB view details)

Uploaded CPython 3.11musllinux: musl 1.2+ x86-64

object_storage_proxy-0.2.12-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.12-cp311-cp311-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.2.12-cp311-cp311-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.2.12-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.2.12.tar.gz.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12.tar.gz
Algorithm Hash digest
SHA256 9e25021e50d023fe4914ef3e5ea5ff5ce6963ae22bd3bbbac05ad4d49d30eb12
MD5 a42a21817713b108ea07d58d9c2beadf
BLAKE2b-256 da23513f05d54beee66a0e7eee0343ddd83922f837c498c865318af43d50e3c4

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 6def7297e4db99da0bb73548519871f565775b89adec7ea5075aaa6621a32906
MD5 4e691c2651e6bde301d161442d53056b
BLAKE2b-256 71b67898cdda8142f103333f81df281741956e74615909bde07cc3dd4115575a

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7b58af61bf579e4a9da9eab91b9445371eaeaab79fcf689bfc44d9bd985014a6
MD5 92c73f9e6414b858d440d21645bf1685
BLAKE2b-256 5538e38ffed7eee91a7aad18e995bf4e6b6fd5e333237d65d5ff9168afaac12c

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b35fe08835e53c53968d4569446ab963dc4b2517ad7d4eefc2adff8f375d6a23
MD5 67cad6db3d96792f27a8433e62beacba
BLAKE2b-256 64c668ad8d9bddeed24e78d0880aeada533d9cd76c3ce3f5e6af4def8426c0da

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2aba4ffc0de0f3d4580e84cb93987b0eb4448f77b8be4b8ea0a2d375942ab3b4
MD5 4797ab40d3f7f6b345d366db811bbe16
BLAKE2b-256 5d725e67c3df0a5a77e0a3304b6a0f41caf28628b5f8797ccf1470622a55a14a

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 40c6b9e7ba46e098236e8b6cdb4bf0509cf37868117d4a32ec4939c75f749756
MD5 91f8a2e414a0ab823fdb14973761dbd1
BLAKE2b-256 7b28ad6e171b4a04f95d266661ee30f9269270f0cfb6d9c1740d85cfaa82f8dc

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 698fc89e5b54f0dc11b2b2aef3ce91232351f5aaac4526c12ad741f3d4d27a9d
MD5 20137422c89dcbf187064352d68127a8
BLAKE2b-256 bf784e43017ccbd264dc650377aa40c2543d3b136c7db7543dcdf29bb70d394b

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8cc615c32b0766ecaf66b2d2f6912dbdb008c795830981ff168f895da2319a28
MD5 4cf0d3b64ad8515496e8f2608327a30d
BLAKE2b-256 6c73a5367ba9684e388cbe9a28d6928ed1fd6ed4f75fe3adac4ea3d2f6cbd60c

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 950137faeab5e4ace9e84be1d28c22b592fe4039641b7de3a585b3f3ad58fba7
MD5 0e57b9df78d9fdd2f602ec0f94457020
BLAKE2b-256 0daeb7dd7f3065ba3bcd758077115d25b68430a2563e7a9855ed561104adde9a

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp311-cp311-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp311-cp311-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 c6d372f35cb2ca1404d566d886e47fbd0dcd572cf3697dd9eea4fff38e4d0b3f
MD5 33b8d0a48f5579d1950ab46ca9825e80
BLAKE2b-256 c8c3e1551fccf999bcf34ff19bb075381825ffa3253ee0beaf631153598ffc73

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 21a0b6b1200b8f24f42d9aa284b1e6213ee7abe230abd3bfb314de46bd7762b9
MD5 374a983d0dac6a8d6ef0eccfe7ef292a
BLAKE2b-256 0e3b3fb9a9ce9849e450f7378c16f758d9725c4351b41bc3dde72fb69affd0b2

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a80617f7bf5c6f752cb2356654b2074783f104b619c287176df286bc7eea46d7
MD5 5f17d68e900a6881f739d9e560f6ce03
BLAKE2b-256 844884cc45c590f4826e88350a63250662862a05e05543b9783745e23c43d9a0

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 80bd3256be1eef14aaeb6bcf02a1a8209c0e0d2ba9d70d7616bc9fd30bf20901
MD5 255963b1fd29254820e9e0dfb1251d72
BLAKE2b-256 663248e0bbecd56227188ea439715f586a076c476476679ee9a9151b3083aefb

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.12-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.12-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a4ca0bdfd5b6fe3560b76372f0f5a6ae3431c626f02dfe401d6e54edd7313d68
MD5 608a6064939bd5dd0f61267f57e7a050
BLAKE2b-256 aa51599a5a3ee8a1f60df2b29ebd5430380e08eed393635b827506dc54037b49

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page