Skip to main content

<object-storage-proxy ⚡> Yet Another Object Storage Proxy

Project description

CI PyPI - Downloads PyPI - version

<object-storage-proxy ⚡> Yet Another Object Storage Reverse Proxy

📌 Note: This project is still under heavy development, and its APIs are subject to change.

Introduction

A fast and safe in-process reverse proxy server, based on Cloudflare's pingora, to reverse proxy AWS and IBM Cloud Object Storage buckets and integrate your Authentication and Authorization services.

  • Takes a Python authorization callable (allows you to plug in your own authorization services) and api_key fetch callback function and cos bucket dictionary.
  • The validation is cached with optional ttl (default 5min, keep it short).
  • The apikey is used to authenticate against IBM's IAM endpoint and is cached and renewed on expiration. (IBM only)
  • If no apikey is provided, a Python function can be passed in to fetch the apikey or hmac keys for any given bucket (run once).
  • HMAC support: passing in access and secret id keys (or as json string from python credentials callable), will be used to sign the request (AWS/IBM/..)

The bucket dict contains for each bucket:

- endpoint host
- port
- api key (optional)
- hmac access key (optional)
- hmac secret key (optional)
- ttl (optional, default 300) -> keep this reasonably short, but size to your needs
cos_map = {
    "bucket1": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "ttl": 0
    },
    "bucket2": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "apikey": "apikey"
    },
    "proxy-bucket01": {
        "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
        "port": 443,
        "access_key": "<redacted>",
        "secret_key": "<redacted>",
        "ttl": 300
    },
    "proxy-aws-bucket01": {
        "host": "s3.eu-west-3.amazonaws.com",
        "region": "eu-west-3",
        "access_key": os.getenv("AWS_ACCESS_KEY"),
        "secret_key": os.getenv("AWS_SECRET_KEY"),
        "port": 443,
        "ttl": 300
    }    
}

The Python callables take two arguments:

- token: parsed from the original aws request's authorization header
- bucket: parsed from the uri path
    def your_credentials_fetcher(token: str, bucket: str) -> str
    def your_request_authorizer(token: str, bucket: str) -> bool

secrets

IBM COS Storage is built in a way where buckets are grouped by a cos (Cloud Object Storage) instance. Access to a bucket is managed by either an api key or hmac secrets, configured on the cos instance.

endpoint

Each bucket has its own endpoint: <bucket_name>.s3..cloud-object-storage.appdomain.cloud:.

The port is not always different, though, but it might be. Depends on your implementation.

You can imagine managing multiple buckets across instances can become quite cumbersome, even with aws profiles etc.

solution

There are two ways to access a bucket: through virtual addressing style (bucket.ibm-cos-host:port) and path style (ibm-cos-host/bucket).

your client (aws s3 compatible) -> http(s)://this-proxy/bucket01 -> https://bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud:443

  1. translate path style to virtual style
  2. abstract authentication & authorization

Pass in a function which maps bucket to instance (credentials), and a function to map bucket to port (endpoint)

request lifecycle

request stages

authentication & authorization

The advantage is we can plug in a python authentication function and another function for authorization, allowing for fine-grained control.

authentication

We use the standard aws hmac header.

authorization

Pass in a callable from python which will be called from rust. This will be cached (ttl) for consecutive requests.

Examples

With local configuration.

~/.aws/config

[profile osp]
region = eu-west-3
output = json
services = osp-services
s3 =
    addressing_style = path

[services osp-services]
s3 =
  endpoint_url = http://localhost:6190

~/.aws/credentials

[osp]
aws_access_key_id = MYLOCAL123  # <-- this could be an openid connect/oauth2 token or anything that makes sense for your business, encode it if required
aws_secret_access_key = nothingmeaningful # <-- used for compatibility with aws sdk, to sign original request, but is ignored later

Set up a minimal server implementation:

import json
import os
import random
import object_storage_proxy as osp

from dotenv import load_dotenv

from object_storage_proxy import start_server, ProxyServerConfig


_TRUES  = {"y", "yes", "t", "true", "on", "1"}
_FALSES = {"n", "no", "f", "false", "off", "0"}


def strtobool(val: str) -> bool:
    """Convert a string to True/False, raise ValueError otherwise."""
    v = val.lower()
    if v in _TRUES:
        return True
    if v in _FALSES:
        return False
    raise ValueError(f"invalid truth value {val!r}")


def do_api_creds(bucket) -> str:
    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")
    
    print(f"Fetching credentials for {bucket}...")
    return apikey


def do_hmac_creds(bucket) -> str:
    access_key = os.getenv("ACCESS_KEY")
    secret_key = os.getenv("SECRET_KEY")
    if not access_key or not secret_key:
        raise ValueError("ACCESS_KEY or SECRET_KEY environment variable not set")
    print(f"Fetching HMAC credentials for {bucket}...")

    return json.dumps({
        "access_key": access_key,
        "secret_key": secret_key
    })


def do_validation(token: str, bucket: str) -> bool:
    print(f"PYTHON: Validating headers: {token} for {bucket}...")
    # return random.choice([True, False])
    return True


def main() -> None:
    load_dotenv()

    counting = strtobool(os.getenv("OSP_ENABLE_REQUEST_COUNTING", "false"))

    if counting:
        osp.enable_request_counting()
        print("Request counting enabled")


    apikey = os.getenv("COS_API_KEY")
    if not apikey:
        raise ValueError("COS_API_KEY environment variable not set")

    cos_map = {
        "bucket1": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey,
            "ttl": 0
        },
        "bucket2": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            "port": 443,
            "apikey": apikey
        },
        "proxy-bucket01": {
            "host": "s3.eu-de.cloud-object-storage.appdomain.cloud",
            "region": "eu-de",
            # "access_key": os.getenv("ACCESS_KEY"),
            # "secret_key": os.getenv("SECRET_KEY"),
            "port": 443,
            "ttl": 300
        }
    }

    ra = ProxyServerConfig(
        cos_map=cos_map,
        bucket_creds_fetcher=do_hmac_creds,  # or: do_api_creds
        validator=do_validation,
        http_port=6190,
        https_port=8443,
        threads=1,
    )

    start_server(ra)


if __name__ == "__main__":
    main()

Run with aws-cli (but could be anything compatible with the aws s3 api like polars, spark, presto, ...):

$ aws s3 ls s3://proxy-bucket01/ --recursive --summarize --human-readable --profile osp
2025-04-17 17:45:30   33 Bytes README.md
2025-04-17 17:48:04   33 Bytes README2.md

Total Objects: 2
   Total Size: 66 Bytes
$

Server output:

$ uv run python test_server.py
2025-04-19T13:19:54.402023+02:00  INFO object_storage_proxy: Logger initialized; starting server on http port 6190 and https port 8443
2025-04-19T13:19:54.402361+02:00  INFO object_storage_proxy: Bucket creds fetcher provided: Py(0x100210680)
Fetching credentials for bucket01...
2025-04-19T13:19:54.402485+02:00  INFO object_storage_proxy: Callback returned: Kn2t...
[src/lib.rs:327:5] &run_args.cos_map = Py(
    0x000000010061aa00,
)
2025-04-19T13:19:54.403738+02:00  INFO pingora_core::server: Bootstrap starting
2025-04-19T13:19:54.403852+02:00  INFO pingora_core::server: Bootstrap done
2025-04-19T13:19:54.424489+02:00  INFO pingora_core::server: Server starting
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:19:58.124729+02:00  INFO object_storage_proxy::utils::validator: Callback returned: false
PYTHON: Validating headers: MYLOCAL123 for proxy-bucket01...
2025-04-19T13:20:00.919320+02:00  INFO object_storage_proxy::utils::validator: Callback returned: true
2025-04-19T13:20:01.181775+02:00  INFO object_storage_proxy::credentials::secrets_proxy: No cached token found for proxy-bucket01, fetching ...
2025-04-19T13:20:01.181859+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetching bearer token for the API key
2025-04-19T13:20:01.739385+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Received access token
2025-04-19T13:20:01.739600+02:00  INFO object_storage_proxy::credentials::secrets_proxy: Fetched new token for proxy-bucket01
2025-04-19T13:20:01.739668+02:00  INFO object_storage_proxy: Sending request to upstream: https://proxy-bucket01.s3.eu-de.cloud-object-storage.appdomain.cloud/?list-type=2&prefix=&encoding-type=url
2025-04-19T13:20:01.739922+02:00  INFO object_storage_proxy: Request sent to upstream.

test

See the included python test script.

Create self-signed certificates and export the environment variables:

openssl req -x509 -newkey rsa:4096 -sha256 -nodes \
        -keyout key.pem -out cert.pem -days 365 -subj "/CN=localhost"
export TLS_CERT_PATH=/full/path/cert.pem
export TLS_KEY_PATH=/full/path/key.pem

Status

  • pingora proxy implementation
  • pass in credentials handler (which may return either api key string or json string with access_key and secret_key )
  • cache credentials
  • pass in bucket/instance and bucket/port config
  • split in workspace crate with core, cli and python crates (too many specifics for python)
  • config mgmt
  • cache authorization (with optional ttl)
  • http frontend (optional)
  • https frontend (supports HTTP/2) (optional)
  • configurable request counting
  • call the api key fetcher callback only once, save to cos map
  • config for #threads in ProxyServerConfig
  • also pass path and method to python callbacks and cache by token/bucket/path/method (identity based access/cache)
  • option to disable upstream/peer certificate validation (for development, not production!)
  • expose pingora proxy server and services configuration to python
  • drop proxy headers (x-forwarded-proto, x-forwarded-host, ..) for signing

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

object_storage_proxy-0.2.15.tar.gz (133.1 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

object_storage_proxy-0.2.15-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.15-cp313-cp313-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

object_storage_proxy-0.2.15-cp313-cp313-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.13macOS 10.12+ x86-64

object_storage_proxy-0.2.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.15-cp312-cp312-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

object_storage_proxy-0.2.15-cp312-cp312-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.12macOS 10.12+ x86-64

object_storage_proxy-0.2.15-cp311-cp311-musllinux_1_2_x86_64.whl (7.5 MB view details)

Uploaded CPython 3.11musllinux: musl 1.2+ x86-64

object_storage_proxy-0.2.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

object_storage_proxy-0.2.15-cp311-cp311-macosx_11_0_arm64.whl (4.9 MB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

object_storage_proxy-0.2.15-cp311-cp311-macosx_10_12_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.11macOS 10.12+ x86-64

object_storage_proxy-0.2.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

File details

Details for the file object_storage_proxy-0.2.15.tar.gz.

File metadata

  • Download URL: object_storage_proxy-0.2.15.tar.gz
  • Upload date:
  • Size: 133.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.8.3

File hashes

Hashes for object_storage_proxy-0.2.15.tar.gz
Algorithm Hash digest
SHA256 f8da64274fda88cc96219c235a0e2a7cc07c9a9624f4c7e4b747482460aca977
MD5 792c98c3e78f1c6bed78ae3309f5a250
BLAKE2b-256 6ed1fdd549250e31f153084af599f434203b1aa3d0f2a2fb74968cafbb449e65

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-pp311-pypy311_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 783da848d9e4ebb683432e5d838a8eae5499c2b7ae9d9a88061b1565e0dda3ae
MD5 2559479beb7719b16157b88f78e463c7
BLAKE2b-256 f86614dad930b63a3b247f7fdcc353f88df08dcf1b6cf4d5566358d80c346a86

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 7cb3c95d6e5b41cf2e04f4ae3bdf82bb4214c57fbc09bd942d82fb0279253b34
MD5 8ffa71b820405daa4403533bb82b6315
BLAKE2b-256 05d9b3ef8723ee186b82f158343f0bd522709f240d7e98206083aa5356dfc3e9

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f2368fed8499fa0e14954b710c2643ca69ea908363902d07e5e0f8f3a54f7b7a
MD5 8a206a8c03bfb3c7198982a88b8fd564
BLAKE2b-256 e0b5af61ff08dc26663dc6fb30ebc24adaeb9459332eabd80b64ecaf78c6b411

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 45a257173b012fa97c2302591d67cea3c7ce0fbdd08017d634865c76ad19893f
MD5 8f5b6b35e2b49dab605c9af69403af60
BLAKE2b-256 85aedf0fc96b56af1aa120a8ae890c70797e64e8a3736eb07a128b322cfff92e

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp313-cp313-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 a10815d931aaae2c14333beabe1c0aad22403f173187be40b4dcc00d115f570d
MD5 3a94d28360211c2705c0bddf67ca3f49
BLAKE2b-256 f24d7475bf1c7a6e578fc069f137c2a294480fd837c44f0c43b310fc963fcc1d

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b5540476a3166018e3aea84cd2035ba31987f01da7f15a3a56f51acf9e1f53db
MD5 aaa1466caea967d165fc0d83787e9a62
BLAKE2b-256 587f8804b74217a0b205c0aef96499f5f1698a616bf3f4efc41e81acaaa06cef

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5af035024c80b51614e28956b01584068f3209dc6aab2a1a945c43a083d8b4b0
MD5 988b80dac3707cd35ce6adf2f02e0be4
BLAKE2b-256 f0ff89fd859d8f71cd6e79e8174fc3cadedd580a4421f429412518bb41eaf761

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp312-cp312-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 45c180b14d992ec2f3a2e8dd6edca360f1a7c91d96fcb2d3a1559d86270dd2cd
MD5 c92b807e11380bd6bfd37be31d0f8f6a
BLAKE2b-256 2127091052afecc557fa70b7891ec81508ba7e1842690d6fb158cad41874eb0f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp311-cp311-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp311-cp311-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 f9973703a6705c6a4c8bae9d98467e6853324bcbf4d8afcbf0414f5fcac54392
MD5 4378e87080fbd4cf44d660dfe0a6598c
BLAKE2b-256 77539d71239eddac8643a8a7fd2cb126c13c5336393a4a57dc7f5dde9bc6f44e

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 00411e9ea74e38de1a314b43f3da2000f1667a47d13343a53ea685af12eacf01
MD5 27c00120077a662b7f09c0c8e06d7465
BLAKE2b-256 d4d152edf06e0242942b3b6e2b564cecff5f2c14563e20ac63849e611f2d505d

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 31e303f8563c92a8c9ed9bcbbca50636e55f08286865102f3e533f7a12520c1e
MD5 15833236dafa237c6c2b2c4e6a9edf44
BLAKE2b-256 940e1b60354cc36f14ca585969adec1420f1db50256a5bd75326ad962a5009ea

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp311-cp311-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 2a21307e43229d6b73779b340da9d92136efc5f6a14064dafd101e9fe21e3432
MD5 1e78bfe3969812058b707d03b7fb472e
BLAKE2b-256 e46177f7b20f144bcff29d00f5f8c051bfb9b8b98d319c6c7bc8e85739e9d24f

See more details on using hashes here.

File details

Details for the file object_storage_proxy-0.2.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for object_storage_proxy-0.2.15-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 cbc64a31c032de432a0f69ccbd43ef997b680a8c456a8f6c8eac0f493a4611a1
MD5 0a7c19981c372eb724d5d8e4a7aadd76
BLAKE2b-256 14a4c0fb9bff69de54fdb095172822189eec1d7c3b6577784f7a00d661e14c7a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page